[jira] [Commented] (DRILL-5898) Query returns columns in the wrong order

2017-10-24 Thread Robert Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218126#comment-16218126
 ] 

Robert Hou commented on DRILL-5898:
---

I will update the expected results file.

> Query returns columns in the wrong order
> 
>
> Key: DRILL-5898
> URL: https://issues.apache.org/jira/browse/DRILL-5898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Robert Hou
>Priority: Blocker
> Fix For: 1.12.0
>
>
> This is a regression.  It worked with this commit:
> {noformat}
> f1d1945b3772bb782039fd6811e34a7de66441c8  DRILL-5582: C++ Client: [Threat 
> Modeling] Drillbit may be spoofed by an attacker and this may lead to data 
> being written to the attacker's target instead of Drillbit
> {noformat}
> It fails with this commit, although there are six commits total between the 
> last good one and this one:
> {noformat}
> b0c4e0486d6d4620b04a1bb8198e959d433b4840  DRILL-5876: Use openssl profile 
> to include netty-tcnative dependency with the platform specific classifier
> {noformat}
> Query is:
> {noformat}
> select * from 
> dfs.`/drill/testdata/tpch100_dir_partitioned_5files/lineitem` where 
> dir0=2006 and dir1=12 and dir2=15 and l_discount=0.07 order by l_orderkey, 
> l_extendedprice limit 10
> {noformat}
> Columns are returned in a different order.  Here are the expected results:
> {noformat}
> foxes. furiously final ideas cajol1994-05-27  0.071731.42 4   
> F   653442  4965666.0   1.0 1994-06-23  A   1994-06-22
>   NONESHIP215671  0.07200612  15 (1 time(s))
> lly final account 1994-11-09  0.0745881.783   F   
> 653412  1.320809E7  46.01994-11-24  R   1994-11-08  TAKE 
> BACK RETURNREG AIR 458104  0.08200612  15 (1 time(s))
>  the asymptotes   1997-12-29  0.0760882.8 6   O   653413  
> 1.4271413E7 44.01998-02-04  N   1998-01-20  DELIVER IN 
> PERSON   MAIL21456   0.05200612  15 (1 time(s))
> carefully a   1996-09-23  0.075381.88 2   O   653378  
> 1.6702792E7 3.0 1996-11-14  N   1996-10-15  NONEREG 
> AIR 952809  0.05200612  15 (1 time(s))
> ly final requests. boldly ironic theo 1995-09-04  0.072019.94 2   
> O   653380  2416094.0   2.0 1995-11-14  N   1995-10-18
>   COLLECT COD FOB 166101  0.02200612  15 (1 time(s))
> alongside of the even, e  1996-02-14  0.0786140.322   
> O   653409  5622872.0   48.01996-05-02  N   1996-04-22
>   NONESHIP372888  0.04200612  15 (1 time(s))
> es. regular instruct  1996-10-18  0.0725194.0 1   O   653382  
> 6048060.0   25.01996-08-29  N   1996-08-20  DELIVER IN 
> PERSON   AIR 798079  0.0 200612  15 (1 time(s))
> en package1993-09-19  0.0718718.322   F   653440  
> 1.372054E7  12.01993-09-12  A   1993-09-09  DELIVER IN 
> PERSON   TRUCK   970554  0.0 200612  15 (1 time(s))
> ly regular deposits snooze. unusual, even 1998-01-18  0.07
> 12427.921   O   653413  2822631.0   8.0 1998-02-09
>   N   1998-02-05  TAKE BACK RETURNREG AIR 322636  0.01
> 200612  15 (1 time(s))
>  ironic ideas. bra1996-10-13  0.0764711.533   O   
> 653383  6806672.0   41.01996-12-06  N   1996-11-10  TAKE 
> BACK RETURNAIR 556691  0.01200612  15 (1 time(s))
> {noformat}
> Here are the actual results:
> {noformat}
> 2006  12  15  653383  6806672 556691  3   41.064711.53
> 0.070.01N   O   1996-11-10  1996-10-13  1996-12-06
>   TAKE BACK RETURNAIR  ironic ideas. bra
> 2006  12  15  653378  16702792952809  2   3.0 5381.88 
> 0.070.05N   O   1996-10-15  1996-09-23  1996-11-14
>   NONEREG AIR carefully a
> 2006  12  15  653380  2416094 166101  2   2.0 2019.94 0.07
> 0.02N   O   1995-10-18  1995-09-04  1995-11-14  
> COLLECT COD FOB ly final requests. boldly ironic theo
> 2006  12  15  653413  2822631 322636  1   8.0 12427.92
> 0.070.01N   O   1998-02-05  1998-01-18  1998-02-09
>   TAKE BACK RETURNREG AIR ly regular deposits snooze. unusual, even 
> 2006  12  15  653382  6048060 798079  1   25.0  

[jira] [Assigned] (DRILL-5898) Query returns columns in the wrong order

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou reassigned DRILL-5898:
-

Assignee: Robert Hou  (was: Vitalii Diravka)

> Query returns columns in the wrong order
> 
>
> Key: DRILL-5898
> URL: https://issues.apache.org/jira/browse/DRILL-5898
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Robert Hou
>Priority: Blocker
> Fix For: 1.12.0
>
>
> This is a regression.  It worked with this commit:
> {noformat}
> f1d1945b3772bb782039fd6811e34a7de66441c8  DRILL-5582: C++ Client: [Threat 
> Modeling] Drillbit may be spoofed by an attacker and this may lead to data 
> being written to the attacker's target instead of Drillbit
> {noformat}
> It fails with this commit, although there are six commits total between the 
> last good one and this one:
> {noformat}
> b0c4e0486d6d4620b04a1bb8198e959d433b4840  DRILL-5876: Use openssl profile 
> to include netty-tcnative dependency with the platform specific classifier
> {noformat}
> Query is:
> {noformat}
> select * from 
> dfs.`/drill/testdata/tpch100_dir_partitioned_5files/lineitem` where 
> dir0=2006 and dir1=12 and dir2=15 and l_discount=0.07 order by l_orderkey, 
> l_extendedprice limit 10
> {noformat}
> Columns are returned in a different order.  Here are the expected results:
> {noformat}
> foxes. furiously final ideas cajol1994-05-27  0.071731.42 4   
> F   653442  4965666.0   1.0 1994-06-23  A   1994-06-22
>   NONESHIP215671  0.07200612  15 (1 time(s))
> lly final account 1994-11-09  0.0745881.783   F   
> 653412  1.320809E7  46.01994-11-24  R   1994-11-08  TAKE 
> BACK RETURNREG AIR 458104  0.08200612  15 (1 time(s))
>  the asymptotes   1997-12-29  0.0760882.8 6   O   653413  
> 1.4271413E7 44.01998-02-04  N   1998-01-20  DELIVER IN 
> PERSON   MAIL21456   0.05200612  15 (1 time(s))
> carefully a   1996-09-23  0.075381.88 2   O   653378  
> 1.6702792E7 3.0 1996-11-14  N   1996-10-15  NONEREG 
> AIR 952809  0.05200612  15 (1 time(s))
> ly final requests. boldly ironic theo 1995-09-04  0.072019.94 2   
> O   653380  2416094.0   2.0 1995-11-14  N   1995-10-18
>   COLLECT COD FOB 166101  0.02200612  15 (1 time(s))
> alongside of the even, e  1996-02-14  0.0786140.322   
> O   653409  5622872.0   48.01996-05-02  N   1996-04-22
>   NONESHIP372888  0.04200612  15 (1 time(s))
> es. regular instruct  1996-10-18  0.0725194.0 1   O   653382  
> 6048060.0   25.01996-08-29  N   1996-08-20  DELIVER IN 
> PERSON   AIR 798079  0.0 200612  15 (1 time(s))
> en package1993-09-19  0.0718718.322   F   653440  
> 1.372054E7  12.01993-09-12  A   1993-09-09  DELIVER IN 
> PERSON   TRUCK   970554  0.0 200612  15 (1 time(s))
> ly regular deposits snooze. unusual, even 1998-01-18  0.07
> 12427.921   O   653413  2822631.0   8.0 1998-02-09
>   N   1998-02-05  TAKE BACK RETURNREG AIR 322636  0.01
> 200612  15 (1 time(s))
>  ironic ideas. bra1996-10-13  0.0764711.533   O   
> 653383  6806672.0   41.01996-12-06  N   1996-11-10  TAKE 
> BACK RETURNAIR 556691  0.01200612  15 (1 time(s))
> {noformat}
> Here are the actual results:
> {noformat}
> 2006  12  15  653383  6806672 556691  3   41.064711.53
> 0.070.01N   O   1996-11-10  1996-10-13  1996-12-06
>   TAKE BACK RETURNAIR  ironic ideas. bra
> 2006  12  15  653378  16702792952809  2   3.0 5381.88 
> 0.070.05N   O   1996-10-15  1996-09-23  1996-11-14
>   NONEREG AIR carefully a
> 2006  12  15  653380  2416094 166101  2   2.0 2019.94 0.07
> 0.02N   O   1995-10-18  1995-09-04  1995-11-14  
> COLLECT COD FOB ly final requests. boldly ironic theo
> 2006  12  15  653413  2822631 322636  1   8.0 12427.92
> 0.070.01N   O   1998-02-05  1998-01-18  1998-02-09
>   TAKE BACK RETURNREG AIR ly regular deposits snooze. unusual, even 
> 2006  12  15  653382  6048060 798079  1   25.025194.0 0.07
> 0.0 N   O 

[jira] [Commented] (DRILL-5905) Exclude jdk-tools from project dependencies

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16218013#comment-16218013
 ] 

ASF GitHub Bot commented on DRILL-5905:
---

GitHub user vrozov opened a pull request:

https://github.com/apache/drill/pull/1009

DRILL-5905: Exclude jdk-tools from project dependencies



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/drill DRILL-5905

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1009.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1009


commit 27b1deb03dbf6697715a9f368512f73b7b4e59c8
Author: Vlad Rozov 
Date:   2017-10-25T02:10:37Z

DRILL-5905: Exclude jdk-tools from project dependencies




> Exclude jdk-tools from project dependencies
> ---
>
> Key: DRILL-5905
> URL: https://issues.apache.org/jira/browse/DRILL-5905
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Reporter: Vlad Rozov
>Assignee: Vlad Rozov
>Priority: Minor
>
> hadoop-annotations and hbase-annotations have a system-scope dependency on the 
> JDK's tools.jar. This dependency is provided by the JDK and should be excluded 
> from the project dependencies.
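A minimal sketch of how such an exclusion typically looks in a consuming module's pom.xml. The `jdk.tools:jdk.tools` coordinates are the ones conventionally used for the system-scoped tools.jar; the `hadoop-common` artifact here is only an illustrative carrier of the transitive hadoop-annotations dependency, not necessarily the exact module changed in this PR:

```xml
<!-- Hypothetical sketch: cut off the system-scoped tools.jar that
     hadoop-annotations drags in transitively. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <exclusions>
    <exclusion>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```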



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5905) Exclude jdk-tools from project dependencies

2017-10-24 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-5905:
-

 Summary: Exclude jdk-tools from project dependencies
 Key: DRILL-5905
 URL: https://issues.apache.org/jira/browse/DRILL-5905
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Reporter: Vlad Rozov
Assignee: Vlad Rozov
Priority: Minor


hadoop-annotations and hbase-annotations have system scope dependency on JDK 
tools.jar. This dependency is provided by JDK and should be excluded from the 
project dependencies





[jira] [Commented] (DRILL-5879) Optimize "Like" operator

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217807#comment-16217807
 ] 

ASF GitHub Bot commented on DRILL-5879:
---

Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1001#discussion_r146708658
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java
 ---
@@ -17,37 +17,166 @@
  */
 package org.apache.drill.exec.expr.fn.impl;
 
-public class SqlPatternContainsMatcher implements SqlPatternMatcher {
+public final class SqlPatternContainsMatcher implements SqlPatternMatcher {
   final String patternString;
   CharSequence charSequenceWrapper;
   final int patternLength;
+  final MatcherFcn matcherFcn;
 
   public SqlPatternContainsMatcher(String patternString, CharSequence 
charSequenceWrapper) {
-this.patternString = patternString;
+this.patternString   = patternString;
 this.charSequenceWrapper = charSequenceWrapper;
-patternLength = patternString.length();
+patternLength= patternString.length();
+
+// The idea is to write loops with simple condition checks to allow 
the Java Hotspot achieve
+// better optimizations (especially vectorization)
+if (patternLength == 1) {
+  matcherFcn = new Matcher1();
--- End diff --

Padma, I have two reasons for the added complexity:
1) The new code is encapsulated within the Contains matching logic; it doesn't 
increase overall code complexity.
2) 
o I created a test with the original match logic; the pattern and input were 
Strings, though passed as CharSequence.
o I ran the test with both the new and the old method (1 billion iterations) on 
MacOS.
o pattern length 
o The old match method completed in 43sec, whereas the new one completed in 
15sec.
o The reason for the speedup is that the custom matcher functions execute fewer 
instructions (loads and comparisons).
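The specialization idea discussed above can be sketched as follows. This is a hypothetical illustration of picking a matcher whose inner loop is tailored to the pattern length, not Drill's actual SqlPatternContainsMatcher code; the class and method names are invented for the example:

```java
// Sketch: a pattern-length-specialized "contains" matcher. Short, simple
// loops give the JIT (HotSpot) a better chance to optimize/vectorize.
interface MatcherFcn {
    boolean match(CharSequence input);
}

final class ContainsMatcher {
    private final String pattern;
    private final MatcherFcn fcn;

    ContainsMatcher(String pattern) {
        this.pattern = pattern;
        // Choose a specialized loop for the common 1-char case,
        // and a general fallback otherwise.
        this.fcn = pattern.length() == 1 ? this::matchOneChar : this::matchGeneral;
    }

    // Single-character pattern: one comparison per input position.
    private boolean matchOneChar(CharSequence in) {
        char p = pattern.charAt(0);
        for (int i = 0; i < in.length(); i++) {
            if (in.charAt(i) == p) {
                return true;
            }
        }
        return false;
    }

    // General fallback: nested loop over all pattern positions.
    private boolean matchGeneral(CharSequence in) {
        int pLen = pattern.length();
        for (int i = 0; i <= in.length() - pLen; i++) {
            int j = 0;
            while (j < pLen && in.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == pLen) {
                return true;
            }
        }
        return false;
    }

    boolean matches(CharSequence in) {
        return fcn.match(in);
    }
}
```

The matcher function is selected once at construction time, so the per-row hot path contains only the loop itself, with no per-call dispatch on pattern length.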


> Optimize "Like" operator
> 
>
> Key: DRILL-5879
> URL: https://issues.apache.org/jira/browse/DRILL-5879
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
> Environment: * 
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Minor
> Fix For: 1.12.0
>
>
> Query: select  from  where colA like '%a%' or colA like 
> '%xyz%';
> Improvement Opportunities
> # Avoid isAscii computation (full access of the input string) since we're 
> dealing with the same column twice
> # Optimize the "contains" for-loop 
> Implementation Details
> 1)
> * Added a new integer variable "asciiMode" to the VarCharHolder class
> * The default value is -1 which indicates this info is not known
> * Otherwise this value will be set to either 1 or 0 based on the string being 
> in ASCII mode or Unicode
> * The execution plan already shares the same VarCharHolder instance for all 
> evaluations of the same column value
> * The asciiMode will be correctly set during the first LIKE evaluation and 
> will be reused across other LIKE evaluations
> 2) 
> * The "Contains" LIKE operation is quite expensive as the code needs to 
> access the input string to perform character based comparisons
> * Created 4 versions of the same for-loop to a) make the loop simpler to 
> optimize (Vectorization) and b) minimize comparisons
> Benchmarks
> * Lineitem table 100GB
> * Query: select l_returnflag, count(*) from dfs.`` where l_comment 
> not like '%a%' or l_comment like '%the%' group by l_returnflag
> * Before changes: 33sec
> * After changes: 27sec





[jira] [Commented] (DRILL-5879) Optimize "Like" operator

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217782#comment-16217782
 ] 

ASF GitHub Bot commented on DRILL-5879:
---

Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1001#discussion_r146705325
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java
 ---
@@ -73,6 +73,9 @@ public void eval() {
 out.start =  in.start;
 if (charCount <= length.value || length.value == 0 ) {
   out.end = in.end;
+  if (charCount == (out.end - out.start)) {
+out.asciiMode = 
org.apache.drill.exec.expr.holders.VarCharHolder.CHAR_MODE_IS_ASCII; // we can 
conclude this string is ASCII
--- End diff --

- As previously stated (when responding to Paul's comment), the expression 
framework is able to reuse the same VarCharHolder input variable when it is 
shared amongst multiple expressions
- If the original column was of type var-binary, then the expression framework 
will include a cast to var-char
- The cast logic will also compute the string length
- We use this information to deduce whether the string is pure ASCII or not
- UTF-8 encoding uses 1 byte for ASCII characters and 2, 3, or 4 bytes for 
other character sets
- If the encoded length and the character length are equal, then the string is 
pure ASCII
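The byte-length-versus-character-count deduction described above can be sketched as follows. This is an illustrative standalone helper, not Drill's actual holder or cast code:

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheck {
    // UTF-8 encodes ASCII characters in exactly 1 byte and everything else
    // in 2-4 bytes. So if the encoded byte length equals the character
    // count, every character took one byte and the string is pure ASCII.
    static boolean isAscii(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length == s.length();
    }
}
```

Note that `String.length()` counts UTF-16 code units; for any non-ASCII character the UTF-8 byte count exceeds the code-unit count, so the equality test still only holds for pure-ASCII strings.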


> Optimize "Like" operator
> 
>
> Key: DRILL-5879
> URL: https://issues.apache.org/jira/browse/DRILL-5879
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
> Environment: * 
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Minor
> Fix For: 1.12.0
>
>
> Query: select  from  where colA like '%a%' or colA like 
> '%xyz%';
> Improvement Opportunities
> # Avoid isAscii computation (full access of the input string) since we're 
> dealing with the same column twice
> # Optimize the "contains" for-loop 
> Implementation Details
> 1)
> * Added a new integer variable "asciiMode" to the VarCharHolder class
> * The default value is -1 which indicates this info is not known
> * Otherwise this value will be set to either 1 or 0 based on the string being 
> in ASCII mode or Unicode
> * The execution plan already shares the same VarCharHolder instance for all 
> evaluations of the same column value
> * The asciiMode will be correctly set during the first LIKE evaluation and 
> will be reused across other LIKE evaluations
> 2) 
> * The "Contains" LIKE operation is quite expensive as the code needs to 
> access the input string to perform character based comparisons
> * Created 4 versions of the same for-loop to a) make the loop simpler to 
> optimize (Vectorization) and b) minimize comparisons
> Benchmarks
> * Lineitem table 100GB
> * Query: select l_returnflag, count(*) from dfs.`` where l_comment 
> not like '%a%' or l_comment like '%the%' group by l_returnflag
> * Before changes: 33sec
> * After changes: 27sec





[jira] [Updated] (DRILL-5864) Selecting a non-existing field from a MapR-DB JSON table fails with NPE

2017-10-24 Thread Padma Penumarthy (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Padma Penumarthy updated DRILL-5864:

Labels: ready-to-commit  (was: )

> Selecting a non-existing field from a MapR-DB JSON table fails with NPE
> ---
>
> Key: DRILL-5864
> URL: https://issues.apache.org/jira/browse/DRILL-5864
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators, Storage - MapRDB
>Affects Versions: 1.12.0
>Reporter: Abhishek Girish
>Assignee: Hanumath Rao Maduri
>  Labels: ready-to-commit
> Attachments: OrderByNPE.log, OrderByNPE2.log
>
>
> Query 1
> {code}
> > select C_FIRST_NAME,C_BIRTH_COUNTRY,C_BIRTH_YEAR,C_BIRTH_MONTH,C_BIRTH_DAY 
> > from customer ORDER BY C_BIRTH_COUNTRY ASC, C_FIRST_NAME ASC LIMIT 10;
> Error: SYSTEM ERROR: NullPointerException
>   (java.lang.NullPointerException) null
> org.apache.drill.exec.record.SchemaUtil.coerceContainer():176
> 
> org.apache.drill.exec.physical.impl.xsort.managed.BufferedBatches.convertBatch():124
> org.apache.drill.exec.physical.impl.xsort.managed.BufferedBatches.add():90
> org.apache.drill.exec.physical.impl.xsort.managed.SortImpl.addBatch():265
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch():421
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():357
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():302
> org.apache.drill.exec.record.AbstractRecordBatch.next():164
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():164
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
> org.apache.drill.exec.record.AbstractRecordBatch.next():164
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext():115
> org.apache.drill.exec.record.AbstractRecordBatch.next():164
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
> org.apache.drill.exec.record.AbstractRecordBatch.next():164
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():134
> org.apache.drill.exec.record.AbstractRecordBatch.next():164
> org.apache.drill.exec.physical.impl.BaseRootExec.next():105
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():95
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748 (state=,code=0)
> {code}
> Plan
> {code}
> 00-00Screen
> 00-01  Project(C_FIRST_NAME=[$0], C_BIRTH_COUNTRY=[$1], 
> C_BIRTH_YEAR=[$2], C_BIRTH_MONTH=[$3], C_BIRTH_DAY=[$4])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[10])
> 00-04Limit(fetch=[10])
> 00-05  SelectionVectorRemover
> 00-06Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
> 00-07  Scan(groupscan=[JsonTableGroupScan 
> [ScanSpec=JsonScanSpec 
> [tableName=maprfs:///drill/testdata/tpch/sf1/maprdb/json/range/customer, 
> condition=null], columns=[`C_FIRST_NAME`, `C_

[jira] [Commented] (DRILL-5878) TableNotFound exception is being reported for a wrong storage plugin.

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217653#comment-16217653
 ] 

ASF GitHub Bot commented on DRILL-5878:
---

Github user HanumathRao commented on the issue:

https://github.com/apache/drill/pull/996
  
@arina-ielchiieva I think this check shouldn't cause much of a performance 
impact, as it is in parser code and is already checked by Drill's custom 
overload of getTable. If this function throws an exception, it is caught and 
reported to the user. I think there is room for improvement: instead of 
checking on every code path (i.e., the valid code path as well), we could 
check only when super.getTable returns null; that way we report an error only 
when a table cannot be found. Please let me know if this approach seems 
reasonable, so that I can go ahead and change the code accordingly.
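The lookup-then-validate pattern proposed above can be sketched as follows. This is a simplified, hypothetical illustration (plain Java, invented class names), not Drill's actual schema code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: the expensive validation/error reporting runs only on the miss
// path, so a successful lookup (the common case) pays no extra cost.
class SchemaLookup {
    private final Map<String, Object> tables = new HashMap<>();
    private final String schemaName;

    SchemaLookup(String schemaName) {
        this.schemaName = schemaName;
    }

    void register(String name, Object table) {
        tables.put(name, table);
    }

    Object getTable(String name) {
        Object table = tables.get(name);   // fast path: ordinary lookup
        if (table == null) {
            // Slow path: only now build a precise error that names the
            // schema (storage plugin) actually searched.
            throw new IllegalArgumentException(
                "Table [" + name + "] not found in schema [" + schemaName + "]");
        }
        return table;
    }
}
```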


> TableNotFound exception is being reported for a wrong storage plugin.
> -
>
> Key: DRILL-5878
> URL: https://issues.apache.org/jira/browse/DRILL-5878
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Affects Versions: 1.11.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Minor
> Fix For: 1.12.0
>
>
> Drill is reporting a TableNotFound exception for the wrong storage plugin. 
> Consider the following query, where employee.json is queried using the cp plugin.
> {code}
> 0: jdbc:drill:zk=local> select * from cp.`employee.json` limit 10;
> +--++-++--+-+---++-++--++---+-+-++
> | employee_id  | full_name  | first_name  | last_name  | position_id  
> | position_title  | store_id  | department_id  | birth_date  |   
> hire_date|  salary  | supervisor_id  |  education_level  | 
> marital_status  | gender  |  management_role   |
> +--++-++--+-+---++-++--++---+-+-++
> | 1| Sheri Nowmer   | Sheri   | Nowmer | 1
> | President   | 0 | 1  | 1961-08-26  | 
> 1994-12-01 00:00:00.0  | 8.0  | 0  | Graduate Degree   | S
>| F   | Senior Management  |
> | 2| Derrick Whelply| Derrick | Whelply| 2
> | VP Country Manager  | 0 | 1  | 1915-07-03  | 
> 1994-12-01 00:00:00.0  | 4.0  | 1  | Graduate Degree   | M
>| M   | Senior Management  |
> | 4| Michael Spence | Michael | Spence | 2
> | VP Country Manager  | 0 | 1  | 1969-06-20  | 
> 1998-01-01 00:00:00.0  | 4.0  | 1  | Graduate Degree   | S
>| M   | Senior Management  |
> | 5| Maya Gutierrez | Maya| Gutierrez  | 2
> | VP Country Manager  | 0 | 1  | 1951-05-10  | 
> 1998-01-01 00:00:00.0  | 35000.0  | 1  | Bachelors Degree  | M
>| F   | Senior Management  |
> | 6| Roberta Damstra| Roberta | Damstra| 3
> | VP Information Systems  | 0 | 2  | 1942-10-08  | 
> 1994-12-01 00:00:00.0  | 25000.0  | 1  | Bachelors Degree  | M
>| F   | Senior Management  |
> | 7| Rebecca Kanagaki   | Rebecca | Kanagaki   | 4
> | VP Human Resources  | 0 | 3  | 1949-03-27  | 
> 1994-12-01 00:00:00.0  | 15000.0  | 1  | Bachelors Degree  | M
>| F   | Senior Management  |
> | 8| Kim Brunner| Kim | Brunner| 11   
> | Store Manager   | 9 | 11 | 1922-08-10  | 
> 1998-01-01 00:00:00.0  | 1.0  | 5  | Bachelors Degree  | S
>| F   | Store Management   |
> | 9| Brenda Blumberg| Brenda  | Blumberg   | 11   
> | Store Manager   | 21| 11 | 1979-06-23  | 
> 1998-01-01 00:00:00.0  | 17000.0  | 5  | Graduate Degree   | M
>| F   | Store Management   |
> | 10   | Darren Stanz   | Darren  | Stanz  | 5
> | VP Finance  | 0 | 5  | 1949-08-26  | 
> 1994-12-01 00:00:00.0  | 5.0  | 1  

[jira] [Resolved] (DRILL-5706) Select * on hbase table having multiple regions(one or more empty) returns wrong result intermittently

2017-10-24 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5706.

Resolution: Fixed

> Select * on hbase table having multiple regions(one or more empty) returns 
> wrong result intermittently
> --
>
> Key: DRILL-5706
> URL: https://issues.apache.org/jira/browse/DRILL-5706
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Paul Rogers
>
> 1) Create a hbase table with 4 regions
> {code}
> create 'myhbase', 'cf1', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col1','somedata'
> put 'myhbase','c','cf1:col1','somedata'
> {code}
> 2) Run select * on the hbase table
> {code}
> select * from hbase.myhbase;
> {code}
> The query intermittently returns wrong results





[jira] [Updated] (DRILL-5795) Filter pushdown for parquet handles multi rowgroup file

2017-10-24 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5795:
---
Reviewer:   (was: Paul Rogers)

> Filter pushdown for parquet handles multi rowgroup file
> ---
>
> Key: DRILL-5795
> URL: https://issues.apache.org/jira/browse/DRILL-5795
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.11.0
>Reporter: Damien Profeta
>Assignee: Damien Profeta
>  Labels: doc-impacting
> Fix For: 1.12.0
>
> Attachments: multirowgroup_overlap.parquet
>
>
> DRILL-1950 implemented filter pushdown for Parquet files, but only for the 
> case of one rowgroup per Parquet file. In the case of multiple rowgroups per 
> file, it detects that a rowgroup can be pruned but then tells the drillbit 
> to read the whole file, which leads to performance issues.
> Having multiple rowgroups per file helps to handle partitioned datasets and 
> still read only the relevant subset of data, without ending up with more 
> files than really needed.
> Let's say, for instance, you have a Parquet file composed of RG1 and RG2 
> with only one column a. Min/max in RG1 are 1-2 and min/max in RG2 are 2-3.
> If I do "select a from file where a=3", today it will read the whole file; 
> with the patch it will only read RG2.
> *For documentation*
> The Support / Other section in 
> https://drill.apache.org/docs/parquet-filter-pushdown/ should be updated: 
> after the fix, files with multiple row groups will be supported.
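The min/max pruning from the RG1/RG2 example above can be sketched as follows. This is an illustrative standalone sketch of rowgroup pruning on column statistics, with invented names, not the actual Drill/Parquet metadata API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: keep only rowgroups whose [min, max] statistics range can
// possibly contain the value of an equality predicate "a = value".
class RowGroupPruner {
    record RowGroupStats(String name, long min, long max) {}

    static List<RowGroupStats> prune(List<RowGroupStats> groups, long value) {
        List<RowGroupStats> kept = new ArrayList<>();
        for (RowGroupStats rg : groups) {
            if (rg.min() <= value && value <= rg.max()) {
                kept.add(rg);   // rowgroup may contain matching rows
            }
        }
        return kept;            // pruned rowgroups are never read
    }
}
```

With RG1 having min/max 1-2 and RG2 having min/max 2-3, pruning for `a = 3` keeps only RG2, matching the example in the description.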





[jira] [Resolved] (DRILL-5873) Drill C++ Client should throw proper/complete error message for the ODBC driver to consume

2017-10-24 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra resolved DRILL-5873.
--
Resolution: Fixed

> Drill C++ Client should throw proper/complete error message for the ODBC 
> driver to consume
> --
>
> Key: DRILL-5873
> URL: https://issues.apache.org/jira/browse/DRILL-5873
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Reporter: Krystal
>Assignee: Parth Chandra
>  Labels: ready-to-commit
>
> The Drill C++ Client should throw a proper/complete error message for the 
> driver to utilize.
> The ODBC driver is directly outputting the exception message thrown by the 
> client by calling the getError() API after the connect() API has failed with 
> an error status.
> For the Java client, similar logic is hard coded at 
> https://github.com/apache/drill/blob/1.11.0/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java#L247.





[jira] [Updated] (DRILL-5286) When rel and target candidate set is the same, planner should not need to do convert for the relNode since it must have been done

2017-10-24 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5286:
---
Reviewer:   (was: Paul Rogers)

> When rel and target candidate set is the same, planner should not need to do 
> convert for the relNode since it must have been done
> -
>
> Key: DRILL-5286
> URL: https://issues.apache.org/jira/browse/DRILL-5286
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Chunhui Shi
>Assignee: Chunhui Shi
>






[jira] [Commented] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217459#comment-16217459
 ] 

ASF GitHub Bot commented on DRILL-5896:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146658667
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -75,6 +75,8 @@
 
   private TableName hbaseTableName;
   private Scan hbaseScan;
+  private Scan hbaseScan1;
+  Set completeFamilies;
--- End diff --

`private`?


> Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
> --
>
> Key: DRILL-5896
> URL: https://issues.apache.org/jira/browse/DRILL-5896
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> When an HBase query projects both a column family and a column in that 
> family, the vector for the column is not created in the HBaseRecordReader.
> So, in cases where the scan batch is empty, we create a NullableInt vector 
> for this column. We need to handle column creation in the reader.





[jira] [Commented] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217458#comment-16217458
 ] 

ASF GitHub Bot commented on DRILL-5896:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146659105
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -121,16 +125,18 @@ public HBaseRecordReader(Connection connection, 
HBaseSubScan.HBaseSubScanSpec su
 byte[] family = root.getPath().getBytes();
 transformed.add(SchemaPath.getSimplePath(root.getPath()));
 PathSegment child = root.getChild();
-if (!completeFamilies.contains(new String(family, 
StandardCharsets.UTF_8).toLowerCase())) {
-  if (child != null && child.isNamed()) {
-byte[] qualifier = child.getNameSegment().getPath().getBytes();
+if (child != null && child.isNamed()) {
+  byte[] qualifier = child.getNameSegment().getPath().getBytes();
+  hbaseScan1.addColumn(family, qualifier);
+  if (!completeFamilies.contains(new String(family, 
StandardCharsets.UTF_8))) {
--- End diff --

Redundant conversion of `family` to `String`, here and below.
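The suggested fix — decoding `family` to a `String` once and reusing it for every membership check — can be sketched generically. The class and field names below mirror the diff but are illustrative, not Drill's actual reader code:

```java
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: convert the family bytes once per projected column
// instead of once per contains() check.
public class FamilyLookup {
    final Set<String> completeFamilies = new HashSet<>();

    boolean isCompleteFamily(byte[] family) {
        // Single conversion, reused for every check that needs the name.
        String familyName = new String(family, StandardCharsets.UTF_8);
        return completeFamilies.contains(familyName);
    }
}
```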


> Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
> --
>
> Key: DRILL-5896
> URL: https://issues.apache.org/jira/browse/DRILL-5896
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> When an HBase query projects both a column family and a column in that
> family, the vector for the column is not created in the HBaseRecordReader.
> So, in cases where the scan batch is empty, we create a NullableInt vector
> for this column. We need to handle column creation in the reader.





[jira] [Commented] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217457#comment-16217457
 ] 

ASF GitHub Bot commented on DRILL-5896:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146658771
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -87,6 +89,7 @@ public HBaseRecordReader(Connection connection, 
HBaseSubScan.HBaseSubScanSpec su
 hbaseTableName = TableName.valueOf(
 Preconditions.checkNotNull(subScanSpec, "HBase reader needs a 
sub-scan spec").getTableName());
 hbaseScan = new Scan(subScanSpec.getStartRow(), 
subScanSpec.getStopRow());
+hbaseScan1 = new Scan();
--- End diff --

Better name or comment to explain.


> Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
> --
>
> Key: DRILL-5896
> URL: https://issues.apache.org/jira/browse/DRILL-5896
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> When an HBase query projects both a column family and a column in that
> family, the vector for the column is not created in the HBaseRecordReader.
> So, in cases where the scan batch is empty, we create a NullableInt vector
> for this column. We need to handle column creation in the reader.





[jira] [Commented] (DRILL-5896) Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217460#comment-16217460
 ] 

ASF GitHub Bot commented on DRILL-5896:
---

Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146659717
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -186,6 +192,10 @@ public void setup(OperatorContext context, 
OutputMutator output) throws Executio
   }
 }
   }
+
+  for (String familyName : completeFamilies) {
+getOrCreateFamilyVector(familyName, false);
+  }
--- End diff --

Does this create just the map, or also the vectors within the map? Maybe a 
comment to explain the goals?
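A hedged sketch of the get-or-create pattern the loop above relies on: creating a container for every known column family up front means an empty scan batch still carries the right schema instead of a NullableInt stand-in. The class and methods here are illustrative, not Drill's vector API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a cache that creates one container ("map
// vector") per column family on first request and reuses it afterwards.
public class FamilyVectorCache {
    private final Map<String, Map<String, String>> familyVectors = new LinkedHashMap<>();

    // Returns the existing container for the family, creating it if absent.
    Map<String, String> getOrCreateFamilyVector(String familyName) {
        return familyVectors.computeIfAbsent(familyName, f -> new LinkedHashMap<>());
    }

    int familyCount() {
        return familyVectors.size();
    }
}
```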


> Handle vector creation in HbaseRecordReader to avoid NullableInt vectors later
> --
>
> Key: DRILL-5896
> URL: https://issues.apache.org/jira/browse/DRILL-5896
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> When an HBase query projects both a column family and a column in that
> family, the vector for the column is not created in the HBaseRecordReader.
> So, in cases where the scan batch is empty, we create a NullableInt vector
> for this column. We need to handle column creation in the reader.





[jira] [Updated] (DRILL-5890) Tests Leak Many Open File Descriptors

2017-10-24 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-5890:
-
Labels:   (was: read)

> Tests Leak Many Open File Descriptors
> -
>
> Key: DRILL-5890
> URL: https://issues.apache.org/jira/browse/DRILL-5890
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: salim achouche
>
> Salim and I have discovered that the tests leak many open file descriptors
> and can hang even with a 64k open-file limit. Running lsof periodically
> shows the number of open files steadily grows over time as the tests run.
> Fixing this would likely speed up the unit tests and prevent developers
> from scratching their heads about why the tests are hanging or throwing
> Too Many Open Files exceptions.
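The leak pattern described above can be illustrated with a minimal, generic sketch; this is not code from the Drill tests themselves:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration of a file-descriptor leak and its fix.
public class FdLeak {
    // Leaky: if the caller never closes the returned stream, the file
    // descriptor stays open for the life of the JVM (or until finalization).
    static InputStream openLeaky(Path p) throws IOException {
        return Files.newInputStream(p);
    }

    // Safe: try-with-resources guarantees the descriptor is released,
    // even when reading throws.
    static long sizeSafe(Path p) throws IOException {
        try (InputStream in = Files.newInputStream(p)) {
            long n = 0;
            while (in.read() != -1) {
                n++;
            }
            return n;
        }
    }
}
```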





[jira] [Updated] (DRILL-5890) Tests Leak Many Open File Descriptors

2017-10-24 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra updated DRILL-5890:
-
Labels: read  (was: )

> Tests Leak Many Open File Descriptors
> -
>
> Key: DRILL-5890
> URL: https://issues.apache.org/jira/browse/DRILL-5890
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Timothy Farkas
>Assignee: salim achouche
>
> Salim and I have discovered that the tests leak many open file descriptors
> and can hang even with a 64k open-file limit. Running lsof periodically
> shows the number of open files steadily grows over time as the tests run.
> Fixing this would likely speed up the unit tests and prevent developers
> from scratching their heads about why the tests are hanging or throwing
> Too Many Open Files exceptions.





[jira] [Assigned] (DRILL-5869) Empty maps not handled

2017-10-24 Thread Hanumath Rao Maduri (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanumath Rao Maduri reassigned DRILL-5869:
--

Assignee: Hanumath Rao Maduri

> Empty maps not handled 
> ---
>
> Key: DRILL-5869
> URL: https://issues.apache.org/jira/browse/DRILL-5869
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Hanumath Rao Maduri
>
> Consider the below json -
> {code}
> {a:{}}
> {code}
> A query on the column 'a' throws NPE -
> {code}
> select a from temp.json;
> {code}
> Stack trace -
> {code}
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> NullPointerException
> Fragment 0:0
> [Error Id: 7f81fa02-4b20-4401-9d18-bd901653d11d on pns182.qa.lab:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_144]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_144]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_144]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.test.generated.ProjectorGen0.setup(ProjectorTemplate.java:91)
>  ~[na:na]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput(ProjectRecordBatch.java:497)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:505)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:82)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:105) 
> ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:95) 
> ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:234)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:227)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at java.security.AccessController.doPrivileged(Native Method) 
> ~[na:1.8.0_144]
>   at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_144]
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
>  ~[hadoop-common-2.7.0-mapr-1607.jar:na]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:227)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   ... 4 common frames omitted
> {code}





[jira] [Created] (DRILL-5904) TestCTAS.testPartitionByForAllType fails sporadically on some build machines

2017-10-24 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-5904:
-

 Summary: TestCTAS.testPartitionByForAllType fails sporadically on 
some build machines
 Key: DRILL-5904
 URL: https://issues.apache.org/jira/browse/DRILL-5904
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas


Vlad found that the TestCTAS.testPartitionByForAllType test sporadically fails 
with this stack trace.

testPartitionByForAllTypes(org.apache.drill.exec.sql.TestCTAS)
java.lang.Exception: test timed out after 10 milliseconds
at java.io.UnixFileSystem.canonicalize0(Native Method) ~[na:1.7.0_131]
at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:172) 
~[na:1.7.0_131]
at java.io.File.getCanonicalPath(File.java:618) ~[na:1.7.0_131]
at java.io.File.getCanonicalFile(File.java:643) ~[na:1.7.0_131]
at org.apache.commons.io.FileUtils.isSymlink(FileUtils.java:2935) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1534) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.deleteQuietly(FileUtils.java:1566) 
~[commons-io-2.4.jar:2.4]
at 
org.apache.drill.exec.sql.TestCTAS.testPartitionByForAllTypes(TestCTAS.java:292)
 ~[test-classes/:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[na:1.7.0_131]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
~[na:1.7.0_131]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.7.0_131]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_131]
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 ~[junit-4.11.jar:na]
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 ~[junit-4.11.jar:na]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 ~[junit-4.11.jar:na]
at 
mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
 ~[jmockit-1.3.jar:na]
at 
mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
 ~[jmockit-1.3.jar:na]
at 
mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
 ~[jmockit-1.3.jar:na]
at sun.reflect.GeneratedMethodAccessor97.invoke(Unknown Source) ~[na:na]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.7.0_131]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_131]
at 
mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
 ~[jmockit-1.3.jar:na]
at 
mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76) 
~[jmockit-1.3.jar:na]
at 
mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) 
~[jmockit-1.3.jar:na]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java) 
~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 ~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
 ~[junit-4.11.jar:na]
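The timeout above fires inside commons-io's recursive directory cleanup. For illustration only, a JDK-only equivalent of `FileUtils.deleteQuietly` applied to a directory tree, to show what the test's cleanup step is doing:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch of recursive, error-swallowing directory deletion.
public class DeleteTree {
    static void deleteQuietly(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            // Delete children before parents by visiting paths deepest-first.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException ignored) {
                    // "quietly": swallow failures, like FileUtils.deleteQuietly
                }
            });
        } catch (IOException ignored) {
            // root missing or unreadable: also swallowed
        }
    }
}
```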





[jira] [Commented] (DRILL-5706) Select * on hbase table having multiple regions(one or more empty) returns wrong result intermittently

2017-10-24 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216919#comment-16216919
 ] 

Vitalii Diravka commented on DRILL-5706:


[~paul-rogers] Can we close this, since it was fixed in the context of DRILL-5830?

> Select * on hbase table having multiple regions(one or more empty) returns 
> wrong result intermittently
> --
>
> Key: DRILL-5706
> URL: https://issues.apache.org/jira/browse/DRILL-5706
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Paul Rogers
>
> 1) Create a hbase table with 4 regions
> {code}
> create 'myhbase', 'cf1', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col1','somedata'
> put 'myhbase','c','cf1:col1','somedata'
> {code}
> 2) Run select * on the hbase table
> {code}
> select * from hbase.myhbase;
> {code}
> The query intermittently returns wrong results





[jira] [Assigned] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query

2017-10-24 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5900:


Assignee: Boaz Ben-Zvi  (was: Arina Ielchiieva)

> Regression: TPCH query encounters random IllegalStateException: Memory was 
> leaked by query
> --
>
> Key: DRILL-5900
> URL: https://issues.apache.org/jira/browse/DRILL-5900
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
> Attachments: 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f.sys.drill, 
> drillbit.log.node81, drillbit.log.node88
>
>
> This is a random failure in the TPCH-SF100-baseline run.  The test is 
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql.
>   This test has passed before.
> TPCH query 17:
> {noformat}
> SELECT
>   SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>   lineitem L,
>   part P
> WHERE
>   P.P_PARTKEY = L.L_PARTKEY
>   AND P.P_BRAND = 'BRAND#13'
>   AND P.P_CONTAINER = 'JUMBO CAN'
>   AND L.L_QUANTITY < (
> SELECT
>   0.2 * AVG(L2.L_QUANTITY)
> FROM
>   lineitem L2
> WHERE
>   L2.L_PARTKEY = P.P_PARTKEY
>   )
> {noformat}
> Error is:
> {noformat}
> 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
> Memory was leaked by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Memory was leaked by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_51]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
> Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) 
> ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) 
> ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> ... 5 common frames omitted
> 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:6:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:6:0: State to report: FINISHED
> {noformat}
> sys.version is:
> 1.12.0-SNAPSHOT   b0c4e0486d6d462

[jira] [Assigned] (DRILL-5903) Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5903:


Assignee: Arina Ielchiieva  (was: Pritesh Maker)

> Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete."
> ---
>
> Key: DRILL-5903
> URL: https://issues.apache.org/jira/browse/DRILL-5903
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata, Storage - Parquet
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Arina Ielchiieva
>Priority: Critical
> Attachments: 26122f83-6956-5aa8-d8de-d4808f572160.sys.drill, 
> drillbit.log
>
>
> This is a query from the Functional-Baseline-100.171 run.  The test is 
> /root/drillAutomation/mapr/framework/resources/Functional/parquet_storage/parquet_date/mc_parquet_date/generic/mixed1_partitioned5.q.
> Query is:
> {noformat}
> select a.int_col, b.date_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
> inner join ( select date_col, int_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
> where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as 
> date)= date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
> a.int_col, b.date_col
> {noformat}
> From drillbit.log:
> {noformat}
> fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM 
> vwOnParq_wCst_35
> 2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
> o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete. Total runnable size 3, parallelism 3.
> 2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
> but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 
> 3, parallelism 3.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
> 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
> runnable size 3, parallelism 3.
> [Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
> [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:227)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.<init>(ParquetGroupScan.java:190)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811)
>  [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
>   at
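The "Waited for 15000ms" failure above comes from running the metadata-fetch tasks in parallel with a bounded overall wait. A generic stdlib sketch of that pattern — not Drill's actual `TimedRunnable` implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Run N tasks in parallel and fail if they do not all finish in time.
public class TimedTasks {
    static <T> List<T> runAll(List<Callable<T>> tasks, long timeoutMs,
                              int parallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            // invokeAll cancels any task still running when the timeout expires.
            List<Future<T>> futures =
                pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
            List<T> results = new ArrayList<>();
            for (Future<T> f : futures) {
                if (f.isCancelled()) {
                    throw new TimeoutException("tasks are not complete");
                }
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdownNow();
        }
    }
}
```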

[jira] [Assigned] (DRILL-1131) Drill should ignore files starting with . or _

2017-10-24 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-1131:


Assignee: Timothy Farkas  (was: Pritesh Maker)

> Drill should ignore files starting with . or _
> --
>
> Key: DRILL-1131
> URL: https://issues.apache.org/jira/browse/DRILL-1131
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Parquet
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Timothy Farkas
> Fix For: Future
>
>
> Files starting with . or _ are ignored by Hive and other tools, as these
> are typically logs and status files written out by tools like MapReduce.
> Drill should not read them when querying a directory containing a list of
> parquet files.
> Currently it fails with the error:
> message: "Failure while setting up Foreman. < AssertionError:[ Internal 
> error: Error while applying rule DrillPushProjIntoScan, args 
> [rel#78:ProjectRel.NONE.ANY([]).[](child=rel#15:Subset#1.ENUMERABLE.ANY([]).[],p_partkey=$1,p_type=$2),
>  rel#8:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[dfs, 
> drillTestDirDencTpchSF100, part])] ] < DrillRuntimeException:[ 
> java.io.IOException: Could not read footer: java.io.IOException: Could not 
> read footer for file com.mapr.fs.MapRFileStatus@99c9d45e ] < IOException:[ 
> Could not read footer: java.io.IOException: Could not read footer for file 
> com.mapr.fs.MapRFileStatus@99c9d45e ] < IOException:[ Could not read footer 
> for file com.mapr.fs.MapRFileStatus@99c9d45e ] < IOException:[ Open failed 
> for file: /drill/testdata/dencSF100/part/.impala_insert_staging, error: 
> Invalid argument (22) ]"





[jira] [Assigned] (DRILL-5891) When Drill runs out of memory for a HashAgg, it should tell the user how much memory to allocate

2017-10-24 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5891:


Assignee: Boaz Ben-Zvi  (was: Pritesh Maker)

> When Drill runs out of memory for a HashAgg, it should tell the user how much 
> memory to allocate
> 
>
> Key: DRILL-5891
> URL: https://issues.apache.org/jira/browse/DRILL-5891
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Boaz Ben-Zvi
>
> Query is:
> select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` 
> group by no_nulls_col, nulls_col;
> Error is:
> Error: RESOURCE ERROR: Not enough memory for internal partitioning and 
> fallback mechanism for HashAgg to use unbounded memory is disabled. Either 
> enable fallback config drill.exec.hashagg.fallback.enabled using Alter 
> session/system command or increase memory limit for Drillbit
> From drillbit.log:
> {noformat}
> 2017-10-18 13:30:17,135 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:8] TRACE 
> o.a.d.e.p.i.aggregate.HashAggregator - Incoming sizer: Actual batch schema & 
> sizes {
>   no_nulls_col(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual 
> size: 130, data size: 132892)
>   nulls_col(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 
> 112, data size: 113673)
>   EXPR$0(type: REQUIRED BIGINT, count: 1023, std size: 8, actual size: 8, 
> data size: 8184)
>   EXPR$1(type: OPTIONAL VARCHAR, count: 1023, std size: 54, actual size: 18, 
> data size: 18414)
>   Records: 1023, Total size: 524288, Data size: 273163, Gross row width: 513, 
> Net row width: 268, Density: 53%}
> 2017-10-18 13:30:17,135 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:8] TRACE 
> o.a.d.e.p.i.aggregate.HashAggregator - 2nd phase. Estimated internal row 
> width: 166 Values row width: 66 batch size: 12779520  memory limit: 63161283  
> max column width: 50
> 2017-10-18 13:30:17,139 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:3:2] TRACE 
> o.a.d.e.p.impl.common.HashTable - HT allocated 4784128 for varchar of max 
> width 50
> 2017-10-18 13:30:17,139 [26184629-3f4c-856a-e99e-97cdf0d29321:frag:1:15] INFO 
>  o.a.d.e.p.i.aggregate.HashAggregator - User Error Occurred: Not enough 
> memory for internal partitioning and fallback mechanism for HashAgg to use 
> unbounded memory is disabled. Either enable fallback config 
> drill.exec.hashagg.fallback.enabled using Alter session/system command or 
> increase memory limit for Drillbit
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Not enough 
> memory for internal partitioning and fallback mechanism for HashAgg to use 
> unbounded memory is disabled. Either enable fallback config 
> drill.exec.hashagg.fallback.enabled using Alter session/system command or 
> increase memory limit for Drillbit
> {noformat}
> I would recommend that we add a log message showing the "alter" command
> that increases the amount of memory allocated, and how much memory to
> allocate. Otherwise, the user may not know what to do.
> I would also suggest not enabling "drill.exec.hashagg.fallback.enabled"
> except as a last resort.





[jira] [Assigned] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query

2017-10-24 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5900:


Assignee: Arina Ielchiieva  (was: Pritesh Maker)

> Regression: TPCH query encounters random IllegalStateException: Memory was 
> leaked by query
> --
>
> Key: DRILL-5900
> URL: https://issues.apache.org/jira/browse/DRILL-5900
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Arina Ielchiieva
>Priority: Blocker
> Attachments: 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f.sys.drill, 
> drillbit.log.node81, drillbit.log.node88
>
>
> This is a random failure in the TPCH-SF100-baseline run.  The test is 
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql.
>   This test has passed before.
> TPCH query 17:
> {noformat}
> SELECT
>   SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>   lineitem L,
>   part P
> WHERE
>   P.P_PARTKEY = L.L_PARTKEY
>   AND P.P_BRAND = 'BRAND#13'
>   AND P.P_CONTAINER = 'JUMBO CAN'
>   AND L.L_QUANTITY < (
> SELECT
>   0.2 * AVG(L2.L_QUANTITY)
> FROM
>   lineitem L2
> WHERE
>   L2.L_PARTKEY = P.P_PARTKEY
>   )
> {noformat}
> Error is:
> {noformat}
> 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR 
> o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
> Memory was leaked by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Memory was leaked by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_51]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_51]
> at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
> Caused by: java.lang.IllegalStateException: Memory was leaked by query. 
> Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> at 
> org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) 
> ~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435)
>  ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) 
> ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> ... 5 common frames omitted
> 2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:6:0] INFO  
> o.a.d.e.w.f.FragmentStatusReporter - 
> 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:6:0: State to report: FINISHED
> {noformat}
> sys.version is:
> 1.12.0-SNAPSHOT   b0c4e0486d

[jira] [Assigned] (DRILL-5902) Regression: Queries encounter random failure due to RPC connection timed out

2017-10-24 Thread Pritesh Maker (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5902:


Assignee: Arina Ielchiieva  (was: Pritesh Maker)

> Regression: Queries encounter random failure due to RPC connection timed out
> 
>
> Key: DRILL-5902
> URL: https://issues.apache.org/jira/browse/DRILL-5902
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Arina Ielchiieva
>Priority: Critical
> Attachments: 261230f7-e3b9-0cee-22d8-921cb56e3e12.sys.drill, 
> node196.drillbit.log
>
>
> Multiple random failures (25) occurred with the latest 
> Functional-Baseline-88.193 run.  Here is a sample query:
> {noformat}
> /root/drillAutomation/prasadns14/framework/resources/Functional/window_functions/multiple_partitions/q27.sql
> -- Kitchen sink
> -- Use all supported functions
> select
> rank()  over W,
> dense_rank()over W,
> percent_rank()  over W,
> cume_dist() over W,
> avg(c_integer + c_integer)  over W,
> sum(c_integer/100)  over W,
> count(*)over W,
> min(c_integer)  over W,
> max(c_integer)  over W,
> row_number()over W
> from
> j7
> where
> c_boolean is not null
> window  W as (partition by c_bigint, c_date, c_time, c_boolean order by 
> c_integer)
> {noformat}
> From the logs:
> {noformat}
> 2017-10-23 04:14:36,536 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> {noformat}
> {noformat}
> 2017-10-23 04:14:53,941 [UserServer-1] INFO  
> o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <--> 
> /10.10.88.193:38281 (user server) timed out.  Timeout was set to 30 seconds. 
> Closing connection.
> 2017-10-23 04:14:53,952 [UserServer-1] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING --> 
> FAILED
> 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED --> 
> FINISHED
> 2017-10-23 04:14:53,956 [UserServer-1] WARN  
> o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc 
> response.
> java.lang.IllegalArgumentException: Self-suppression not permitted
> at java.lang.Throwable.addSuppressed(Throwable.java:1043) 
> ~[na:1.7.0_45]
> at 
> org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exe

[jira] [Assigned] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order

2017-10-24 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka reassigned DRILL-5822:
--

Assignee: (was: Vitalii Diravka)

> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 
> doesn't preserve column order
> ---
>
> Key: DRILL-5822
> URL: https://issues.apache.org/jira/browse/DRILL-5822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
> Fix For: 1.12.0
>
>
> Repro steps:
> 1) {code}alter session set `planner.slice_target`=1;{code}
> 2) Include an ORDER BY clause in the query.
> Scenarios:
> {code}
> 0: jdbc:drill:zk=local> alter session reset `planner.slice_target`;
> +---++
> |  ok   |summary |
> +---++
> | true  | planner.slice_target updated.  |
> +---++
> 1 row selected (0.082 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by 
> n_name limit 1;
> +--+--+--+--+
> | n_nationkey  |  n_name  | n_regionkey  |  n_comment 
>   |
> +--+--+--+--+
> | 0| ALGERIA  | 0|  haggle. carefully final deposits 
> detect slyly agai  |
> +--+--+--+--+
> 1 row selected (0.141 seconds)
> 0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
> +---++
> |  ok   |summary |
> +---++
> | true  | planner.slice_target updated.  |
> +---++
> 1 row selected (0.091 seconds)
> 0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by 
> n_name limit 1;
> +--+--+--+--+
> |  n_comment   |  n_name  | 
> n_nationkey  | n_regionkey  |
> +--+--+--+--+
> |  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0 
>| 0|
> +--+--+--+--+
> 1 row selected (0.201 seconds)
> {code}





[jira] [Updated] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order

2017-10-24 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5822:
---
Description: 
Repro steps:
1) {code}alter session set `planner.slice_target`=1;{code}
2) Include an ORDER BY clause in the query.

Scenarios:
{code}
0: jdbc:drill:zk=local> alter session reset `planner.slice_target`;
+---++
|  ok   |summary |
+---++
| true  | planner.slice_target updated.  |
+---++
1 row selected (0.082 seconds)
0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name 
limit 1;
+--+--+--+--+
| n_nationkey  |  n_name  | n_regionkey  |  n_comment   
|
+--+--+--+--+
| 0| ALGERIA  | 0|  haggle. carefully final deposits 
detect slyly agai  |
+--+--+--+--+
1 row selected (0.141 seconds)
0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
+---++
|  ok   |summary |
+---++
| true  | planner.slice_target updated.  |
+---++
1 row selected (0.091 seconds)
0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name 
limit 1;
+--+--+--+--+
|  n_comment   |  n_name  | n_nationkey 
 | n_regionkey  |
+--+--+--+--+
|  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0   
 | 0|
+--+--+--+--+
1 row selected (0.201 seconds)
{code}

  was:
Repro steps
1) Have multiple json files in a directory having the same schema
2) Also have one or more empty files 

Scenarios
1) Only one minor fragment{code}select * from dfs.`/json_dir`;{code}
{code}Result:
+--++--+-+---++-+--+++
| row_key  | p_partkey  |  p_name  | p_mfgr 
 |  p_brand  |   p_type   | p_size  | p_container  | 
p_retailprice  |   p_comment|
+--++--+-+---++-+--+++
| 1| 1  | goldenrod lace spring peru powder| 
Manufacturer#1  | Brand#13  | PROMO BURNISHED COPPER | 7   | JUMBO PKG  
  | 901.0  | ly. slyly ironi|
| 2| 2  | blush rosy metallic lemon navajo | 
Manufacturer#1  | Brand#13  | LARGE BRUSHED BRASS| 1   | LG CASE
  | 902.0  | lar accounts amo   |
{code}
 2) One minor fragment per file
{code}alter session set `planner.slice_target`=1;
select * from dfs.`/json_dir`;{code}
Result:
{code}
+---++--+-+--+++-++--+
|  p_brand  |   p_comment| p_container  | p_mfgr  | 
 p_name  | p_partkey  | p_retailprice  | p_size  |  
 p_type   | row_key  |
+---++--+-+--+++-++--+
| Brand#13  | ly. slyly ironi| JUMBO PKG| Manufacturer#1  | 
goldenrod lace spring peru powder| 1  | 901.0  | 7  
 | PROMO BURNISHED COPPER | 1|
| Brand#13  | lar accounts amo   | LG CASE  | Manufacturer#1  | blush 
rosy metallic lemon navajo | 2  | 902.0  | 1   | 
LARGE BRUSHED BRASS| 2|
{code}
 


> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 
> doesn't preserve column order
> ---
>
> Key: DRILL-5822
> URL: https://issues.apache.org/jira/browse/DRILL-5822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>   

[jira] [Updated] (DRILL-5822) The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 doesn't preserve column order

2017-10-24 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5822:
---
Summary: The query with "SELECT *" with "ORDER BY" clause and 
`planner.slice_target`=1 doesn't preserve column order  (was: Select * on 
directory containing multiple json files (one or more empty) with same schema 
doesn't preserve column order)

> The query with "SELECT *" with "ORDER BY" clause and `planner.slice_target`=1 
> doesn't preserve column order
> ---
>
> Key: DRILL-5822
> URL: https://issues.apache.org/jira/browse/DRILL-5822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Vitalii Diravka
> Fix For: 1.12.0
>
>
> Repro steps
> 1) Have multiple json files in a directory having the same schema
> 2) Also have one or more empty files 
> Scenarios
> 1) Only one minor fragment{code}select * from dfs.`/json_dir`;{code}
> {code}Result:
> +--++--+-+---++-+--+++
> | row_key  | p_partkey  |  p_name  | 
> p_mfgr  |  p_brand  |   p_type   | p_size  | p_container  
> | p_retailprice  |   p_comment|
> +--++--+-+---++-+--+++
> | 1| 1  | goldenrod lace spring peru powder| 
> Manufacturer#1  | Brand#13  | PROMO BURNISHED COPPER | 7   | JUMBO 
> PKG| 901.0  | ly. slyly ironi|
> | 2| 2  | blush rosy metallic lemon navajo | 
> Manufacturer#1  | Brand#13  | LARGE BRUSHED BRASS| 1   | LG CASE  
> | 902.0  | lar accounts amo   |
> {code}
>  2) One minor fragment per file
> {code}alter session set `planner.slice_target`=1;
> select * from dfs.`/json_dir`;{code}
> Result:
> {code}
> +---++--+-+--+++-++--+
> |  p_brand  |   p_comment| p_container  | p_mfgr  |   
>p_name  | p_partkey  | p_retailprice  | p_size  |  
>  p_type   | row_key  |
> +---++--+-+--+++-++--+
> | Brand#13  | ly. slyly ironi| JUMBO PKG| Manufacturer#1  | 
> goldenrod lace spring peru powder| 1  | 901.0  | 7
>| PROMO BURNISHED COPPER | 1|
> | Brand#13  | lar accounts amo   | LG CASE  | Manufacturer#1  | blush 
> rosy metallic lemon navajo | 2  | 902.0  | 1   | 
> LARGE BRUSHED BRASS| 2|
> {code}
>  





[jira] [Commented] (DRILL-5822) Select * on directory containing multiple json files (one or more empty) with same schema doesn't preserve column order

2017-10-24 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216742#comment-16216742
 ] 

Vitalii Diravka commented on DRILL-5822:


[~prasadns14] 
I have reproduced the issue with the "ORDER BY" clause. Moreover, empty files 
are not necessary to hit it. 
The necessary conditions are: 
1. `planner.slice_target` = 1; 
2. an ORDER BY clause in the query.

I have updated the jira description accordingly.

DRILL-5845 is ultimately a separate issue. But this issue is reproduced for 
the TopNBatch operator as well: 
{code}
0: jdbc:drill:zk=local> alter session reset `planner.slice_target`;
+---++
|  ok   |summary |
+---++
| true  | planner.slice_target updated.  |
+---++
1 row selected (0.082 seconds)
0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name 
limit 1;
+--+--+--+--+
| n_nationkey  |  n_name  | n_regionkey  |  n_comment   
|
+--+--+--+--+
| 0| ALGERIA  | 0|  haggle. carefully final deposits 
detect slyly agai  |
+--+--+--+--+
1 row selected (0.141 seconds)
0: jdbc:drill:zk=local> alter session set `planner.slice_target`=1;
+---++
|  ok   |summary |
+---++
| true  | planner.slice_target updated.  |
+---++
1 row selected (0.091 seconds)
0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name 
limit 1;
+--+--+--+--+
|  n_comment   |  n_name  | n_nationkey 
 | n_regionkey  |
+--+--+--+--+
|  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0   
 | 0|
+--+--+--+--+
1 row selected (0.201 seconds)
{code} 
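A hedged workaround sketch based on the repro above: projecting the columns explicitly instead of using `*` pins the output column order regardless of how the distributed fragments return their batches (the query below is illustrative, not part of the original report):

```sql
ALTER SESSION SET `planner.slice_target` = 1;

-- Explicit projection preserves the declared column order even with
-- slice_target = 1 and an ORDER BY clause
SELECT n_nationkey, n_name, n_regionkey, n_comment
FROM cp.`tpch/nation.parquet`
ORDER BY n_name
LIMIT 1;
```

This only works around the symptom for hand-written queries; `SELECT *` itself still needs the fix tracked by this jira.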


> Select * on directory containing multiple json files (one or more empty) with 
> same schema doesn't preserve column order
> ---
>
> Key: DRILL-5822
> URL: https://issues.apache.org/jira/browse/DRILL-5822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Vitalii Diravka
> Fix For: 1.12.0
>
>
> Repro steps
> 1) Have multiple json files in a directory having the same schema
> 2) Also have one or more empty files 
> Scenarios
> 1) Only one minor fragment{code}select * from dfs.`/json_dir`;{code}
> {code}Result:
> +--++--+-+---++-+--+++
> | row_key  | p_partkey  |  p_name  | 
> p_mfgr  |  p_brand  |   p_type   | p_size  | p_container  
> | p_retailprice  |   p_comment|
> +--++--+-+---++-+--+++
> | 1| 1  | goldenrod lace spring peru powder| 
> Manufacturer#1  | Brand#13  | PROMO BURNISHED COPPER | 7   | JUMBO 
> PKG| 901.0  | ly. slyly ironi|
> | 2| 2  | blush rosy metallic lemon navajo | 
> Manufacturer#1  | Brand#13  | LARGE BRUSHED BRASS| 1   | LG CASE  
> | 902.0  | lar accounts amo   |
> {code}
>  2) One minor fragment per file
> {code}alter session set `planner.slice_target`=1;
> select * from dfs.`/json_dir`;{code}
> Result:
> {code}
> +---++--+-+--+++-++--+
> |  p_brand  |   p_comment| p_container  | p_mfgr  |   
>p_name  | p_partkey  | p_retailprice  | p_size  |  
>  p_type   | row_key  |
> +---++--+-+

[jira] [Commented] (DRILL-5878) TableNotFound exception is being reported for a wrong storage plugin.

2017-10-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216614#comment-16216614
 ] 

ASF GitHub Bot commented on DRILL-5878:
---

Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/996
  
So does it mean that schema validation will be done twice: first in Drill 
and then in Calcite? Will it affect query parsing performance?


> TableNotFound exception is being reported for a wrong storage plugin.
> -
>
> Key: DRILL-5878
> URL: https://issues.apache.org/jira/browse/DRILL-5878
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Affects Versions: 1.11.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Minor
> Fix For: 1.12.0
>
>
> Drill is reporting TableNotFound exception for a wrong storage plugin. 
> Consider the following query where employee.json is queried using cp plugin.
> {code}
> 0: jdbc:drill:zk=local> select * from cp.`employee.json` limit 10;
> +--++-++--+-+---++-++--++---+-+-++
> | employee_id  | full_name  | first_name  | last_name  | position_id  
> | position_title  | store_id  | department_id  | birth_date  |   
> hire_date|  salary  | supervisor_id  |  education_level  | 
> marital_status  | gender  |  management_role   |
> +--++-++--+-+---++-++--++---+-+-++
> | 1| Sheri Nowmer   | Sheri   | Nowmer | 1
> | President   | 0 | 1  | 1961-08-26  | 
> 1994-12-01 00:00:00.0  | 8.0  | 0  | Graduate Degree   | S
>| F   | Senior Management  |
> | 2| Derrick Whelply| Derrick | Whelply| 2
> | VP Country Manager  | 0 | 1  | 1915-07-03  | 
> 1994-12-01 00:00:00.0  | 4.0  | 1  | Graduate Degree   | M
>| M   | Senior Management  |
> | 4| Michael Spence | Michael | Spence | 2
> | VP Country Manager  | 0 | 1  | 1969-06-20  | 
> 1998-01-01 00:00:00.0  | 4.0  | 1  | Graduate Degree   | S
>| M   | Senior Management  |
> | 5| Maya Gutierrez | Maya| Gutierrez  | 2
> | VP Country Manager  | 0 | 1  | 1951-05-10  | 
> 1998-01-01 00:00:00.0  | 35000.0  | 1  | Bachelors Degree  | M
>| F   | Senior Management  |
> | 6| Roberta Damstra| Roberta | Damstra| 3
> | VP Information Systems  | 0 | 2  | 1942-10-08  | 
> 1994-12-01 00:00:00.0  | 25000.0  | 1  | Bachelors Degree  | M
>| F   | Senior Management  |
> | 7| Rebecca Kanagaki   | Rebecca | Kanagaki   | 4
> | VP Human Resources  | 0 | 3  | 1949-03-27  | 
> 1994-12-01 00:00:00.0  | 15000.0  | 1  | Bachelors Degree  | M
>| F   | Senior Management  |
> | 8| Kim Brunner| Kim | Brunner| 11   
> | Store Manager   | 9 | 11 | 1922-08-10  | 
> 1998-01-01 00:00:00.0  | 1.0  | 5  | Bachelors Degree  | S
>| F   | Store Management   |
> | 9| Brenda Blumberg| Brenda  | Blumberg   | 11   
> | Store Manager   | 21| 11 | 1979-06-23  | 
> 1998-01-01 00:00:00.0  | 17000.0  | 5  | Graduate Degree   | M
>| F   | Store Management   |
> | 10   | Darren Stanz   | Darren  | Stanz  | 5
> | VP Finance  | 0 | 5  | 1949-08-26  | 
> 1994-12-01 00:00:00.0  | 5.0  | 1  | Partial College   | M
>| M   | Senior Management  |
> | 11   | Jonathan Murraiin  | Jonathan| Murraiin   | 11   
> | Store Manager   | 1 | 11 | 1967-06-20  | 
> 1998-01-01 00:00:00.0  | 15000.0  | 5  | Graduate Degree   | S
>| M   | Store Management   |
> +--++-++--+--

[jira] [Assigned] (DRILL-5902) Regression: Queries encounter random failure due to RPC connection timed out

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou reassigned DRILL-5902:
-

Assignee: Pritesh Maker

> Regression: Queries encounter random failure due to RPC connection timed out
> 
>
> Key: DRILL-5902
> URL: https://issues.apache.org/jira/browse/DRILL-5902
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Pritesh Maker
>Priority: Critical
> Attachments: 261230f7-e3b9-0cee-22d8-921cb56e3e12.sys.drill, 
> node196.drillbit.log
>
>
> Multiple random failures (25) occurred with the latest 
> Functional-Baseline-88.193 run.  Here is a sample query:
> {noformat}
> /root/drillAutomation/prasadns14/framework/resources/Functional/window_functions/multiple_partitions/q27.sql
> -- Kitchen sink
> -- Use all supported functions
> select
> rank()  over W,
> dense_rank()over W,
> percent_rank()  over W,
> cume_dist() over W,
> avg(c_integer + c_integer)  over W,
> sum(c_integer/100)  over W,
> count(*)over W,
> min(c_integer)  over W,
> max(c_integer)  over W,
> row_number()over W
> from
> j7
> where
> c_boolean is not null
> window  W as (partition by c_bigint, c_date, c_time, c_boolean order by 
> c_integer)
> {noformat}
> From the logs:
> {noformat}
> 2017-10-23 04:14:36,536 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> 2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler 
> - Dropping request for early fragment termination for path 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> 
> 261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
> {noformat}
> {noformat}
> 2017-10-23 04:14:53,941 [UserServer-1] INFO  
> o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.88.196:31010 <--> 
> /10.10.88.193:38281 (user server) timed out.  Timeout was set to 30 seconds. 
> Closing connection.
> 2017-10-23 04:14:53,952 [UserServer-1] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested RUNNING --> 
> FAILED
> 2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: State change requested FAILED --> 
> FINISHED
> 2017-10-23 04:14:53,956 [UserServer-1] WARN  
> o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc 
> response.
> java.lang.IllegalArgumentException: Self-suppression not permitted
> at java.lang.Throwable.addSuppressed(Throwable.java:1043) 
> ~[na:1.7.0_45]
> at 
> org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.f

[jira] [Updated] (DRILL-5900) Regression: TPCH query encounters random IllegalStateException: Memory was leaked by query

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5900:
--
Description: 
This is a random failure in the TPCH-SF100-baseline run.  The test is 
/root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql.
  This test has passed before.

TPCH query 17:
{noformat}
SELECT
  SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM
  lineitem L,
  part P
WHERE
  P.P_PARTKEY = L.L_PARTKEY
  AND P.P_BRAND = 'BRAND#13'
  AND P.P_CONTAINER = 'JUMBO CAN'
  AND L.L_QUANTITY < (
SELECT
  0.2 * AVG(L2.L_QUANTITY)
FROM
  lineitem L2
WHERE
  L2.L_PARTKEY = P.P_PARTKEY
  )
{noformat}

Error is:
{noformat}
2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:8:2] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
Memory was leaked by query. Memory leaked: (2097152)
Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
(res/actual/peak/limit)


Fragment 8:2

[Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
IllegalStateException: Memory was leaked by query. Memory leaked: (2097152)
Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
(res/actual/peak/limit)


Fragment 8:2

[Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_51]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_51]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
Caused by: java.lang.IllegalStateException: Memory was leaked by query. Memory 
leaked: (2097152)
Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
(res/actual/peak/limit)

at 
org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:519) 
~[drill-memory-base-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.AbstractOperatorExecContext.close(AbstractOperatorExecContext.java:86)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:108)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:435)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:424) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:324)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:155)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
... 5 common frames omitted
2017-10-23 10:34:55,989 [2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:frag:6:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2611d7c0-b0c9-a93e-c64d-a4ef8f4baf8f:6:0: 
State to report: FINISHED
{noformat}

sys.version is:
1.12.0-SNAPSHOT b0c4e0486d6d4620b04a1bb8198e959d433b4840DRILL-5876: Use 
openssl profile to include netty-tcnative dependency with the platform specific 
classifier  20.10.2017 @ 16:52:35 PDT

The previous version that ran clean is this commit:
{noformat}
1.12.0-SNAPSHOT f1d1945b3772bb782039fd6811e34a7de66441c8DRILL-5582: C++ 
Client: [Threat Modeling] Drillbit may be spoofed by an attacker and this may 
lead to data being written to the attacker's target instead of Drillbit   
19.10.2017 @ 17:13:05 PDT
{noformat}

But since the failure is random, the problem could have been introduced earlier.

  was:
This is a random failure.  This test has passed before.

TPCH query 17:
{noformat}
SELECT
  SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
FROM
  lineitem L,
  part P
WHERE
  P.P_PARTKEY = L.L_PARTKEY
  AND P.P_BRAND = 'BRAND#13'
  AND P.P_CONTAINER = 'JUMBO CAN'
  AND L.L_QUANTITY < (
SELECT
  

[jira] [Assigned] (DRILL-5903) Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou reassigned DRILL-5903:
-

Assignee: Pritesh Maker

> Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete."
> ---
>
> Key: DRILL-5903
> URL: https://issues.apache.org/jira/browse/DRILL-5903
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata, Storage - Parquet
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Assignee: Pritesh Maker
>Priority: Critical
> Attachments: 26122f83-6956-5aa8-d8de-d4808f572160.sys.drill, 
> drillbit.log
>
>
> This is a query from the Functional-Baseline-100.171 run.  The test is 
> /root/drillAutomation/mapr/framework/resources/Functional/parquet_storage/parquet_date/mc_parquet_date/generic/mixed1_partitioned5.q.
> Query is:
> {noformat}
> select a.int_col, b.date_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
> inner join ( select date_col, int_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
> where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as 
> date)= date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
> a.int_col, b.date_col
> {noformat}
> From drillbit.log:
> {noformat}
> fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM 
> vwOnParq_wCst_35
> 2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
> o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete. Total runnable size 3, parallelism 3.
> 2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
> but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 
> 3, parallelism 3.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
> 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
> runnable size 3, parallelism 3.
> [Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
> [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:227)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:190)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811)
>  [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
>   at 
> org.apache.calcite.tools.Progr

[jira] [Closed] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou closed DRILL-5901.
-

> Drill test framework can have successful run even if a random failure occurs
> 
>
> Key: DRILL-5901
> URL: https://issues.apache.org/jira/browse/DRILL-5901
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>
> From Jenkins:
> http://10.10.104.91:8080/view/Nightly/job/TPCH-SF100-baseline/574/console
> Random Failures:
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql
> Query: 
> SELECT
>   SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>   lineitem L,
>   part P
> WHERE
>   P.P_PARTKEY = L.L_PARTKEY
>   AND P.P_BRAND = 'BRAND#13'
>   AND P.P_CONTAINER = 'JUMBO CAN'
>   AND L.L_QUANTITY < (
> SELECT
>   0.2 * AVG(L2.L_QUANTITY)
> FROM
>   lineitem L2
> WHERE
>   L2.L_PARTKEY = P.P_PARTKEY
>   )
> Failed with exception
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked 
> by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> org.apache.drill.exec.memory.BaseAllocator.close():519
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86
> org.apache.drill.exec.ops.OperatorContextImpl.close():108
> org.apache.drill.exec.ops.FragmentContext.suppressingClose():435
> org.apache.drill.exec.ops.FragmentContext.close():424
> 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():267
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():744
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
>   at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> org.apache.drill.exec.memory.BaseAllocator.close():519
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86
> org.apache.drill.exec.ops.OperatorContextImpl.close():108
> org.apache.drill.exec.ops.Frag

[jira] [Resolved] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou resolved DRILL-5901.
---
Resolution: Not A Bug

This is a bug in the Drill Test Framework, not in Drill itself.

> Drill test framework can have successful run even if a random failure occurs
> 
>
> Key: DRILL-5901
> URL: https://issues.apache.org/jira/browse/DRILL-5901
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>
> From Jenkins:
> http://10.10.104.91:8080/view/Nightly/job/TPCH-SF100-baseline/574/console
> Random Failures:
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql
> Query: 
> SELECT
>   SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>   lineitem L,
>   part P
> WHERE
>   P.P_PARTKEY = L.L_PARTKEY
>   AND P.P_BRAND = 'BRAND#13'
>   AND P.P_CONTAINER = 'JUMBO CAN'
>   AND L.L_QUANTITY < (
> SELECT
>   0.2 * AVG(L2.L_QUANTITY)
> FROM
>   lineitem L2
> WHERE
>   L2.L_PARTKEY = P.P_PARTKEY
>   )
> Failed with exception
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked 
> by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> org.apache.drill.exec.memory.BaseAllocator.close():519
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86
> org.apache.drill.exec.ops.OperatorContextImpl.close():108
> org.apache.drill.exec.ops.FragmentContext.suppressingClose():435
> org.apache.drill.exec.ops.FragmentContext.close():424
> 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():267
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():744
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
>   at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> org.apache.drill.exec.memory.BaseAllocator.close():519
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86
>

[jira] [Updated] (DRILL-5902) Regression: Queries encounter random failure due to RPC connection timed out

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5902:
--
Description: 
Multiple random failures (25) occurred with the latest 
Functional-Baseline-88.193 run.  Here is a sample query:

{noformat}
/root/drillAutomation/prasadns14/framework/resources/Functional/window_functions/multiple_partitions/q27.sql
-- Kitchen sink
-- Use all supported functions
select
rank()  over W,
dense_rank()over W,
percent_rank()  over W,
cume_dist() over W,
avg(c_integer + c_integer)  over W,
sum(c_integer/100)  over W,
count(*)over W,
min(c_integer)  over W,
max(c_integer)  over W,
row_number()over W
from
j7
where
c_boolean is not null
window  W as (partition by c_bigint, c_date, c_time, c_boolean order by 
c_integer)
{noformat}

From the logs:
{noformat}
2017-10-23 04:14:36,536 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:1 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:5 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:9 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:13 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,537 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:17 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:21 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
2017-10-23 04:14:36,538 [BitServer-7] WARN  o.a.d.e.w.b.ControlMessageHandler - 
Dropping request for early fragment termination for path 
261230e8-d03e-9ca9-91bf-c1039deecde2:1:25 -> 
261230e8-d03e-9ca9-91bf-c1039deecde2:0:0 as path to executor unavailable.
{noformat}

{noformat}
2017-10-23 04:14:53,941 [UserServer-1] INFO  o.a.drill.exec.rpc.user.UserServer 
- RPC connection /10.10.88.196:31010 <--> /10.10.88.193:38281 (user server) 
timed out.  Timeout was set to 30 seconds. Closing connection.
2017-10-23 04:14:53,952 [UserServer-1] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: 
State change requested RUNNING --> FAILED
2017-10-23 04:14:53,952 [261230f8-2698-15b2-952f-d4ade8d6b180:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 261230f8-2698-15b2-952f-d4ade8d6b180:0:0: 
State change requested FAILED --> FINISHED
2017-10-23 04:14:53,956 [UserServer-1] WARN  
o.apache.drill.exec.rpc.RequestIdMap - Failure while attempting to fail rpc 
response.
java.lang.IllegalArgumentException: Self-suppression not permitted
at java.lang.Throwable.addSuppressed(Throwable.java:1043) ~[na:1.7.0_45]
at 
org.apache.drill.common.DeferredException.addException(DeferredException.java:88)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.DeferredException.addThrowable(DeferredException.java:97)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.fail(FragmentExecutor.java:413)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.access$700(FragmentExecutor.java:55)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor$ExecutorStateImpl.fail(FragmentExecutor.java:427)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext.fail(FragmentContext.java:213) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext$1.accept(FragmentContext.java:95) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext$1.accept(FragmentContext.ja

[jira] [Comment Edited] (DRILL-5903) Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Robert Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216437#comment-16216437
 ] 

Robert Hou edited comment on DRILL-5903 at 10/24/17 7:24 AM:
-

I went back a month, to September 19.  This problem occurred on these dates, 
although the failing test can be a different mixed-partition test each time.

September 26
October 6
October 10
October 14
October 18
October 22
October 23


was (Author: rhou):
I went back a month to September 19.  This problem occurred on these dates:

September 26
October 6
October 10
October 14
October 18
October 22
October 23

> Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete."
> ---
>
> Key: DRILL-5903
> URL: https://issues.apache.org/jira/browse/DRILL-5903
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata, Storage - Parquet
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Priority: Critical
> Attachments: 26122f83-6956-5aa8-d8de-d4808f572160.sys.drill, 
> drillbit.log
>
>
> This is a query from the Functional-Baseline-100.171 run.  The test is 
> /root/drillAutomation/mapr/framework/resources/Functional/parquet_storage/parquet_date/mc_parquet_date/generic/mixed1_partitioned5.q.
> Query is:
> {noformat}
> select a.int_col, b.date_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
> inner join ( select date_col, int_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
> where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as 
> date)= date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
> a.int_col, b.date_col
> {noformat}
> From drillbit.log:
> {noformat}
> fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM 
> vwOnParq_wCst_35
> 2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
> o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete. Total runnable size 3, parallelism 3.
> 2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
> but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 
> 3, parallelism 3.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
> 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
> runnable size 3, parallelism 3.
> [Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
> [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:227)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:190)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushPro

[jira] [Updated] (DRILL-5903) Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5903:
--
Description: 
This is a query from the Functional-Baseline-100.171 run.  The test is 

Query is:
{noformat}
select a.int_col, b.date_col from 
dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
inner join ( select date_col, int_col from 
dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as date)= 
date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
a.int_col, b.date_col
{noformat}

From drillbit.log:
{noformat}
fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM vwOnParq_wCst_35
2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
parquet metadata' are not complete. Total runnable size 3, parallelism 3.
2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, 
parallelism 3.
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
runnable size 3, parallelism 3.


[Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:227)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:190)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
 [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811)
 [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:310) 
[calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:342)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:241)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:291)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]

[jira] [Updated] (DRILL-5903) Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-5903:
--
Description: 
This is a query from the Functional-Baseline-100.171 run.  The test is 
/root/drillAutomation/mapr/framework/resources/Functional/parquet_storage/parquet_date/mc_parquet_date/generic/mixed1_partitioned5.q.

Query is:
{noformat}
select a.int_col, b.date_col from 
dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
inner join ( select date_col, int_col from 
dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as date)= 
date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
a.int_col, b.date_col
{noformat}

From drillbit.log:
{noformat}
fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM vwOnParq_wCst_35
2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
parquet metadata' are not complete. Total runnable size 3, parallelism 3.
2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, 
parallelism 3.
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
runnable size 3, parallelism 3.


[Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:227)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.&lt;init&gt;(ParquetGroupScan.java:190)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
 [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811)
 [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:310) 
[calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:342)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:241)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:291)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.

[jira] [Commented] (DRILL-5903) Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Robert Hou (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216437#comment-16216437
 ] 

Robert Hou commented on DRILL-5903:
---

I went back a month, to September 19.  This problem occurred on these dates:

September 26
October 6
October 10
October 14
October 18
October 22
October 23

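For context on the failure mode: the "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete" error is raised when the per-file metadata reads do not all finish within the timeout. A minimal Java sketch of that timed-wait pattern (hypothetical class and task names, not Drill's actual TimedRunnable implementation) might look like:

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class TimedFetchSketch {
    public static void main(String[] args) throws Exception {
        // parallelism 3, matching "Total runnable size 3, parallelism 3" in the log
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Callable<String>> tasks = List.of(
                () -> "meta-1",   // stand-ins for per-file parquet metadata reads
                () -> "meta-2",
                () -> "meta-3");

        // invokeAll blocks up to the timeout; tasks still running when it
        // expires come back cancelled, analogous to Drill giving up on
        // 'Fetch parquet metadata' after 15000ms
        List<Future<String>> results = pool.invokeAll(tasks, 15000, TimeUnit.MILLISECONDS);
        for (Future<String> f : results) {
            if (f.isCancelled()) {
                throw new IllegalStateException(
                        "Waited for 15000ms, but tasks are not complete");
            }
            System.out.println(f.get());
        }
        pool.shutdown();
    }
}
```

Here the stand-in tasks return instantly, so all three complete; in the reported regression the metadata reads themselves stall, so the timeout fires even though the thread pool is fully parallel.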
> Regression: Query encounters "Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete."
> ---
>
> Key: DRILL-5903
> URL: https://issues.apache.org/jira/browse/DRILL-5903
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata, Storage - Parquet
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>Priority: Critical
> Attachments: 26122f83-6956-5aa8-d8de-d4808f572160.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> select a.int_col, b.date_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
> inner join ( select date_col, int_col from 
> dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
> where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as 
> date)= date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
> a.int_col, b.date_col
> {noformat}
> From drillbit.log:
> {noformat}
> fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM 
> vwOnParq_wCst_35
> 2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
> o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
> parquet metadata' are not complete. Total runnable size 3, parallelism 3.
> 2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
> o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
> but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 
> 3, parallelism 3.
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
> 15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
> runnable size 3, parallelism 3.
> [Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
>  ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
> [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:227)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:190)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62)
>  [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
>  [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811)
>  [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:310) 
> [calcite-core-1.4.0-dril