[jira] [Commented] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2018-07-03 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532224#comment-16532224
 ] 

Khurram Faraaz commented on DRILL-6517:
---

[~sachouche]

The easiest way to reproduce this exception is to run TPC-DS query 72 on a 4 node 
cluster (physical machines) and, after a few minutes (> 5 mins), cancel the query 
from the Web UI. 

 

Click on the query in the Profiles tab while it is under execution, then click 
on the Edit Query tab, and then choose the Cancel option to cancel the query. You 
will notice that the query is canceled, and in drillbit.log you will see the 
exception, "IllegalStateException: Record count not set for this vector 
container".

 

 

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Critical
> Fix For: 1.14.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> A TPC-DS query is canceled after 2 hrs and 47 mins, and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log.
> Steps to reproduce the problem are below; the query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u 
> "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=" -f 
> /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this 
> vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:690)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:662)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:73)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:79)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch

[jira] [Commented] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2018-07-03 Thread salim achouche (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532150#comment-16532150
 ] 

salim achouche commented on DRILL-6517:
---

* I ran the query around 10 times, and it succeeded each time (running in 29 
minutes)
 * Bounced the Drillbit cluster, and immediately one of the nodes became 
unresponsive
 * I launched a script to gather jstacks each minute; somehow the jstack failed, 
and I got the kernel messages below
 * VMware blogs indicated the VM is running out of resources
 * The interesting part is that the IllegalStateException showed up again when 
the cancellation happened: Caused by: java.lang.IllegalStateException: 
Record count not set for this vector container

Message from syslogd@mfs133 at Jul  3 18:48:27 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [java:12219]

Message from syslogd@mfs133 at Jul  3 18:48:27 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#3 stuck for 25s! [java:16991]

Message from syslogd@mfs133 at Jul  3 18:48:27 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#4 stuck for 25s! [java:17633]

Message from syslogd@mfs133 at Jul  3 18:48:27 ...
 kernel:NMI watchdog: BUG: soft lockup - CPU#5 stuck for 25s! [java:27059]

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Critical
> Fix For: 1.14.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> A TPC-DS query is canceled after 2 hrs and 47 mins, and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log.
> Steps to reproduce the problem are below; the query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u 
> "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=" -f 
> /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this 
> vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRe

[jira] [Commented] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532122#comment-16532122
 ] 

ASF GitHub Bot commented on DRILL-6543:
---

Ben-Zvi commented on a change in pull request #1351: DRILL-6543: Disable Hash 
Join fallback, add percent_reserved_allowance_from_direct
URL: https://github.com/apache/drill/pull/1351#discussion_r199986966
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/util/MemoryAllocationUtilities.java
 ##
 @@ -138,16 +139,36 @@ public static long computeOperatorMemory(OptionSet 
optionManager, long maxAllocP
   @VisibleForTesting
   public static long computeQueryMemory(DrillConfig config, OptionSet 
optionManager, long directMemory) {
 
+// Get the options
+double percent_per_query = 
optionManager.getOption(ExecConstants.PERCENT_MEMORY_PER_QUERY);
 
 Review comment:
   Fixed 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent 
> many of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +cannot+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as it 
> was written when the Hash-Join was non-buffered, i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the higher of the values 
> calculated from the MQMPN and the PPQ. The new reserve option would verify 
> that this figure respects the allowance.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532108#comment-16532108
 ] 

ASF GitHub Bot commented on DRILL-6543:
---

Ben-Zvi commented on a change in pull request #1351: DRILL-6543: Disable Hash 
Join fallback, add percent_reserved_allowance_from_direct
URL: https://github.com/apache/drill/pull/1351#discussion_r199985944
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/util/MemoryAllocationUtilities.java
 ##
 @@ -138,16 +139,36 @@ public static long computeOperatorMemory(OptionSet 
optionManager, long maxAllocP
   @VisibleForTesting
   public static long computeQueryMemory(DrillConfig config, OptionSet 
optionManager, long directMemory) {
 
+// Get the options
+double percent_per_query = 
optionManager.getOption(ExecConstants.PERCENT_MEMORY_PER_QUERY);
+long max_query_per_node = 
optionManager.getOption(ExecConstants.MAX_QUERY_MEMORY_PER_NODE);
+double percent_allowance = 
optionManager.getOption(ExecConstants.PERCENT_RESERVED_ALLOWANCE_FROM_DIRECT);
+
+// verify that the allowance is kept
+if ( percent_per_query + percent_allowance > 1.0 ) {
 
 Review comment:
   The code in the PR "enforces" keeping the allowance (out of the direct 
memory); any user setting that violates this limit returns an error 
(unfortunately, this can only be done when a query using buffered ops is 
executed).
   This mainly serves as a reminder (for the common user who does not read 
the documentation). It does not help with concurrent query execution (though 
users running concurrent queries are usually more sophisticated, so they may 
know about the allowance). 
   The example suggested above applies the *allowance* to the final 
computed memory, which is not the intention. For example, with Direct Memory of 
20GB and Mem Per Query of 4GB, the code above would subtract 25% and 
the Mem Per Query would be only 3GB (and confuse the user).
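   As a rough sketch of the intended behavior (illustrative only -- the 
variable names follow the snippet above, but the method shape and error type 
are assumptions, not the actual PR code):
   ```java
final class MemoryBudgetSketch {
  // Sketch of the allowance check: validate the settings, compute the
  // buffered-operator budget, and error out (rather than silently shrink
  // the budget) when the reserved allowance would be violated.
  static long computeQueryMemory(double percentPerQuery, long maxQueryPerNode,
                                 double percentAllowance, long directMemory) {
    // Reject settings that leave no room for the reserved allowance.
    if (percentPerQuery + percentAllowance > 1.0) {
      throw new IllegalArgumentException(
          "percent_per_query plus the reserved allowance exceeds 100% of direct memory");
    }
    // Budget for buffered operators: the higher of the two settings
    // (per the issue description).
    long queryMemory = Math.max(maxQueryPerNode,
                                (long) (directMemory * percentPerQuery));
    // Enforce the allowance against that figure instead of subtracting it
    // from the final computed memory (the point made above).
    long allowed = (long) (directMemory * (1.0 - percentAllowance));
    if (queryMemory > allowed) {
      throw new IllegalArgumentException("Query memory budget " + queryMemory
          + " violates the reserved allowance; the maximum is " + allowed);
    }
    return queryMemory;
  }
}
   ```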
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent 
> many of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +cannot+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as it 
> was written when the Hash-Join was non-buffered, i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the higher of the values 
> calculated from the MQMPN and the PPQ. The new reserve option would verify 
> that this figure respects the allowance.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6546:
-
Labels: ready-to-commit  (was: )

> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532077#comment-16532077
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

amansinha100 commented on issue #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402324758
 
 
   Yes, each unnest will operate on a single column, so the plan will have 2 
LateralJoins, and as long as there is a Project inserted on the left side of 
each one, I suppose the code will work fine.   
   
   Overall, I am +1. 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532074#comment-16532074
 ] 

ASF GitHub Bot commented on DRILL-6579:
---

vrozov commented on a change in pull request #1361: DRILL-6579: Added sanity 
checks to the Parquet reader to avoid infini…
URL: https://github.com/apache/drill/pull/1361#discussion_r199978503
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenNullableFixedEntryReader.java
 ##
 @@ -38,14 +39,16 @@
   /** {@inheritDoc} */
   @Override
   final VarLenColumnBulkEntry getEntry(int valuesToRead) {
-assert columnPrecInfo.precision >= 0 : "Fixed length precision cannot be 
lower than zero";
+Preconditions.checkArgument(columnPrecInfo.precision >= 0, "Fixed length 
precision cannot be lower than zero");
 
 Review comment:
   Should the check be here or in the constructor?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532075#comment-16532075
 ] 

ASF GitHub Bot commented on DRILL-6579:
---

vrozov commented on a change in pull request #1361: DRILL-6579: Added sanity 
checks to the Parquet reader to avoid infini…
URL: https://github.com/apache/drill/pull/1361#discussion_r199978278
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenFixedEntryReader.java
 ##
 @@ -37,14 +38,15 @@
   /** {@inheritDoc} */
   @Override
   final VarLenColumnBulkEntry getEntry(int valuesToRead) {
-assert columnPrecInfo.precision >= 0 : "Fixed length precision cannot be 
lower than zero";
+Preconditions.checkArgument(columnPrecInfo.precision >= 0, "Fixed length 
precision cannot be lower than zero");
 
 Review comment:
   checkState?
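   For context, a minimal sketch of the distinction being pointed at 
(illustrative only -- the `ColumnPrecision` class is hypothetical, not Drill 
code): Guava's checkArgument throws IllegalArgumentException and suits 
validating incoming parameters, typically in a constructor, while checkState 
throws IllegalStateException and suits asserting invariants over the object's 
current state:
   ```java
import com.google.common.base.Preconditions;

final class ColumnPrecision {
  private final int precision;

  ColumnPrecision(int precision) {
    // Validating an incoming parameter: checkArgument is idiomatic here
    // and throws IllegalArgumentException.
    Preconditions.checkArgument(precision >= 0,
        "Fixed length precision cannot be lower than zero");
    this.precision = precision;
  }

  int width() {
    // Asserting an invariant on already-held state: checkState is idiomatic
    // here and throws IllegalStateException.
    Preconditions.checkState(precision >= 0,
        "Fixed length precision cannot be lower than zero");
    return precision;
  }
}
   ```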


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2018-07-03 Thread salim achouche (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16532019#comment-16532019
 ] 

salim achouche commented on DRILL-6517:
---

After debugging this issue, I noticed that the thrown exception was masking the 
real problem:
 * Launched the query the first time on a 4 node cluster (made up of VMs)
 * Query memory per node was 10GB; spilling was not enabled (at least not 
explicitly)
 * The query ran in 35 min and succeeded
 * Re-launched the same query, but this time node-3 was unresponsive 
 * After one hour the query failed; the client error was that node-3 was lost 
 * Within the Drillbit logs, the set-count error was thrown, though only after 
the foreman cancelled the query

I'll now focus on understanding why the system gets into this state when 
running for the second time; the fact that I am using VMs is not helping, as 
network issues are very common.

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Critical
> Fix For: 1.14.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> A TPC-DS query is canceled after 2 hrs and 47 mins, and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log.
> Steps to reproduce the problem are below; the query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u 
> "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=" -f 
> /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this 
> vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:690)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.<init>(RecordBatchSizer.java:662)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:73)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org

[jira] [Commented] (DRILL-6553) Fix TopN for unnest operator

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531992#comment-16531992
 ] 

ASF GitHub Bot commented on DRILL-6553:
---

HanumathRao commented on issue #1353: DRILL-6553: Fix TopN for unnest operator
URL: https://github.com/apache/drill/pull/1353#issuecomment-402303904
 
 
   @vvysotskyi Thanks for making the changes. Changes look good to me. +1.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix TopN for unnest operator
> 
>
> Key: DRILL-6553
> URL: https://issues.apache.org/jira/browse/DRILL-6553
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> The plan for the query with unnest is chosen non-optimally:
> {code:sql}
> select customer.c_custkey, customer.c_name, t.o.o_orderkey,t.o.o_totalprice
> from dfs.`lateraljoin/multipleFiles` customer,
> unnest(customer.c_orders) t(o)
> order by customer.c_custkey, t.o.o_orderkey, t.o.o_totalprice
> limit 50
> {code}
> Plan:
> {noformat}
> 00-00Screen
> 00-01  ProjectAllowDup(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  Sort(sort0=[$0], sort1=[$2], sort2=[$3], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 00-06Project(c_custkey=[$2], c_name=[$3], EXPR$2=[ITEM($4, 
> 'o_orderkey')], EXPR$3=[ITEM($4, 'o_totalprice')])
> 00-07  LateralJoin(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{1}])
> 00-09Project(T0¦¦**=[$0], c_orders=[$1], c_custkey=[$2], 
> c_name=[$3])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`**`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> 00-08Project(c_orders0=[$0])
> 00-10  Unnest [srcOp=00-07] 
> {noformat}
> A similar query, but with flatten:
> {code:sql}
> select f.c_custkey, f.c_name, f.o.o_orderkey, f.o.o_totalprice from (select 
> c_custkey, c_name, flatten(c_orders) as o from 
> dfs.`lateraljoin/multipleFiles` customer) f order by f.c_custkey, 
> f.o.o_orderkey, f.o.o_totalprice limit 50
> {code}
> has plan:
> {noformat}
> 00-00Screen
> 00-01  Project(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  TopN(limit=[50])
> 00-06Project(c_custkey=[$0], c_name=[$1], EXPR$2=[ITEM($2, 
> 'o_orderkey')], EXPR$3=[ITEM($2, 'o_totalprice')])
> 00-07  Flatten(flattenField=[$2])
> 00-08Project(c_custkey=[$0], c_name=[$1], o=[$2])
> 00-09  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`c_custkey`, `c_name`, `c_orders`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> {noformat}
> The main difference is that in the unnest case, a project wasn't pushed 
> to the scan, and the Limit with Sort wasn't converted to a TopN. 
> The first problem is tracked by DRILL-6545; this Jira aims to fix the 
> problem with TopN.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531987#comment-16531987
 ] 

ASF GitHub Bot commented on DRILL-6579:
---

sachouche commented on issue #1361: DRILL-6579: Added sanity checks to the 
Parquet reader to avoid infini…
URL: https://github.com/apache/drill/pull/1361#issuecomment-402303054
 
 
   @vrozov, updated PR with your feedback.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6453) TPC-DS query 72 has regressed

2018-07-03 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531980#comment-16531980
 ] 

Khurram Faraaz commented on DRILL-6453:
---

[~ben-zvi] [~priteshm]

On Apache Drill 1.14.0 on a 4 node cluster, TPC-DS query 72 fails (ends in a 
Canceled state) after running for 2 hrs and 11 mins, and we see the same 
exception as before towards the end of the drillbit.log file.


git.commit.id.abbrev=f481a7c

{noformat}
message: "SYSTEM ERROR: IllegalStateException: Record count not set for this 
vector container\n\nFragment 4:87\n\n[Error Id: 
ed305d45-742f-48df-b1ad-6813bb5fdfc4 on qa102-48.qa.lab:31010]"
 exception {
 exception_class: "java.lang.IllegalStateException"
 message: "Record count not set for this vector container"
 stack_trace {
 class_name: "com.google.common.base.Preconditions"
 file_name: "Preconditions.java"
 line_number: 173
 method_name: "checkState"
 is_native_method: false
 }
 stack_trace {
 class_name: "org.apache.drill.exec.record.VectorContainer"
 file_name: "VectorContainer.java"
 line_number: 394
 method_name: "getRecordCount"
 is_native_method: false
 }
 stack_trace {
 class_name: "org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch"
 file_name: "RemovingRecordBatch.java"
 line_number: 49
 method_name: "getRecordCount"
 is_native_method: false
 }
 stack_trace {
 class_name: "org.apache.drill.exec.record.RecordBatchSizer"
 file_name: "RecordBatchSizer.java"
 line_number: 714
 method_name: ""
 is_native_method: false
 }
 stack_trace {
 class_name: "org.apache.drill.exec.record.RecordBatchSizer"
 file_name: "RecordBatchSizer.java"
 line_number: 686
 method_name: ""
 is_native_method: false
 }
 stack_trace {
 class_name{ "org.apache.drill.exec.record.JoinBatchMemoryManager"
 file_name: "JoinBatchMemoryManager.java"
 line_number: 74
 method_name: "update"

{noformat}

> TPC-DS query 72 has regressed
> -
>
> Key: DRILL-6453
> URL: https://issues.apache.org/jira/browse/DRILL-6453
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill
>
>
> TPC-DS query 72 seems to have regressed; the query profile for the case where 
> it was canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On, Drill 1.14.0-SNAPSHOT 
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took 
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the below run has date values stored as DATE datatype and not 
> VARCHAR type
> On, Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 executed for 2 hrs and 11 mins and did not complete; I had to 
> cancel it by stopping the Foreman drillbit.
> As a result, several minor fragments are reported to be in 
> CANCELLATION_REQUESTED state in the UI.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531941#comment-16531941
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402295742
 
 
   @vrozov and @Ben-Zvi 
   Updated the PR according to the latest feedback.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5365:
--
Priority: Major  (was: Minor)

> FileNotFoundException when reading a parquet file
> -
>
> Key: DRILL-5365
> URL: https://issues.apache.org/jira/browse/DRILL-5365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.14.0
>
>
> The parquet file is generated through the following CTAS.
> To reproduce the issue: 1) use a cluster with two or more nodes; 2) enable 
> impersonation; 3) set "fs.default.name": "file:///" in the hive storage 
> plugin; 4) restart drillbits; 5) as a regular user, on node A, drop the 
> table/file; 6) CTAS from a large enough hive table as source to recreate the 
> table/file; 7) querying the table from node A should work; 8) querying from 
> node B as the same user should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531929#comment-16531929
 ] 

ASF GitHub Bot commented on DRILL-5365:
---

ilooner commented on issue #1296: DRILL-5365: Prevent plugin config from 
changing default fs. Make DrillFileSystem Immutable.
URL: https://github.com/apache/drill/pull/1296#issuecomment-402292968
 
 
   I also forgot to mention a third possible bug: FileSystemConfigurations 
defined in the HiveStoragePlugin are passed to DrillFileSystem in 
HiveDrillNativeParquetRowGroupScan.getFsConf(). I'm not sure whether that is by 
design, so I think we need to think about that one a bit more.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> FileNotFoundException when reading a parquet file
> -
>
> Key: DRILL-5365
> URL: https://issues.apache.org/jira/browse/DRILL-5365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> The parquet file is generated through the following CTAS.
> To reproduce the issue: 1) use a cluster with two or more nodes; 2) enable 
> impersonation; 3) set "fs.default.name": "file:///" in the hive storage 
> plugin; 4) restart drillbits; 5) as a regular user, on node A, drop the 
> table/file; 6) CTAS from a large enough hive table as source to recreate the 
> table/file; 7) querying the table from node A should work; 8) querying from 
> node B as the same user should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531926#comment-16531926
 ] 

ASF GitHub Bot commented on DRILL-5365:
---

ilooner commented on issue #1296: DRILL-5365: Prevent plugin config from 
changing default fs. Make DrillFileSystem Immutable.
URL: https://github.com/apache/drill/pull/1296#issuecomment-402292010
 
 
   ## Problem
   
   @paul-rogers @vdiravka I have an update. Chun was unable to reproduce the 
issue; however, after staring at the code for several days, I think I see its 
source. I believe the issue mostly stems from two bugs in the 
FileSystem.get method provided to us by Hadoop.
   
 1. **Bug # 1** is that FileSystem.get does not actually take into account 
the Configuration object you give it. This means that if you ask for two 
different FileSystem objects with different configurations for dfs, 
FileSystem.get assumes both are the same, and the first dfs FileSystem and 
config requested wins and is the one that is always returned. That's pretty 
crazy to me, but you can check the equals method of **FileSystem.Cache.Key** 
yourself.
 2. **Bug # 2**: The **fs.default.name** property is not honored by 
FileSystem.get; only **fs.defaultFS** is honored. You can see this by tracing 
through the code:
   1. FileSystem.get(Configuration conf) calls 
FileSystem.getDefaultUri(Configuration conf) 
   2. FileSystem.getDefaultUri(Configuration conf) looks up 
FileSystem.FS_DEFAULT_NAME_KEY in the Configuration
   3. FileSystem.FS_DEFAULT_NAME_KEY is **fs.defaultFS**.
   4. If **fs.defaultFS** is not defined, the URI for caching purposes 
is assumed to be **file://**
   
   These bugs would basically completely break things if the hive plugin and 
the filesystem plugin were storing files in different places. I suspect we 
aren't seeing any issues because most people only have one dfs cluster storing 
their data for both hive and other applications.
   
   ## Solution
   
* Bypass the caching in FileSystem.get and do our own caching in 
DrillFileSystem, which takes into account Configuration objects.
* Set **fs.defaultFS**  as well as **fs.default.name** in our configuration 
objects.
* Make DrillFileSystem immutable and make things cleaner and more explicit 
in general.
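   
   As a rough illustration of bug # 1 (not Drill code; a standalone sketch 
against the Hadoop FileSystem API), two lookups with different Configurations 
for the same URI come back as the same cached instance, and 
FileSystem.newInstance is one way to bypass the cache:
   ```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCacheDemo {
  public static void main(String[] args) throws Exception {
    Configuration confA = new Configuration();
    confA.setInt("io.file.buffer.size", 4096);

    Configuration confB = new Configuration();
    confB.setInt("io.file.buffer.size", 65536);  // a different config...

    // ...but the cache key is only (scheme, authority, user), so the second
    // call returns the instance already created for confA.
    FileSystem a = FileSystem.get(confA);
    FileSystem b = FileSystem.get(confB);
    System.out.println(a == b);  // true: confB was silently ignored

    // newInstance bypasses the shared cache; caller-side caching keyed on
    // the Configuration (the approach described above) builds on this idea.
    FileSystem c = FileSystem.newInstance(confB);
    System.out.println(a == c);  // false
    c.close();
  }
}
   ```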


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> FileNotFoundException when reading a parquet file
> -
>
> Key: DRILL-5365
> URL: https://issues.apache.org/jira/browse/DRILL-5365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> The parquet file is generated through the following CTAS.
> To reproduce the issue: 1) use a cluster with two or more nodes; 2) enable 
> impersonation; 3) set "fs.default.name": "file:///" in the hive storage 
> plugin; 4) restart drillbits; 5) as a regular user, on node A, drop the 
> table/file; 6) CTAS from a large enough hive table as source to recreate the 
> table/file; 7) querying the table from node A should work; 8) querying from 
> node B as the same user should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531917#comment-16531917
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi removed a comment on issue #1346: DRILL-6546: Allow unnest function 
with nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402290003
 
 
   AFAIK, currently allowed only unnest with single column. When several 
unnests are used, they will have their own Correlate rel nodes. Pp


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531916#comment-16531916
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi edited a comment on issue #1346: DRILL-6546: Allow unnest function 
with nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402289998
 
 
   AFAIK, currently allowed only unnest with single column. When several 
unnests are used, they will have their own Correlate rel nodes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6576) Unnest reports incoming record counts incorrectly

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531911#comment-16531911
 ] 

ASF GitHub Bot commented on DRILL-6576:
---

parthchandra closed pull request #1362: DRILL-6576: Unnest reports incoming 
record counts incorrectly
URL: https://github.com/apache/drill/pull/1362
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestImpl.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestImpl.java
index 06713a5164..ffc64f9237 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestImpl.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestImpl.java
@@ -103,7 +103,7 @@ public final int unnestRecords(final int recordCount) {
 innerValueIndex += count;
 return count;
 
-}
+  }
 
   @Override
   public final void setup(FragmentContext context, RecordBatch incoming, 
RecordBatch outgoing,
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
index e985c4defe..bc01a70477 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
@@ -199,6 +199,7 @@ public IterOutcome innerNext() {
 schemaChanged(); // checks if schema has changed (redundant in this 
case because it has) AND saves the
  // current field metadata for check in subsequent 
iterations
 setupNewSchema();
+stats.batchReceived(0, incoming.getRecordCount(), true);
   } catch (SchemaChangeException ex) {
 kill(false);
 logger.error("Failure during query", ex);
@@ -207,32 +208,30 @@ public IterOutcome innerNext() {
   } finally {
 stats.stopSetup();
   }
-  // since we never called next on an upstream operator, incoming stats are
-  // not updated. update input stats explicitly.
-  stats.batchReceived(0, incoming.getRecordCount(), true);
   return IterOutcome.OK_NEW_SCHEMA;
 } else {
   assert state != BatchState.FIRST : "First batch should be OK_NEW_SCHEMA";
   container.zeroVectors();
-
   // Check if schema has changed
-  if (lateral.getRecordIndex() == 0 && schemaChanged()) {
-hasRemainder = true; // next call to next will handle the actual 
data.
-try {
-  setupNewSchema();
-} catch (SchemaChangeException ex) {
-  kill(false);
-  logger.error("Failure during query", ex);
-  context.getExecutorState().fail(ex);
-  return IterOutcome.STOP;
-}
-stats.batchReceived(0, incoming.getRecordCount(), true);
-return OK_NEW_SCHEMA;
-  }
   if (lateral.getRecordIndex() == 0) {
-unnest.resetGroupIndex();
+boolean isNewSchema = schemaChanged();
+if (isNewSchema) {
+  hasRemainder = true; // next call to next will handle the actual 
data.
+  stats.batchReceived(0, incoming.getRecordCount(), isNewSchema);
+  try {
+setupNewSchema();
+  } catch (SchemaChangeException ex) {
+kill(false);
+logger.error("Failure during query", ex);
+context.getExecutorState().fail(ex);
+return IterOutcome.STOP;
+  }
+  return OK_NEW_SCHEMA;
+} else {
+  unnest.resetGroupIndex();
+  stats.batchReceived(0, incoming.getRecordCount(), isNewSchema);
+}
   }
-  stats.batchReceived(0, incoming.getRecordCount(), false);
   return doWork();
 }
 
@@ -243,7 +242,8 @@ public VectorContainer getOutgoingContainer() {
 return this.container;
   }
 
-  @SuppressWarnings("resource") private void setUnnestVector() {
+  @SuppressWarnings("resource")
+  private void setUnnestVector() {
 final TypedFieldId typedFieldId = 
incoming.getValueVectorId(popConfig.getColumn());
 final MaterializedField field = 
incoming.getSchema().getColumn(typedFieldId.getFieldIds()[0]);
 final RepeatedValueVector vector;
@@ -347,7 +347,8 @@ protected IterOutcome doWork() {
 return tp;
   }
 
-  @Override protected boolean setupNewSchema() throws SchemaChangeException {
+  @Override
+  protected boolean setupNewSchema() throws SchemaChangeException {
 Preconditions.checkNotNull(lateral);
 container.clear();
 recordCount = 0;


 

--

[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531915#comment-16531915
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi opened a new pull request #1346: DRILL-6546: Allow unnest function 
with nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346
 
 
   - Added a new rule `ProjectComplexRexNodeCorrelateTransposeRule`, which takes 
a complex expression from the `Project` below the `Uncollect` rel node and 
creates a new project with the expressions from the left side of `Correlate` 
plus this complex expression. 
   For example, part of the plan before applying the rule:
   ```
   LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
requiredColumns=[{1}]): rowcount = 1.0, cumulative cost = {inf}, id = 100
 EnumerableTableScan(subset=[rel#94:Subset#0.ENUMERABLE.ANY([]).[]], 
table=[[cp, lateraljoin/nested-customer.parquet]]): rowcount = 100.0, 
cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id 
= 7
 Uncollect(subset=[rel#99:Subset#3.NONE.ANY([]).[]]): rowcount = 1.0, 
cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 98
   LogicalProject(subset=[rel#97:Subset#2.NONE.ANY([]).[]], 
EXPR$0=[ITEM($cor0.orders, 'items')]): rowcount = 1.0, cumulative cost = {1.0 
rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 96
 LogicalValues(subset=[rel#95:Subset#1.NONE.ANY([]).[0]], tuples=[[{ 0 
}]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 
network, 0.0 memory}, id = 8
   ```
   Plan after applying the rule:
   ```
   LogicalProject(**=[$0], orders=[$1], $complexRexNode0=[$3]): rowcount = 1.0, 
cumulative cost = {inf}, id = 116
 LogicalCorrelate(correlation=[$cor1], joinType=[inner], 
requiredColumns=[{2}]): rowcount = 1.0, cumulative cost = {inf}, id = 115
   LogicalProject(**=[$0], orders=[$1], $complexRexNode=[ITEM($1, 
'items')]): rowcount = 100.0, cumulative cost = {100.0 rows, 300.0 cpu, 0.0 io, 
0.0 network, 0.0 memory}, id = 112
 EnumerableTableScan(subset=[rel#94:Subset#0.ENUMERABLE.ANY([]).[]], 
table=[[cp, lateraljoin/nested-customer.parquet]]): rowcount = 100.0, 
cumulative cost = {100.0 rows, 101.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id 
= 7
   Uncollect: rowcount = 1.0, cumulative cost = {2.0 rows, 2.0 cpu, 0.0 io, 
0.0 network, 0.0 memory}, id = 114
 LogicalProject($complexRexNode=[$cor1.$complexRexNode]): rowcount = 
1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id 
= 113
   LogicalValues(subset=[rel#95:Subset#1.NONE.ANY([]).[0]], tuples=[[{ 
0 }]]): rowcount = 1.0, cumulative cost = {1.0 rows, 1.0 cpu, 0.0 io, 0.0 
network, 0.0 memory}, id = 8
   ```
   - Made a change to convert `DrillCompoundIdentifier` inside unnest into an 
ITEM call, to avoid a 'column not found' error when a nested column is used.
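 
   As a rough SQL analogy (illustrative only; the rule operates on the Calcite 
plan rather than on SQL text, and the `$complexRexNode` name is generated by 
the rule), the failing query from the description is effectively rewritten 
into:
   ```sql
   select u.item
   from (select t.orders.items as `$complexRexNode`
         from cp.`lateraljoin/nested-customer.parquet` t) c,
        unnest(c.`$complexRexNode`) as u(item)
   ```
   so that `Uncollect` only ever sees a plain field reference to the projected 
column.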


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531912#comment-16531912
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi commented on issue #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402289998
 
 
   AFAIK, currently only unnest with a single column is allowed. When several 
unnests are used, each will have its own Correlate rel node; see the sketch 
below.
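 
   For example (the `addresses` array below is hypothetical; in the test file 
only `orders` exists):
   ```sql
   -- Each unnest call gets its own Correlate in the plan:
   select u1.item, u2.addr
   from cp.`lateraljoin/nested-customer.parquet` c,
        unnest(c.orders) as u1(item),
        unnest(c.addresses) as u2(addr)
   ```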


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531914#comment-16531914
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi closed pull request #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346
 
 
   

This is a PR merged from a forked repository. As GitHub hides the original
diff on merge, it is displayed below for the sake of provenance:

diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java
index 519d5036e7..e5a3746a42 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/PlannerPhase.java
@@ -37,6 +37,7 @@
 import org.apache.drill.exec.planner.logical.DrillJoinRule;
 import org.apache.drill.exec.planner.logical.DrillLimitRule;
 import org.apache.drill.exec.planner.logical.DrillMergeProjectRule;
+import 
org.apache.drill.exec.planner.logical.ProjectComplexRexNodeCorrelateTransposeRule;
 import 
org.apache.drill.exec.planner.logical.DrillProjectLateralJoinTransposeRule;
 import 
org.apache.drill.exec.planner.logical.DrillProjectPushIntoLateralJoinRule;
 import org.apache.drill.exec.planner.logical.DrillProjectRule;
@@ -311,6 +312,8 @@ static RuleSet 
getDrillUserConfigurableLogicalRules(OptimizerRulesContext optimi
   RuleInstance.PROJECT_WINDOW_TRANSPOSE_RULE,
   DrillPushProjectIntoScanRule.INSTANCE,
 
+  ProjectComplexRexNodeCorrelateTransposeRule.INSTANCE,
+
   /*
Convert from Calcite Logical to Drill Logical Rules.
*/
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
index 28e5246b0e..2f895e2cd6 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
@@ -73,7 +73,7 @@ protected RelDataType deriveRowType() {
 return 
constructRowType(SqlValidatorUtil.deriveJoinRowType(left.getRowType(),
   right.getRowType(), joinType.toJoinType(),
   getCluster().getTypeFactory(), null,
-  ImmutableList.of()));
+  ImmutableList.of()));
   case ANTI:
   case SEMI:
 return constructRowType(left.getRowType());
@@ -82,12 +82,19 @@ protected RelDataType deriveRowType() {
 }
   }
 
-  public int getInputSize(int offset, RelNode input) {
-if (this.excludeCorrelateColumn &&
-  offset == 0) {
-  return input.getRowType().getFieldList().size() - 1;
+  /**
+   * Returns number of fields in {@link RelDataType} for
+   * input rel node with specified ordinal considering value of
+   * {@code excludeCorrelateColumn}.
+   *
+   * @param ordinal ordinal of input rel node
+   * @return number of fields in input's {@link RelDataType}
+   */
+  public int getInputSize(int ordinal) {
+if (this.excludeCorrelateColumn && ordinal == 0) {
+  return getInput(ordinal).getRowType().getFieldList().size() - 1;
 }
-return input.getRowType().getFieldList().size();
+return getInput(ordinal).getRowType().getFieldList().size();
   }
 
   public RelDataType constructRowType(RelDataType inputRowType) {
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillLateralJoinRel.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillLateralJoinRel.java
index aa6ccb051b..4356d49104 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillLateralJoinRel.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillLateralJoinRel.java
@@ -50,7 +50,7 @@ public Correlate copy(RelTraitSet traitSet,
   public LogicalOperator implement(DrillImplementor implementor) {
 final List<String> fields = getRowType().getFieldNames();
 assert DrillJoinRel.isUnique(fields);
-final int leftCount = getInputSize(0,left);
+final int leftCount = getInputSize(0);
 
 final LogicalOperator leftOp = DrillJoinRel.implementInput(implementor, 0, 
0, left, this);
 final LogicalOperator rightOp = DrillJoinRel.implementInput(implementor, 
1, leftCount, right, this);
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillUnnestRule.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillUnnestRule.java
index 762eb46f31..ce0cd3c1aa 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillUnnestRul

[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531913#comment-16531913
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi commented on issue #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402290003
 
 
   AFAIK, currently only unnest with a single column is allowed. When several 
unnests are used, each will have its own Correlate rel node.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5365:
--
Issue Type: Bug  (was: Improvement)

> FileNotFoundException when reading a parquet file
> -
>
> Key: DRILL-5365
> URL: https://issues.apache.org/jira/browse/DRILL-5365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> The parquet file is generated through the following CTAS.
> To reproduce the issue: 1) two or more nodes cluster; 2) enable 
> impersonation; 3) set "fs.default.name": "file:///" in hive storage plugin; 
> 4) restart drillbits; 5) as a regular user, on node A, drop the table/file; 
> 6) ctas from a large enough hive table as source to recreate the table/file; 
> 7) query the table from node A should work; 8) query from node B as same user 
> should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531906#comment-16531906
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402287485
 
 
   @vrozov, thanks Vlad, working on it!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531900#comment-16531900
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

vrozov commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402286366
 
 
   @sachouche I guess you mean raise, not handle. For the rest:
   a) if you throw DrillRuntimeException, please see c)
   b) yes (move it to a method that uses an iterator; if it is `setSafe()`, 
then move it to that method)
   c) I would prefer InterruptedException over DrillRuntimeException, but if 
it is too much to propagate InterruptedException, I am OK with 
DrillRuntimeException.
   d) yes, when an exception is raised, the interrupt flag needs to be cleared 
to avoid throwing the exception again.
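 
   Putting (a)-(d) together, a minimal sketch (placement and naming are 
hypothetical; the idea is to host the helper on `DrillRuntimeException`, 
which is assumed to have the usual `RuntimeException`-style constructors):
   ```java
   import org.apache.drill.common.exceptions.DrillRuntimeException;

   // Hypothetical host class, for illustration only.
   final class InterruptChecks {
     private InterruptChecks() {}

     // Thread.interrupted() both tests AND clears the interrupt flag,
     // covering (d): the exception is raised at most once per interrupt.
     static void checkInterrupted() {
       if (Thread.interrupted()) {
         throw new DrillRuntimeException("Fragment thread interrupted; cancelling query");
       }
     }
   }
   ```
   A `setSafe()`-style write path would then call `checkInterrupted()` before 
writing, per (b), instead of checking inside `hasNext()`/`next()`.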


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531888#comment-16531888
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

amansinha100 commented on issue #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402282802
 
 
   @vvysotskyi one question about the `$complexRexNode`: suppose there are 2 
or more such nested columns, e.g. `a.b.c` and `a.b.d`, that need to be 
projected from the left side of the Lateral; then I would think that there 
would be 2 such fields: `$complexRexNode1` and `$complexRexNode2`. Is that 
correct?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531886#comment-16531886
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402282723
 
 
   @vrozov, sorry for the back & forth, but I cannot read your mind because 
you give very few details with your suggestions.
   
   Sure, I can handle the exception within the setSafe() method; I only have 
one shot at this because I will be OOO tonight and working on another JIRA. 
This is the summary (please stop me if you don't agree):
   a) move the static checkInterrupted() method into DrillRuntimeException
   b) move the checks into the setSafe() method
   c) setSafe() will throw a DrillRuntimeException if the interrupt flag is set
   d) use the Thread.interrupted() method to clear the flag after handling


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6576) Unnest reports incoming record counts incorrectly

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531882#comment-16531882
 ] 

ASF GitHub Bot commented on DRILL-6576:
---

parthchandra commented on issue #1362: DRILL-6576: Unnest reports incoming 
record counts incorrectly
URL: https://github.com/apache/drill/pull/1362#issuecomment-402281879
 
 
   Thanks Boaz. I'll make the changes you suggested and will merge this in.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unnest reports incoming record counts incorrectly
> -
>
> Key: DRILL-6576
> URL: https://issues.apache.org/jira/browse/DRILL-6576
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Parth Chandra
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531881#comment-16531881
 ] 

ASF GitHub Bot commented on DRILL-6579:
---

vrozov commented on issue #1361: DRILL-6579: Added sanity checks to the Parquet 
reader to avoid infini…
URL: https://github.com/apache/drill/pull/1361#issuecomment-402281790
 
 
   `Preconditions.checkArgument` usage should be limited to checking the 
arguments of a method, as it throws `IllegalArgumentException`. Please use 
`Preconditions.checkState` or other methods when not validating arguments.
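 
   A minimal illustration of the distinction (the class and field names are 
made up):
   ```java
   import com.google.common.base.Preconditions;

   final class BatchGuard {
     private boolean initialized;  // made-up state flag

     void init() { initialized = true; }

     void copy(int rowCount) {
       // Bad value supplied by the caller -> IllegalArgumentException
       Preconditions.checkArgument(rowCount >= 0, "rowCount must be >= 0, got %s", rowCount);
       // Object itself is in a bad state -> IllegalStateException
       Preconditions.checkState(initialized, "copy() called before init()");
     }
   }
   ```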


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531879#comment-16531879
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

vrozov commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402280613
 
 
   Yes, I would prefer to check for an interrupt in setSafe() (outside of an 
iterator).


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531877#comment-16531877
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

vrozov commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402280125
 
 
   I did not suggest introducing a new class. My suggestion was to move the 
method, for example, to DrillRuntimeException.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6553) Fix TopN for unnest operator

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531875#comment-16531875
 ] 

ASF GitHub Bot commented on DRILL-6553:
---

vvysotskyi commented on a change in pull request #1353: DRILL-6553: Fix TopN 
for unnest operator
URL: https://github.com/apache/drill/pull/1353#discussion_r199933855
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
 ##
 @@ -41,24 +41,22 @@
 
 
 public abstract class DrillLateralJoinRelBase extends Correlate implements 
DrillRelNode {
-
-  final private static double CORRELATE_MEM_COPY_COST = 
DrillCostBase.MEMORY_TO_CPU_RATIO * DrillCostBase.BASE_CPU_COST;
-  final public boolean excludeCorrelateColumn;
-  public DrillLateralJoinRelBase(RelOptCluster cluster, RelTraitSet traits, 
RelNode left, RelNode right, boolean excludeCorrelateCol,
-   CorrelationId correlationId, ImmutableBitSet 
requiredColumns, SemiJoinType semiJoinType) {
+final private static double CORRELATE_MEM_COPY_COST = 
DrillCostBase.MEMORY_TO_CPU_RATIO * DrillCostBase.BASE_CPU_COST;
+  final public boolean excludeCorrelateColumn;  public 
DrillLateralJoinRelBase(RelOptCluster cluster, RelTraitSet traits, RelNode 
left, RelNode right,boolean excludeCorrelateCol,
 
 Review comment:
   Sorry, for some reason these lines were moved after the rebase. Thanks, 
fixed. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix TopN for unnest operator
> 
>
> Key: DRILL-6553
> URL: https://issues.apache.org/jira/browse/DRILL-6553
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Plan for the query with unnest is chosen non-optimally:
> {code:sql}
> select customer.c_custkey, customer.c_name, t.o.o_orderkey,t.o.o_totalprice
> from dfs.`lateraljoin/multipleFiles` customer,
> unnest(customer.c_orders) t(o)
> order by customer.c_custkey, t.o.o_orderkey, t.o.o_totalprice
> limit 50
> {code}
> Plan:
> {noformat}
> 00-00Screen
> 00-01  ProjectAllowDup(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  Sort(sort0=[$0], sort1=[$2], sort2=[$3], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 00-06Project(c_custkey=[$2], c_name=[$3], EXPR$2=[ITEM($4, 
> 'o_orderkey')], EXPR$3=[ITEM($4, 'o_totalprice')])
> 00-07  LateralJoin(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{1}])
> 00-09Project(T0¦¦**=[$0], c_orders=[$1], c_custkey=[$2], 
> c_name=[$3])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`**`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> 00-08Project(c_orders0=[$0])
> 00-10  Unnest [srcOp=00-07] 
> {noformat}
> A similar query, but with flatten:
> {code:sql}
> select f.c_custkey, f.c_name, f.o.o_orderkey, f.o.o_totalprice from (select 
> c_custkey, c_name, flatten(c_orders) as o from 
> dfs.`lateraljoin/multipleFiles` customer) f order by f.c_custkey, 
> f.o.o_orderkey, f.o.o_totalprice limit 50
> {code}
> has plan:
> {noformat}
> 00-00Screen
> 00-01  Project(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  TopN(limit=[50])
> 00-06Project(c_custkey=[$0], c_name=[$1], EXPR$2=[ITEM($2, 
> 'o_orderkey')], EXPR$3=[ITEM($2, 'o_totalprice')])
> 00-07  Flatten(flattenField=[$2])
> 00-08Project(c_custkey=[$0], c_name=[$1], o=[$2])
> 00-09  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, c

[jira] [Updated] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6579:
-
Labels: ready-to-commit  (was: pull-request-available ready-to-commit)

> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6553) Fix TopN for unnest operator

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531868#comment-16531868
 ] 

ASF GitHub Bot commented on DRILL-6553:
---

vvysotskyi commented on a change in pull request #1353: DRILL-6553: Fix TopN 
for unnest operator
URL: https://github.com/apache/drill/pull/1353#discussion_r199933855
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
 ##
 @@ -41,24 +41,22 @@
 
 
 public abstract class DrillLateralJoinRelBase extends Correlate implements 
DrillRelNode {
-
-  final private static double CORRELATE_MEM_COPY_COST = 
DrillCostBase.MEMORY_TO_CPU_RATIO * DrillCostBase.BASE_CPU_COST;
-  final public boolean excludeCorrelateColumn;
-  public DrillLateralJoinRelBase(RelOptCluster cluster, RelTraitSet traits, 
RelNode left, RelNode right, boolean excludeCorrelateCol,
-   CorrelationId correlationId, ImmutableBitSet 
requiredColumns, SemiJoinType semiJoinType) {
+final private static double CORRELATE_MEM_COPY_COST = 
DrillCostBase.MEMORY_TO_CPU_RATIO * DrillCostBase.BASE_CPU_COST;
+  final public boolean excludeCorrelateColumn;  public 
DrillLateralJoinRelBase(RelOptCluster cluster, RelTraitSet traits, RelNode 
left, RelNode right,boolean excludeCorrelateCol,
 
 Review comment:
   Sorry, for some reason these lines were moved after the rebase.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix TopN for unnest operator
> 
>
> Key: DRILL-6553
> URL: https://issues.apache.org/jira/browse/DRILL-6553
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Plan for the query with unnest is chosen non-optimally:
> {code:sql}
> select customer.c_custkey, customer.c_name, t.o.o_orderkey,t.o.o_totalprice
> from dfs.`lateraljoin/multipleFiles` customer,
> unnest(customer.c_orders) t(o)
> order by customer.c_custkey, t.o.o_orderkey, t.o.o_totalprice
> limit 50
> {code}
> Plan:
> {noformat}
> 00-00Screen
> 00-01  ProjectAllowDup(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  Sort(sort0=[$0], sort1=[$2], sort2=[$3], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 00-06Project(c_custkey=[$2], c_name=[$3], EXPR$2=[ITEM($4, 
> 'o_orderkey')], EXPR$3=[ITEM($4, 'o_totalprice')])
> 00-07  LateralJoin(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{1}])
> 00-09Project(T0¦¦**=[$0], c_orders=[$1], c_custkey=[$2], 
> c_name=[$3])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`**`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> 00-08Project(c_orders0=[$0])
> 00-10  Unnest [srcOp=00-07] 
> {noformat}
> A similar query, but with flatten:
> {code:sql}
> select f.c_custkey, f.c_name, f.o.o_orderkey, f.o.o_totalprice from (select 
> c_custkey, c_name, flatten(c_orders) as o from 
> dfs.`lateraljoin/multipleFiles` customer) f order by f.c_custkey, 
> f.o.o_orderkey, f.o.o_totalprice limit 50
> {code}
> has plan:
> {noformat}
> 00-00Screen
> 00-01  Project(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  TopN(limit=[50])
> 00-06Project(c_custkey=[$0], c_name=[$1], EXPR$2=[ITEM($2, 
> 'o_orderkey')], EXPR$3=[ITEM($2, 'o_totalprice')])
> 00-07  Flatten(flattenField=[$2])
> 00-08Project(c_custkey=[$0], c_name=[$1], o=[$2])
> 00-09  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`c_custke

[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531841#comment-16531841
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402266591
 
 
   Suggestion made by @vrozov... though I agree, the new method doesn't do 
much.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5365:
--
Priority: Minor  (was: Major)

> FileNotFoundException when reading a parquet file
> -
>
> Key: DRILL-5365
> URL: https://issues.apache.org/jira/browse/DRILL-5365
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Timothy Farkas
>Priority: Minor
> Fix For: 1.14.0
>
>
> The parquet file is generated through the following CTAS.
> To reproduce the issue: 1) two or more nodes cluster; 2) enable 
> impersonation; 3) set "fs.default.name": "file:///" in hive storage plugin; 
> 4) restart drillbits; 5) as a regular user, on node A, drop the table/file; 
> 6) ctas from a large enough hive table as source to recreate the table/file; 
> 7) query the table from node A should work; 8) query from node B as same user 
> should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread Timothy Farkas (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531839#comment-16531839
 ] 

Timothy Farkas commented on DRILL-5365:
---

[~cch...@maprtech.com] tried reproducing the issue a week and a half ago. It 
seems it cannot be reproduced. He ran the following branch 
https://github.com/ilooner/drill/tree/DRILL-5365-justlog with some debug 
messages added, and it produced the following logs. I will look at the logs to 
see if there is anything obvious; however, it looks like this issue was 
inadvertently fixed by some other changes. At this point I would like to 
change this jira to a minor issue and focus on cleaning up and documenting 
some pieces of the code, so it is easier to understand.

{code}
2018-06-19 17:03:53,539 [24d66616-2319-b32b-30d3-5881533771cb:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
24d66616-2319-b32b-30d3-5881533771cb: create table md1375test1 as select 
count(*) as cnt from hive.tpch01_parquet_nodate.customer
2018-06-19 17:03:53,540 [24d66616-2319-b32b-30d3-5881533771cb:foreman] INFO  
o.a.d.exec.store.dfs.DrillFileSystem - Configuration for the DrillFileSystem 
maprfs:/// maprfs:///, underlyingFs: maprfs:///
2018-06-19 17:03:53,541 [24d66616-2319-b32b-30d3-5881533771cb:foreman] INFO  
o.a.d.exec.store.dfs.DrillFileSystem - Who made me? {}
java.lang.RuntimeException: Who made me?
at 
org.apache.drill.exec.store.dfs.DrillFileSystem.(DrillFileSystem.java:101)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.util.ImpersonationUtil$2.run(ImpersonationUtil.java:219) 
[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.util.ImpersonationUtil$2.run(ImpersonationUtil.java:216) 
[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at java.security.AccessController.doPrivileged(Native Method) 
[na:1.8.0_144]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_144]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
 [hadoop-common-2.7.0-mapr-1707.jar:na]
at 
org.apache.drill.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:216)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:208)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.FileSystemSchemaFactory$FileSystemSchema.(FileSystemSchemaFactory.java:89)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.FileSystemSchemaFactory.registerSchemas(FileSystemSchemaFactory.java:77)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.FileSystemPlugin.registerSchemas(FileSystemPlugin.java:157)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.calcite.jdbc.DynamicRootSchema.loadSchemaFactory(DynamicRootSchema.java:81)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.16.0-drill-r3]
at 
org.apache.calcite.jdbc.DynamicRootSchema.getImplicitSubSchema(DynamicRootSchema.java:66)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.16.0-drill-r3]
at 
org.apache.calcite.jdbc.CalciteSchema.getSubSchema(CalciteSchema.java:233) 
[calcite-core-1.16.0-drill-r3.jar:1.16.0-drill-r3]
at 
org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getSubSchema(CalciteSchema.java:600)
 [calcite-core-1.16.0-drill-r3.jar:1.16.0-drill-r3]
at 
org.apache.drill.exec.planner.sql.SchemaUtilites.searchSchemaTree(SchemaUtilites.java:106)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.SchemaUtilites.findSchema(SchemaUtilites.java:51)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.SchemaUtilites.findSchema(SchemaUtilites.java:77)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.user.UserSession.getDefaultSchema(UserSession.java:237)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.QueryContext.getNewDefaultSchema(QueryContext.java:138)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.SqlConverter.(SqlConverter.java:130) 
[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:105)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:83)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:567) 
[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
at org.apache.dri

[jira] [Updated] (DRILL-5365) FileNotFoundException when reading a parquet file

2018-07-03 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-5365:
--
Issue Type: Improvement  (was: Bug)

> FileNotFoundException when reading a parquet file
> -
>
> Key: DRILL-5365
> URL: https://issues.apache.org/jira/browse/DRILL-5365
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive
>Affects Versions: 1.10.0
>Reporter: Chun Chang
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.14.0
>
>
> The parquet file is generated through the following CTAS.
> To reproduce the issue: 1) two or more nodes cluster; 2) enable 
> impersonation; 3) set "fs.default.name": "file:///" in hive storage plugin; 
> 4) restart drillbits; 5) as a regular user, on node A, drop the table/file; 
> 6) ctas from a large enough hive table as source to recreate the table/file; 
> 7) query the table from node A should work; 8) query from node B as same user 
> should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2018-07-03 Thread salim achouche (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531837#comment-16531837
 ] 

salim achouche commented on DRILL-6517:
---

If this is the case, then I'll fix that; though my impression is that the 
exception is thrown in the last hash join (HJ), where both inputs came from 
non-parquet operators. I am currently re-running the test with new 
instrumentation.

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Critical
> Fix For: 1.14.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> TPC-DS query is Canceled after 2 hrs and 47 mins and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log
> Steps to reproduce the problem, query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u 
> "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=" -f 
> /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this 
> vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.(RecordBatchSizer.java:690)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.(RecordBatchSizer.java:662)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:73)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:79)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:242)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:218)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.r

[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531829#comment-16531829
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche edited a comment on issue #1360: DRILL-6578: Handle query 
cancellation in Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402262585
 
 
   So you suggest that I update the setSafe() method to throw the checked 
exception InterruptedException. Can we find a middle ground, as the main point 
here is to handle cancellation? What about going back to my original 
implementation, which threw a DrillException when the interrupt flag is set?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531827#comment-16531827
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

Ben-Zvi commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402264413
 
 
   Is the special file/class RuntimeUtils needed? (I.e., instead of calling 
Thread.interrupted() directly.)
   I may be a bit biased against the proliferation of files/classes that don't 
do much work.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6496) VectorUtil.showVectorAccessibleContent does not log vector content

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531825#comment-16531825
 ] 

ASF GitHub Bot commented on DRILL-6496:
---

ilooner commented on issue #1336: DRILL-6496: Added missing logging statement 
in VectorUtil.showVectorAccessibleContent(VectorAccessible va, int[] 
columnWidths)
URL: https://github.com/apache/drill/pull/1336#issuecomment-402263811
 
 
   @arina-ielchiieva We still allow System.out and System.err in our checkstyle 
rules.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> VectorUtil.showVectorAccessibleContent does not log vector content
> --
>
> Key: DRILL-6496
> URL: https://issues.apache.org/jira/browse/DRILL-6496
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Arina Ielchiieva
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.14.0
>
>
> {{VectorUtil.showVectorAccessibleContent(VectorAccessible va, int[] 
> columnWidths)}} does not log vector content. Introduced after DRILL-6438.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531817#comment-16531817
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402262585
 
 
   So you suggest that I update the setSafe() method to throw the checked 
exception InterruptedException. Can we find a middle ground, as the main point 
here is to handle cancellation? What about going back to my original 
implementation, which threw a DrillException when the interrupt flag is set?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread salim achouche (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

salim achouche updated DRILL-6579:
--
Labels: pull-request-available ready-to-commit  (was: 
pull-request-available)

> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available, ready-to-commit
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531812#comment-16531812
 ] 

ASF GitHub Bot commented on DRILL-6579:
---

sachouche commented on issue #1361: DRILL-6579: Added sanity checks to the 
Parquet reader to avoid infini…
URL: https://github.com/apache/drill/pull/1361#issuecomment-402260287
 
 
   Thank you @Ben-Zvi for the review!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531808#comment-16531808
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

vrozov commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402258967
 
 
   @sachouche No, there is no point in throwing `InterruptedException` and 
catching it in the same method. I would prefer to check for the thread being 
interrupted in a method that calls `hasNext()` and `next()`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6553) Fix TopN for unnest operator

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531803#comment-16531803
 ] 

ASF GitHub Bot commented on DRILL-6553:
---

HanumathRao commented on a change in pull request #1353: DRILL-6553: Fix TopN 
for unnest operator
URL: https://github.com/apache/drill/pull/1353#discussion_r199915644
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillLateralJoinRelBase.java
 ##
 @@ -41,24 +41,22 @@
 
 
 public abstract class DrillLateralJoinRelBase extends Correlate implements 
DrillRelNode {
-
-  final private static double CORRELATE_MEM_COPY_COST = 
DrillCostBase.MEMORY_TO_CPU_RATIO * DrillCostBase.BASE_CPU_COST;
-  final public boolean excludeCorrelateColumn;
-  public DrillLateralJoinRelBase(RelOptCluster cluster, RelTraitSet traits, 
RelNode left, RelNode right, boolean excludeCorrelateCol,
-   CorrelationId correlationId, ImmutableBitSet 
requiredColumns, SemiJoinType semiJoinType) {
+final private static double CORRELATE_MEM_COPY_COST = 
DrillCostBase.MEMORY_TO_CPU_RATIO * DrillCostBase.BASE_CPU_COST;
+  final public boolean excludeCorrelateColumn;  public 
DrillLateralJoinRelBase(RelOptCluster cluster, RelTraitSet traits, RelNode 
left, RelNode right,boolean excludeCorrelateCol,
 
 Review comment:
   I think there is some indentation problem here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix TopN for unnest operator
> 
>
> Key: DRILL-6553
> URL: https://issues.apache.org/jira/browse/DRILL-6553
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Plan for the query with unnest is chosen non-optimally:
> {code:sql}
> select customer.c_custkey, customer.c_name, t.o.o_orderkey,t.o.o_totalprice
> from dfs.`lateraljoin/multipleFiles` customer,
> unnest(customer.c_orders) t(o)
> order by customer.c_custkey, t.o.o_orderkey, t.o.o_totalprice
> limit 50
> {code}
> Plan:
> {noformat}
> 00-00Screen
> 00-01  ProjectAllowDup(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  Sort(sort0=[$0], sort1=[$2], sort2=[$3], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 00-06Project(c_custkey=[$2], c_name=[$3], EXPR$2=[ITEM($4, 
> 'o_orderkey')], EXPR$3=[ITEM($4, 'o_totalprice')])
> 00-07  LateralJoin(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{1}])
> 00-09Project(T0¦¦**=[$0], c_orders=[$1], c_custkey=[$2], 
> c_name=[$3])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`**`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> 00-08Project(c_orders0=[$0])
> 00-10  Unnest [srcOp=00-07] 
> {noformat}
> A similar query, but with flatten:
> {code:sql}
> select f.c_custkey, f.c_name, f.o.o_orderkey, f.o.o_totalprice from (select 
> c_custkey, c_name, flatten(c_orders) as o from 
> dfs.`lateraljoin/multipleFiles` customer) f order by f.c_custkey, 
> f.o.o_orderkey, f.o.o_totalprice limit 50
> {code}
> has plan:
> {noformat}
> 00-00Screen
> 00-01  Project(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  TopN(limit=[50])
> 00-06Project(c_custkey=[$0], c_name=[$1], EXPR$2=[ITEM($2, 
> 'o_orderkey')], EXPR$3=[ITEM($2, 'o_totalprice')])
> 00-07  Flatten(flattenField=[$2])
> 00-08Project(c_custkey=[$0], c_name=[$1], o=[$2])
> 00-09  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`c_custkey`, `c_name`, `

[jira] [Commented] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531797#comment-16531797
 ] 

ASF GitHub Bot commented on DRILL-6543:
---

ilooner commented on a change in pull request #1351: DRILL-6543: Disable Hash 
Join fallback, add percent_reserved_allowance_from_direct
URL: https://github.com/apache/drill/pull/1351#discussion_r199914352
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/util/MemoryAllocationUtilities.java
 ##
 @@ -138,16 +139,36 @@ public static long computeOperatorMemory(OptionSet 
optionManager, long maxAllocP
   @VisibleForTesting
   public static long computeQueryMemory(DrillConfig config, OptionSet 
optionManager, long directMemory) {
 
+// Get the options
+double percent_per_query = 
optionManager.getOption(ExecConstants.PERCENT_MEMORY_PER_QUERY);
 
 Review comment:
   Minor nitpick, but the convention in Java is to use camel case for names:
   
   `percentPerQuery`


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent 
> many of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +can not+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as 
> it was written when the Hash-Join was non-buffered; i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the highest calculated between 
> the MQMPN and the PPQ. The new reserve option would verify that this figure 
> allows the allowance.
>  
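> A small illustration of the budgeting arithmetic above (the numbers come from 
> the example; the 25% reserved allowance is the proposed default, and the 
> variable names are only for illustration):
> {code:java}
> long directMemory = 100L * 1024 * 1024;    // 100 MB of Direct Memory
> double reservedAllowance = 0.25;           // kept back for non-buffered operators
> long bufferedBudget = Math.round(directMemory * (1.0 - reservedAllowance)); // 75 MB
> long perOperator = bufferedBudget / 5;     // 15 MB per Hash-Join, not 20 MB
> {code}
> With the allowance in place, the five Hash-Joins are promised 75 MB in total, 
> leaving 25 MB for the scanners, senders, etc., instead of promising all 100 MB.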



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531800#comment-16531800
 ] 

ASF GitHub Bot commented on DRILL-6543:
---

ilooner commented on a change in pull request #1351: DRILL-6543: Disable Hash 
Join fallback, add percent_reserved_allowance_from_direct
URL: https://github.com/apache/drill/pull/1351#discussion_r199910775
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/util/TestQueryMemoryAlloc.java
 ##
 @@ -120,6 +123,53 @@ public void testCustomPercent() throws Exception {
 }
   }
 
+  /**
+   *  Test that the percent_reserved_allowance_from_direct limitation works
+   */
+  @Test
+  public void testReservedAllowanceCheck() throws Exception {
+OperatorFixture.Builder builder = OperatorFixture.builder(dirTestWatcher);
 
 Review comment:
   Thanks for adding a unit test.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent 
> many of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +can not+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as 
> it was written when the Hash-Join was non-buffered; i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the highest calculated between 
> the MQMPN and the PPQ. The new reserve option would verify that this figure 
> allows the allowance.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531799#comment-16531799
 ] 

ASF GitHub Bot commented on DRILL-6543:
---

ilooner commented on a change in pull request #1351: DRILL-6543: Disable Hash 
Join fallback, add percent_reserved_allowance_from_direct
URL: https://github.com/apache/drill/pull/1351#discussion_r199913761
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/util/MemoryAllocationUtilities.java
 ##
 @@ -138,16 +139,36 @@ public static long computeOperatorMemory(OptionSet 
optionManager, long maxAllocP
   @VisibleForTesting
   public static long computeQueryMemory(DrillConfig config, OptionSet 
optionManager, long directMemory) {
 
+// Get the options
+double percent_per_query = 
optionManager.getOption(ExecConstants.PERCENT_MEMORY_PER_QUERY);
+long max_query_per_node = 
optionManager.getOption(ExecConstants.MAX_QUERY_MEMORY_PER_NODE);
+double percent_allowance = 
optionManager.getOption(ExecConstants.PERCENT_RESERVED_ALLOWANCE_FROM_DIRECT);
+
+// verify that the allowance is kept
+if ( percent_per_query + percent_allowance > 1.0 ) {
 
 Review comment:
   Why do we need to make sure these add to one? Couldn't we just reserve our 
allowance from the previously computed maxAllocPerNode value? For example:
   
   ```
   double percentAllowance =
       optionManager.getOption(ExecConstants.PERCENT_RESERVED_ALLOWANCE_FROM_DIRECT);
   
   // Memory computed as a percent of total memory.
   long perQueryMemory = Math.round(directMemory *
       optionManager.getOption(ExecConstants.PERCENT_MEMORY_PER_QUERY));
   
   // But, must allow at least the amount given explicitly for
   // backward compatibility.
   perQueryMemory = Math.max(perQueryMemory,
       optionManager.getOption(ExecConstants.MAX_QUERY_MEMORY_PER_NODE));
   
   // Compute again as either the total direct memory, or the
   // configured maximum top-level allocation (10 GB).
   long maxAllocPerNode = Math.min(directMemory,
       config.getLong(RootAllocatorFactory.TOP_LEVEL_MAX_ALLOC));
   
   // Final amount per node per query is the minimum of these two.
   maxAllocPerNode = Math.min(maxAllocPerNode, perQueryMemory);
   
   // Deduct the non-buffered allowance.
   return Math.round(maxAllocPerNode * (1.0 - percentAllowance));
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent 
> many of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +can not+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as 
> it was written when the Hash-Join was non-buffered; i.e., before spill was 
> impl

[jira] [Commented] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531798#comment-16531798
 ] 

ASF GitHub Bot commented on DRILL-6543:
---

ilooner commented on a change in pull request #1351: DRILL-6543: Disable Hash 
Join fallback, add percent_reserved_allowance_from_direct
URL: https://github.com/apache/drill/pull/1351#discussion_r199909294
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
 ##
 @@ -490,6 +491,31 @@ private ExecConstants() {
   public static DoubleValidator PERCENT_MEMORY_PER_QUERY = new 
RangeDoubleValidator(
   PERCENT_MEMORY_PER_QUERY_KEY, 0, 1.0);
 
+  /**
+   * Enforce reserving some percentage of the JVM's Direct Memory for the
+   * non-buffered operators. I.e., whether using max_query_memory_per_node or
+   * percent_per_query, the memory result (for all the buffered operators) can
+   * not exceed the size of the Direct Memory minus this reserved allowance.
+   * <p>
+   * This allowance is needed to prevent a potential OOM. In case the total memory
+   * promised for the buffered operators is very close to the Direct Memory's size,
+   * then if some non-buffered operators (e.g., scanners) also grab significant memory,
+   * then the remaining memory is less than the promised memory, leading to an OOM.
+   * </p>
+   * <p>
+   * Note that this enforcement is only good for the non-concurrent case. For multiple
+   * concurrent queries, other queries may grab some unaccounted-for Direct Memory. For
+   * concurrent work, better to set the option "planner.memory.percent_per_query" correctly.
+   * </p>
+   * <p>
+   * DEFAULT: 25%
+   * </p>
+   */
 
 Review comment:
   Thanks for adding this documentation.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.15.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent 
> many of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +can not+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as 
> it was written when the Hash-Join was non-buffered; i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the highest calculated between 
> the MQMPN and the PPQ. The new reserve option would verify that this figure 
> allows the allowance.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5571) Unable to cancel running queries from Web UI

2018-07-03 Thread Kunal Khatua (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-5571:

Fix Version/s: 1.15.0

> Unable to cancel running queries from Web UI
> 
>
> Key: DRILL-5571
> URL: https://issues.apache.org/jira/browse/DRILL-5571
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.11.0
>Reporter: Kedar Sankar Behera
>Priority: Major
> Fix For: 1.15.0
>
>
> We are unable to access profiles of some running queries. Hit the following 
> error on the Web UI:
> {code}
> {
>   "errorMessage" : "VALIDATION ERROR: No profile with given query id 
> '26c90b95-928b-15e3-bedc-bfb4a046cc8b' exists. Please verify the query 
> id.\n\n\n[Error Id: e6896a23-6932-469d-9968-d315fdd06dd4 ]"
> }
> {code}
> And we cannot cancel the running queries whose profile page can be accessed:
> {code}
> Failure attempting to cancel query 26c90b33-cf7e-0495-8f76-55220f71f809.  
> Unable to find information about where query is actively running.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5897) Support Query Cancellation when WebConnection is closed on client side both for authenticated and unauthenticated user's

2018-07-03 Thread Kunal Khatua (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-5897:

Fix Version/s: (was: Future)
   1.15.0

> Support Query Cancellation when WebConnection is closed on client side both 
> for authenticated and unauthenticated user's
> 
>
> Key: DRILL-5897
> URL: https://issues.apache.org/jira/browse/DRILL-5897
> Project: Apache Drill
>  Issue Type: Task
>  Components: Web Server
>Reporter: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.15.0
>
>
> Today there is no session created (using cookies) for an unauthenticated 
> WebUser, whereas for authenticated users a session is created. Also, when a 
> user submits a query, we wait until the entire result is gathered on the 
> WebServer side and then send the entire Webpage in the response (probably 
> that's how ftl works).
> For authenticated users we only cancel the in-flight queries when the session 
> is invalidated (either by timeout or logout). However, in the absence of a 
> session we do nothing for unauthenticated users, so once a query is submitted 
> it will run until it fails or succeeds. The only way to explicitly cancel a 
> query is from the profile page, which will not work when profiles are 
> disabled.
> We should research whether it's possible to get the underlying WebConnection 
> (not session) close event and cancel the queries running as part of that 
> connection in response to the close event. Also, since today we wait for the 
> entire query to finish on the backend server before sending the response 
> back, by the time a bad connection is detected it doesn't make sense to 
> cancel (there is a 1:1 mapping between request and connection) since the 
> query is already completed. Instead, we could send the header followed by 
> batches of data (pagination); then we can detect early enough whether the 
> connection is valid and cancel the query in response to that. More research 
> is needed in this area, along with knowledge of Jetty on how this can be 
> achieved, to make our WebServer more performant.
>  It would also be good to explore whether we can provide sessions for 
> unauthenticated user connections too, based on a timeout, and then handle 
> query cancellation as part of the session timeout. This will also impact the 
> way we support the impersonation-without-authentication scenario, where we 
> ask the user to input the query user name for each request. If we support 
> sessions then the username should be handled at the session level rather than 
> per request, which can be achieved by logging the user in without a password 
> (similar to the authentication flow).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-2035) Add ability to cancel multiple queries

2018-07-03 Thread Kunal Khatua (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-2035:

Fix Version/s: (was: Future)
   1.15.0

> Add ability to cancel multiple queries
> --
>
> Key: DRILL-2035
> URL: https://issues.apache.org/jira/browse/DRILL-2035
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Client - HTTP, Web Server
>Reporter: Neeraja
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently Drill UI allows canceling one query at a time.
> This could be cumbersome to manage for scenarios using with BI tools which 
> generate multiple queries for a single action in the UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6576) Unnest reports incoming record counts incorrectly

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531770#comment-16531770
 ] 

ASF GitHub Bot commented on DRILL-6576:
---

Ben-Zvi commented on a change in pull request #1362: DRILL-6576: Unnest reports 
incoming record counts incorrectly
URL: https://github.com/apache/drill/pull/1362#discussion_r199905031
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
 ##
 @@ -207,32 +208,30 @@ public IterOutcome innerNext() {
   } finally {
 stats.stopSetup();
   }
-  // since we never called next on an upstream operator, incoming stats are
-  // not updated. update input stats explicitly.
-  stats.batchReceived(0, incoming.getRecordCount(), true);
   return IterOutcome.OK_NEW_SCHEMA;
 } else {
   assert state != BatchState.FIRST : "First batch should be OK_NEW_SCHEMA";
 
 Review comment:
   Unrelated to the change, but this "assert" is dead code (due to the if() 
above).
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unnest reports incoming record counts incorrectly
> -
>
> Key: DRILL-6576
> URL: https://issues.apache.org/jira/browse/DRILL-6576
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Parth Chandra
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6576) Unnest reports incoming record counts incorrectly

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531771#comment-16531771
 ] 

ASF GitHub Bot commented on DRILL-6576:
---

Ben-Zvi commented on a change in pull request #1362: DRILL-6576: Unnest reports 
incoming record counts incorrectly
URL: https://github.com/apache/drill/pull/1362#discussion_r199908575
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unnest/UnnestRecordBatch.java
 ##
 @@ -207,32 +208,30 @@ public IterOutcome innerNext() {
   } finally {
 stats.stopSetup();
   }
-  // since we never called next on an upstream operator, incoming stats are
-  // not updated. update input stats explicitly.
-  stats.batchReceived(0, incoming.getRecordCount(), true);
   return IterOutcome.OK_NEW_SCHEMA;
 } else {
   assert state != BatchState.FIRST : "First batch should be OK_NEW_SCHEMA";
   container.zeroVectors();
-
   // Check if schema has changed
-  if (lateral.getRecordIndex() == 0 && schemaChanged()) {
-hasRemainder = true; // next call to next will handle the actual 
data.
-try {
-  setupNewSchema();
-} catch (SchemaChangeException ex) {
-  kill(false);
-  logger.error("Failure during query", ex);
-  context.getExecutorState().fail(ex);
-  return IterOutcome.STOP;
-}
-stats.batchReceived(0, incoming.getRecordCount(), true);
-return OK_NEW_SCHEMA;
-  }
   if (lateral.getRecordIndex() == 0) {
-unnest.resetGroupIndex();
+boolean isNewSchema = schemaChanged();
+if (isNewSchema) {
+  hasRemainder = true; // next call to next will handle the actual 
data.
+  stats.batchReceived(0, incoming.getRecordCount(), isNewSchema);
 
 Review comment:
   (Very minor) The same stats call is in both sides of the if(); can be called 
once before the if(). Also can erase the "else" as the first part of the if() 
returns. 
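
   For reference, a sketch of the hoisted shape (using the names from the diff 
above; the tail of the method is elided):
   
   ```
   // Make the common stats update once, then branch only on the schema change.
   stats.batchReceived(0, incoming.getRecordCount(), isNewSchema);
   if (isNewSchema) {
     hasRemainder = true;  // next call to next() will handle the actual data
     return OK_NEW_SCHEMA;
   }
   unnest.resetGroupIndex();
   // ... continue with the regular path
   ```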
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unnest reports incoming record counts incorrectly
> -
>
> Key: DRILL-6576
> URL: https://issues.apache.org/jira/browse/DRILL-6576
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Parth Chandra
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531767#comment-16531767
 ] 

ASF GitHub Bot commented on DRILL-6579:
---

sachouche commented on issue #1361: DRILL-6579: Added sanity checks to the 
Parquet reader to avoid infini…
URL: https://github.com/apache/drill/pull/1361#issuecomment-402250181
 
 
   @Ben-Zvi 
   A Drill user (version 1.12) reported a Parquet issue which indicated two 
different problems:
   a) The Parquet code was running within an infinite loop; we don't have any 
repro for this issue. Code inspection showed that if the number of rows to load 
is zero then this condition could arise. So I fixed this possibility and added 
sanity checks.
   b) When an infinite loop arises, the user cannot cancel the query; so I 
created another JIRA, DRILL-6578, to detect thread interruption within the main 
processing loop to ensure query cancellation always succeeds.
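   
   A minimal sketch of the kind of guard (a) describes; the names here are 
illustrative, not the actual reader code:
   
   ```
   int remaining = rowsToLoad;
   while (remaining > 0) {
     int loaded = loadNextChunk(remaining);  // hypothetical batch-load call
     // Sanity check: a non-positive result means no progress was made;
     // fail fast instead of spinning forever.
     if (loaded <= 0) {
       throw new IllegalStateException("Parquet reader made no progress");
     }
     remaining -= loaded;
   }
   ```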


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6576) Unnest reports incoming record counts incorrectly

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531720#comment-16531720
 ] 

ASF GitHub Bot commented on DRILL-6576:
---

parthchandra opened a new pull request #1362: DRILL-6576: Unnest reports 
incoming record counts incorrectly
URL: https://github.com/apache/drill/pull/1362
 
 
   Minor fix to correct the record count reported by unnest
   @sohami, @Ben-Zvi, please review


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Unnest reports incoming record counts incorrectly
> -
>
> Key: DRILL-6576
> URL: https://issues.apache.org/jira/browse/DRILL-6576
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Parth Chandra
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531711#comment-16531711
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka closed pull request #1345: DRILL-6494: Drill Plugins Handler
URL: https://github.com/apache/drill/pull/1345
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/common/pom.xml b/common/pom.xml
index a7fba2bf16..a8cab1075d 100644
--- a/common/pom.xml
+++ b/common/pom.xml
@@ -53,13 +53,11 @@
     <dependency>
       <groupId>com.typesafe</groupId>
       <artifactId>config</artifactId>
-      <version>1.0.0</version>
     </dependency>
 
     <dependency>
       <groupId>org.apache.commons</groupId>
       <artifactId>commons-lang3</artifactId>
-      <version>3.1</version>
     </dependency>
 
diff --git 
a/common/src/main/java/org/apache/drill/common/config/CommonConstants.java 
b/common/src/main/java/org/apache/drill/common/config/CommonConstants.java
index 1b5fb29c01..e203972b8e 100644
--- a/common/src/main/java/org/apache/drill/common/config/CommonConstants.java
+++ b/common/src/main/java/org/apache/drill/common/config/CommonConstants.java
@@ -31,4 +31,7 @@
   /** Override configuration file name.  (Classpath resource pathname.) */
   String CONFIG_OVERRIDE_RESOURCE_PATHNAME = "drill-override.conf";
 
+  /** Override plugins configs file name.  (Classpath resource pathname.) */
+  String STORAGE_PLUGINS_OVERRIDE_CONF = "storage-plugins-override.conf";
+
 }
diff --git 
a/common/src/main/java/org/apache/drill/common/config/DrillConfig.java 
b/common/src/main/java/org/apache/drill/common/config/DrillConfig.java
index 66058643da..7211f19363 100644
--- a/common/src/main/java/org/apache/drill/common/config/DrillConfig.java
+++ b/common/src/main/java/org/apache/drill/common/config/DrillConfig.java
@@ -261,7 +261,7 @@ private static DrillConfig create(String 
overrideFileResourcePathname,
 final String className = getString(location);
 if (className == null) {
   throw new DrillConfigurationException(String.format(
-  "No class defined at location '%s'. Expected a definition of the 
class []",
+  "No class defined at location '%s'. Expected a definition of the 
class [%s]",
   location, clazz.getCanonicalName()));
 }
 
diff --git 
a/common/src/main/java/org/apache/drill/common/scanner/ClassPathScanner.java 
b/common/src/main/java/org/apache/drill/common/scanner/ClassPathScanner.java
index 13a5eade9a..909e8110df 100644
--- a/common/src/main/java/org/apache/drill/common/scanner/ClassPathScanner.java
+++ b/common/src/main/java/org/apache/drill/common/scanner/ClassPathScanner.java
@@ -51,7 +51,6 @@
 import com.google.common.base.Stopwatch;
 import com.google.common.collect.HashMultimap;
 import com.google.common.collect.Multimap;
-import com.google.common.collect.Sets;
 
 import javassist.bytecode.AccessFlag;
 import javassist.bytecode.AnnotationsAttribute;
@@ -320,15 +319,12 @@ public void scan(final Object cls) {
*   to scan for (relative to specified class loaders' classpath 
roots)
* @param  returnRootPathname  whether to collect classpath root portion of
*   URL for each resource instead of full URL of each resource
-   * @param  classLoaders  set of class loaders in which to look up resource;
-   *   none (empty array) to specify to use current thread's context
-   *   class loader and {@link Reflections}'s class loader
* @returns  ...; empty set if none
*/
   public static Set<URL> forResource(final String resourcePathname, final 
boolean returnRootPathname) {
 logger.debug("Scanning classpath for resources with pathname \"{}\".",
  resourcePathname);
-final Set<URL> resultUrlSet = Sets.newHashSet();
+final Set<URL> resultUrlSet = new HashSet<>();
 final ClassLoader classLoader = ClassPathScanner.class.getClassLoader();
 try {
   final Enumeration<URL> resourceUrls = 
classLoader.getResources(resourcePathname);
diff --git a/common/src/main/java/org/apache/drill/exec/util/ActionOnFile.java 
b/common/src/main/java/org/apache/drill/exec/util/ActionOnFile.java
new file mode 100644
index 00..cca1e771d7
--- /dev/null
+++ b/common/src/main/java/org/apache/drill/exec/util/ActionOnFile.java
@@ -0,0 +1,87 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed 

[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531693#comment-16531693
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche edited a comment on issue #1360: DRILL-6578: Handle query 
cancellation in Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402226927
 
 
   @vrozov, please confirm if this is what you are suggesting:
   
   boolean hasNext() {
     try {
       checkInterrupted();
     } catch (InterruptedException e) {
       throw new DrillRuntimeException(e);
     }
     // ... rest of hasNext()
   }
   
   same for next()
   
   NOTE -
   o If I throw InterruptedException within the loop, then this would mean I'll 
have to change the setSafe() APIs to also throw InterruptedException
   o This will be a bigger change as all consumer methods (of setSafe()) will 
also need to handle this exception


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531688#comment-16531688
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402226927
 
 
   @vrozov, please confirm if this is what you are suggesting:
   
   boolean hasNext() {
     try {
       checkInterrupted();
     } catch (InterruptedException e) {
       throw new DrillRuntimeException(e);
     }
     // ... rest of hasNext()
   }
   
   same for next()


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6576) Unnest reports incoming record counts incorrectly

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6576:
-
Fix Version/s: 1.14.0

> Unnest reports incoming record counts incorrectly
> -
>
> Key: DRILL-6576
> URL: https://issues.apache.org/jira/browse/DRILL-6576
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Parth Chandra
>Assignee: Parth Chandra
>Priority: Major
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531675#comment-16531675
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

ppadma commented on issue #1348: DRILL-6346: Create an Official Drill Docker 
Container
URL: https://github.com/apache/drill/pull/1348#issuecomment-402223870
 
 
   @Agirish This is very useful Abhishek. Thanks for doing this. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6559) Travis timing out

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6559:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Travis timing out
> -
>
> Key: DRILL-6559
> URL: https://issues.apache.org/jira/browse/DRILL-6559
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Minor
> Fix For: 1.15.0
>
>
> There was a decision to exclude some more TPCH unit tests from the Travis CI 
> build by adding TestTpchSingleMode and, optionally, TestTpchLimit0 to the 
> exclusion list.
> Excluding other unit tests is under discussion.
> [https://lists.apache.org/thread.html/35f41f16d1679029e6089e447e2a85534243e9d5904116114c5168e8@%3Cdev.drill.apache.org%3E]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531652#comment-16531652
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

Agirish commented on issue #1348: DRILL-6346: Create an Official Drill Docker 
Container
URL: https://github.com/apache/drill/pull/1348#issuecomment-402218839
 
 
   @cgivre, sure. Rough ETA on the drill.apache.org/docs documentation is 
mid-July. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531649#comment-16531649
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

vrozov commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402218679
 
 
   @sachouche If you prefer to throw `DrillRuntimeException`, make 
`checkInterrupted` a static method of `DrillRuntimeException`, but I would 
prefer to avoid throwing `DrillRuntimeException` in the iterator as the 
iterator does not know when to throw this exception. Instead, throw 
`InterruptedException` when the iterator is used in a loop.
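   
   For reference, a minimal sketch of such a static helper; the exact wrapping 
and placement are assumptions, not the actual Drill implementation:
   
   ```
   public static void checkInterrupted() {
     if (Thread.interrupted()) {
       // Restore the flag so outer layers can still observe the interrupt.
       Thread.currentThread().interrupt();
       throw new DrillRuntimeException(new InterruptedException());
     }
   }
   ```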


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531637#comment-16531637
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

cgivre commented on issue #1348: DRILL-6346: Create an Official Drill Docker 
Container
URL: https://github.com/apache/drill/pull/1348#issuecomment-402216070
 
 
   Thanks @Agirish for doing this! Paul Rogers 
and I are almost done with an O'Reilly book about Drill and we'd love to include 
something about this in the book. If the docs are done soon, could you please 
send them our way?
   Thanks,
   — C
   
   > On Jul 3, 2018, at 12:23, Abhishek Girish wrote:
   > 
   > Thanks all, for the review!
   > 
   > I'll work with @bbevens to document usage of 
Docker. I'll follow-up on the Docker official images in a few days. Regarding 
dev docs in this repo, I have a draft here: 
https://github.com/Agirish/drill/blob/docker_doc/docs/dev/Docker.md. If it 
looks good, I can add it to the same PR. If not, I can create a new PR on a 
later date.
   > 
   > —
   > You are receiving this because you are subscribed to this thread.
   > Reply to this email directly, view it on GitHub, or mute the thread.
   > 
   
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6346) Create an Official Drill Docker Container

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531633#comment-16531633
 ] 

ASF GitHub Bot commented on DRILL-6346:
---

Agirish commented on issue #1348: DRILL-6346: Create an Official Drill Docker 
Container
URL: https://github.com/apache/drill/pull/1348#issuecomment-402215143
 
 
   Thanks all, for the review!
   
   I'll work with @bbevens to document usage of Docker. I'll follow-up on the 
Docker official images in a few days. Regarding dev docs in this repo, I have a 
draft here: 
https://github.com/Agirish/drill/blob/docker_doc/docs/dev/Docker.md. If it 
looks good, I can add it to the same PR. If not, I can create a new PR on a 
later date. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Create an Official Drill Docker Container
> -
>
> Key: DRILL-6346
> URL: https://issues.apache.org/jira/browse/DRILL-6346
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531622#comment-16531622
 ] 

ASF GitHub Bot commented on DRILL-5796:
---

vrozov commented on a change in pull request #1298: DRILL-5796: Filter pruning 
for multi rowgroup parquet file
URL: https://github.com/apache/drill/pull/1298#discussion_r199868514
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/stat/ParquetIsPredicate.java
 ##
 @@ -62,90 +60,89 @@ private ParquetIsPredicate(LogicalExpression expr, 
BiPredicate, Ra
 return visitor.visitUnknown(this, value);
   }
 
-  @Override
-  public boolean canDrop(RangeExprEvaluator evaluator) {
+  /**
+   * Apply the filter condition against the meta of the rowgroup.
+   */
+  public RowsMatch matches(RangeExprEvaluator evaluator) {
 Statistics exprStat = expr.accept(evaluator, null);
-if (isNullOrEmpty(exprStat)) {
-  return false;
-}
+return ParquetPredicatesHelper.isNullOrEmpty(exprStat) ? RowsMatch.SOME : 
predicate.apply(exprStat, evaluator);
+  }
 
-return predicate.test(exprStat, evaluator);
+  /**
+   * After applying the filter against the statistics of the rowgroup, if the 
result is RowsMatch.ALL,
+   * then we still must know whether the rowgroup contains some null values, 
because they can change the filter result.
+   * If it contains some null values, then we change the RowsMatch.ALL into 
RowsMatch.SOME, which says that maybe
+   * some values (the null ones) should be discarded.
+   */
+  static RowsMatch checkNull(Statistics exprStat) {
+return exprStat.getNumNulls() > 0 ? RowsMatch.SOME : RowsMatch.ALL;
   }
 
   /**
* IS NULL predicate.
*/
   private static <C extends Comparable<C>> LogicalExpression 
createIsNullPredicate(LogicalExpression expr) {
 return new ParquetIsPredicate(expr,
-//if there are no nulls  -> canDrop
-(exprStat, evaluator) -> hasNoNulls(exprStat)) {
-  private final boolean isArray = isArray(expr);
-
-  private boolean isArray(LogicalExpression expression) {
-if (expression instanceof TypedFieldExpr) {
-  TypedFieldExpr typedFieldExpr = (TypedFieldExpr) expression;
-  SchemaPath schemaPath = typedFieldExpr.getPath();
-  return schemaPath.isArray();
-}
-return false;
-  }
-
-  @Override
-  public boolean canDrop(RangeExprEvaluator evaluator) {
+  (exprStat, evaluator) -> {
 // for arrays we are not able to define exact number of nulls
 // [1,2,3] vs [1,2] -> in second case 3 is absent and thus it's null 
but statistics shows no nulls
-return !isArray && super.canDrop(evaluator);
-  }
-};
+TypedFieldExpr typedFieldExpr = (TypedFieldExpr) expr;
+if (typedFieldExpr.getPath().isArray()) {
+  return RowsMatch.SOME;
+}
+if (hasNoNulls(exprStat)) {
+  return RowsMatch.NONE;
+}
+return isAllNulls(exprStat, evaluator.getRowCount()) ? RowsMatch.ALL : 
RowsMatch.SOME;
+  });
   }
 
   /**
* IS NOT NULL predicate.
*/
   private static <C extends Comparable<C>> LogicalExpression 
createIsNotNullPredicate(LogicalExpression expr) {
 return new ParquetIsPredicate(expr,
-//if there are all nulls  -> canDrop
-(exprStat, evaluator) -> isAllNulls(exprStat, evaluator.getRowCount())
+  (exprStat, evaluator) -> isAllNulls(exprStat, evaluator.getRowCount()) ? 
RowsMatch.NONE : checkNull(exprStat)
 );
   }
 
   /**
* IS TRUE predicate.
*/
-  private static LogicalExpression createIsTruePredicate(LogicalExpression 
expr) {
-return new ParquetIsPredicate(expr,
-//if max value is not true or if there are all nulls  -> canDrop
-(exprStat, evaluator) -> !((BooleanStatistics)exprStat).getMax() || 
isAllNulls(exprStat, evaluator.getRowCount())
-);
+  private static <C extends Comparable<C>> LogicalExpression 
createIsTruePredicate(LogicalExpression expr) {
+return new ParquetIsPredicate(expr,
+  (exprStat, evaluator) -> {
+if (isAllNulls(exprStat, evaluator.getRowCount()) || 
(exprStat.genericGetMin().equals(Boolean.FALSE) && 
exprStat.genericGetMax().equals(Boolean.FALSE))) {
+  return RowsMatch.NONE;
+}
+return exprStat.genericGetMin().equals(Boolean.TRUE) && 
exprStat.genericGetMax().equals(Boolean.TRUE) ? checkNull(exprStat) : 
RowsMatch.SOME;
+  });
   }
 
   /**
* IS FALSE predicate.
*/
-  private static LogicalExpression createIsFalsePredicate(LogicalExpression 
expr) {
-return new ParquetIsPredicate(expr,
-//if min value is not false or if there are all nulls  -> canDrop
-(exprStat, evaluator) -> ((BooleanStatistics)exprStat).getMin() || 
isAllNulls(exprStat, evaluator.getRowCount())
+  private static <C extends Comparable<C>> LogicalExpression 
createIsFalsePredicate(LogicalExpression expr) {
+return new ParquetIsPredicate(expr,
+  (exprStat,

[jira] [Commented] (DRILL-5796) Filter pruning for multi rowgroup parquet file

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531604#comment-16531604
 ] 

ASF GitHub Bot commented on DRILL-5796:
---

vrozov commented on issue #1298: DRILL-5796: Filter pruning for multi rowgroup 
parquet file
URL: https://github.com/apache/drill/pull/1298#issuecomment-402209476
 
 
   @jbimbert Please rebase your branch properly for the review.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Filter pruning for multi rowgroup parquet file
> --
>
> Key: DRILL-5796
> URL: https://issues.apache.org/jira/browse/DRILL-5796
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: Damien Profeta
>Assignee: Jean-Blas IMBERT
>Priority: Major
> Fix For: 1.14.0
>
>
> Today, filter pruning uses the file name as the partitioning key. This means 
> you can remove a partition only if the whole file belongs to the same partition. 
> With Parquet, you can prune the filter if the rowgroup makes up a partition of 
> your dataset, with the rowgroup, not the file, as the unit of work.
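A minimal sketch of the idea, with a hypothetical RowGroupStats type standing in for Parquet rowgroup metadata; the point is that the rowgroup, not the file, becomes the pruning unit:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RowGroupPruning {

  // hypothetical stand-in for per-rowgroup Parquet statistics
  static class RowGroupStats {
    final long min;
    final long max;
    RowGroupStats(long min, long max) { this.min = min; this.max = max; }
  }

  // Keep only the rowgroups whose [min, max] range can satisfy "col = value".
  static List<Integer> candidateRowGroups(List<RowGroupStats> stats, long value) {
    List<Integer> keep = new ArrayList<>();
    for (int i = 0; i < stats.size(); i++) {
      RowGroupStats s = stats.get(i);
      if (value >= s.min && value <= s.max) {
        keep.add(i);
      }
    }
    return keep;
  }

  public static void main(String[] args) {
    List<RowGroupStats> stats = Arrays.asList(
        new RowGroupStats(1, 10), new RowGroupStats(11, 20), new RowGroupStats(5, 15));
    System.out.println(candidateRowGroups(stats, 12)); // [1, 2]
  }
}
{code}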



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531601#comment-16531601
 ] 

ASF GitHub Bot commented on DRILL-6578:
---

sachouche commented on issue #1360: DRILL-6578: Handle query cancellation in 
Parquet reader
URL: https://github.com/apache/drill/pull/1360#issuecomment-402208750
 
 
   @vrozov and @Ben-Zvi  can you please review this PR?
   
   Thanks!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls
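A minimal sketch of the kind of guard this implies; BatchSource is a hypothetical stand-in for the reader's iterator, not Drill's API. Drill interrupts fragment threads on cancellation, so checking the interrupt flag on every iteration bounds how long a buggy loop can keep running:

{code:java}
import java.util.Iterator;

public class CancellableReadLoop {

  // hypothetical stand-in for an iterator-style column/batch reader
  interface BatchSource extends Iterator<int[]> { }

  static long readAll(BatchSource source) {
    long rows = 0;
    while (source.hasNext()) {
      if (Thread.currentThread().isInterrupted()) {
        break; // stop promptly instead of spinning until the query times out
      }
      rows += source.next().length;
    }
    return rows;
  }
}
{code}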



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531543#comment-16531543
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199844308
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file being performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+@Override
+void action(URL url) {
+  // nothing to do
+}
+  },
+  RENAME {
+@Override
+void action(URL url) {
+  File pluginsOverrideFile = new File(url.getPath());
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531544#comment-16531544
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199844868
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.IOException;
+import java.net.URL;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static 
org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and service of 
creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence lpPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+this.context = context;
+this.lpPersistence = new LogicalPlanPersistence(context.getConfig(), 
context.getClasspathScan(),
+new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+  @Nullable StoragePlugins bootstrapPlugins) {
+// if bootstrapPlugins is not null -- fresh Drill set up
+StoragePlugins pluginsForPersistentStore;
+
+StoragePlugins newPlugins = getNewStoragePlugins();
+
+if (newPlugins != null) {
+  pluginsForPersistentStore = new StoragePlugins(new HashMap<>());
+  Optional.ofNullable(bootstrapPlugins)
+  .ifPresent(pluginsForPersistentStore::putAll);
+
+  for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+String pluginName = newPlugin.getKey();
+StoragePluginConfig oldPluginConfig = 
Optional.ofNullable(bootstrapPlugins)
+.map(plugins -> plugins.getConfig(pluginName))
+.orElse(persistentStore.get(pluginName));
+StoragePluginConfig updatedStatusPluginConfig = 
updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+pluginsForPersistentStore.put(pluginName, updatedStatusPluginConfig);
+  }
+} else {
+  pluginsForPersistentStore = bootstrapPlugins;
+}
+
+// load pluginsForPersistentStore to Persistent Store
+Optional.ofNullable(pluginsForPersistentStore)
+.ifPresent(plugins -> plugins.forEach(plugin -> 
persistentStore.put(plugin.getKey(), plugin.getValue(;
+
+if (newPlugins != null) {
+  String fileAction = 
context.getConfig().getString(ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE).toUpperCase();
+  Optional<ActionOnFile> actionOnFile = Arrays.stream(ActionOnFile.values())
+  .filter(action -> action.name().equals(fileAction))

[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531542#comment-16531542
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199844229
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531546#comment-16531546
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199844168
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file being performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+@Override
+void action(URL url) {
+  // nothing to do
+}
+  },
+  RENAME {
+@Override
+void action(URL url) {
+  File pluginsOverrideFile = new File(url.getPath());
+  String oldName = CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF;
+  String currentDateTime = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
+  String newFileName = new StringBuilder(oldName)
+  .insert(oldName.lastIndexOf("."), "-" + currentDateTime)
+  .toString();
+  Path pluginsOverrideFilePath  = pluginsOverrideFile.toPath();
+  try {
+Files.move(pluginsOverrideFilePath, 
pluginsOverrideFilePath.resolveSibling(newFileName));
+  } catch (IOException e) {
+logger.error("%s file is not renamed after it's use", 
CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);
+  }
+}
+  },
+  REMOVE {
+@Override
+void action(URL url) {
+  File pluginsOverrideFile = new File(url.getPath());
+  try {
+Files.delete(pluginsOverrideFile.toPath());
+  } catch (IOException e) {
+logger.error("%s file is not deleted after it's use", 
CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);
+  }
+}
+  };
+
+  private static final org.slf4j.Logger logger =  
org.slf4j.LoggerFactory.getLogger(ActionOnFile.class);
+
+  /**
+   * This is an action which should be performed on the
+   * {@link 
org.apache.drill.common.config.CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} 
after successful updating of
+   * storage plugins configs with Storage Plugins Handler.
+   * 
+   *   {@link #NONE}: no action will be performed
 
 Review comment:
   Done
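For reference, the RENAME logic quoted above reduces to this standalone filename transformation; the file name below is illustrative:

{code:java}
import java.text.SimpleDateFormat;
import java.util.Date;

public class TimestampedRename {

  // storage-plugins-override.conf -> storage-plugins-override-20180703121530.conf
  static String timestamped(String fileName) {
    String stamp = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
    return new StringBuilder(fileName)
        .insert(fileName.lastIndexOf('.'), "-" + stamp)
        .toString();
  }

  public static void main(String[] args) {
    System.out.println(timestamped("storage-plugins-override.conf"));
  }
}
{code}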


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531547#comment-16531547
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r19983
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file being performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+@Override
+void action(URL url) {
+  // nothing to do
+}
+  },
+  RENAME {
+@Override
+void action(URL url) {
+  File pluginsOverrideFile = new File(url.getPath());
+  String oldName = CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF;
+  String currentDateTime = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
+  String newFileName = new StringBuilder(oldName)
+  .insert(oldName.lastIndexOf("."), "-" + currentDateTime)
+  .toString();
+  Path pluginsOverrideFilePath  = pluginsOverrideFile.toPath();
+  try {
+Files.move(pluginsOverrideFilePath, 
pluginsOverrideFilePath.resolveSibling(newFileName));
+  } catch (IOException e) {
+logger.error("%s file is not renamed after it's use", 
CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531545#comment-16531545
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199844395
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file being performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+@Override
+void action(URL url) {
+  // nothing to do
+}
+  },
+  RENAME {
+@Override
+void action(URL url) {
+  File pluginsOverrideFile = new File(url.getPath());
+  String oldName = CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF;
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531508#comment-16531508
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi commented on issue #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402185262
 
 
   Unit test failures after the rebase were caused by another bug. It appears when 
a query has a single expression in the project list and this expression is taken 
from the unnest. In this case, the name of this field in the unnest rel node is 
lost due to the change made in DRILL-6545.
   
   I have fixed this bug and added one more unit test for this case.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6546) Allow unnest function with nested columns and complex expressions

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531510#comment-16531510
 ] 

ASF GitHub Bot commented on DRILL-6546:
---

vvysotskyi commented on issue #1346: DRILL-6546: Allow unnest function with 
nested columns and complex expressions
URL: https://github.com/apache/drill/pull/1346#issuecomment-402185457
 
 
   @amansinha100, could you please do the final review?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Allow unnest function with nested columns and complex expressions
> -
>
> Key: DRILL-6546
> URL: https://issues.apache.org/jira/browse/DRILL-6546
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> Currently, queries with unnest and nested columns or complex expressions 
> inside fail:
> {code:sql}
> select u.item from cp.`lateraljoin/nested-customer.parquet` c,
> unnest(c.orders.items) as u(item)
> {code}
> fails with error:
> {noformat}
> VALIDATION ERROR: From line 2, column 10 to line 2, column 21: Column 
> 'orders.items' not found in table 'c'
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531479#comment-16531479
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199828971
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.IOException;
+import java.net.URL;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static 
org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and service of 
creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence lpPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+this.context = context;
+this.lpPersistence = new LogicalPlanPersistence(context.getConfig(), 
context.getClasspathScan(),
+new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+  @Nullable StoragePlugins bootstrapPlugins) {
+// if bootstrapPlugins is not null -- fresh Drill set up
+StoragePlugins pluginsForPersistentStore;
+
+StoragePlugins newPlugins = getNewStoragePlugins();
+
+if (newPlugins != null) {
+  pluginsForPersistentStore = new StoragePlugins(new HashMap<>());
+  Optional.ofNullable(bootstrapPlugins)
+  .ifPresent(pluginsForPersistentStore::putAll);
+
+  for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+String pluginName = newPlugin.getKey();
+StoragePluginConfig oldPluginConfig = 
Optional.ofNullable(bootstrapPlugins)
+.map(plugins -> plugins.getConfig(pluginName))
+.orElse(persistentStore.get(pluginName));
+StoragePluginConfig updatedStatusPluginConfig = 
updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+pluginsForPersistentStore.put(pluginName, updatedStatusPluginConfig);
+  }
+} else {
+  pluginsForPersistentStore = bootstrapPlugins;
+}
+
+// load pluginsForPersistentStore to Persistent Store
+Optional.ofNullable(pluginsForPersistentStore)
+.ifPresent(plugins -> plugins.forEach(plugin -> 
persistentStore.put(plugin.getKey(), plugin.getValue(;
+
+if (newPlugins != null) {
+  String fileAction = 
context.getConfig().getString(ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE).toUpperCase();
+  Optional<ActionOnFile> actionOnFile = Arrays.stream(ActionOnFile.values())
+  .filter(action -> action.name().equals(fileAction))

[jira] [Updated] (DRILL-6579) Add sanity checks to Parquet Reader

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6579:
-
Fix Version/s: 1.14.0

> Add sanity checks to Parquet Reader 
> 
>
> Key: DRILL-6579
> URL: https://issues.apache.org/jira/browse/DRILL-6579
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> Add sanity checks to the Parquet reader to avoid infinite loops.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6578) Ensure the Flat Parquet Reader can handle query cancellation

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6578:
-
Fix Version/s: 1.14.0

> Ensure the Flat Parquet Reader can handle query cancellation
> 
>
> Key: DRILL-6578
> URL: https://issues.apache.org/jira/browse/DRILL-6578
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.14.0
>
>
> * The optimized Parquet reader uses an iterator style to load column data 
>  * We need to ensure the code can properly handle query cancellation even in 
> the presence of bugs within the hasNext() .. next() calls



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6567) Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: java.lang.reflect.UndeclaredThrowableException.

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6567:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Jenkins Regression: TPCDS query 93 fails with INTERNAL_ERROR ERROR: 
> java.lang.reflect.UndeclaredThrowableException.
> ---
>
> Key: DRILL-6567
> URL: https://issues.apache.org/jira/browse/DRILL-6567
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Assignee: Khurram Faraaz
>Priority: Critical
> Fix For: 1.15.0
>
>
> This is TPCDS Query 93.
> Query: 
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query93.sql
> SELECT ss_customer_sk,
> Sum(act_sales) sumsales
> FROM   (SELECT ss_item_sk,
> ss_ticket_number,
> ss_customer_sk,
> CASE
> WHEN sr_return_quantity IS NOT NULL THEN
> ( ss_quantity - sr_return_quantity ) * ss_sales_price
> ELSE ( ss_quantity * ss_sales_price )
> END act_sales
> FROM   store_sales
> LEFT OUTER JOIN store_returns
> ON ( sr_item_sk = ss_item_sk
> AND sr_ticket_number = ss_ticket_number ),
> reason
> WHERE  sr_reason_sk = r_reason_sk
> AND r_reason_desc = 'reason 38') t
> GROUP  BY ss_customer_sk
> ORDER  BY sumsales,
> ss_customer_sk
> LIMIT 100;
> Here is the stack trace:
> 2018-06-29 07:00:32 INFO  DrillTestLogger:348 - 
> Exception:
> java.sql.SQLException: INTERNAL_ERROR ERROR: 
> java.lang.reflect.UndeclaredThrowableException
> Setup failed for null
> Fragment 4:56
> [Error Id: 3c72c14d-9362-4a9b-affb-5cf937bed89e on atsqa6c82.qa.lab:31010]
>   (org.apache.drill.common.exceptions.ExecutionSetupException) 
> java.lang.reflect.UndeclaredThrowableException
> 
> org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30
> org.apache.drill.exec.store.hive.readers.HiveAbstractReader.setup():327
> org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():245
> org.apache.drill.exec.physical.impl.ScanBatch.next():164
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():147
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema():118
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext():152
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():281
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.ut

[jira] [Updated] (DRILL-6569) Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not read value at 2 in block 0 in file maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6569:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Jenkins Regression: TPCDS query 19 fails with INTERNAL_ERROR ERROR: Can not 
> read value at 2 in block 0 in file 
> maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet
> --
>
> Key: DRILL-6569
> URL: https://issues.apache.org/jira/browse/DRILL-6569
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Assignee: Robert Hou
>Priority: Critical
> Fix For: 1.15.0
>
>
> This is TPCDS Query 19.
> I am able to scan the parquet file using:
>select * from 
> dfs.`/drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet`
> and I get 3,349,279 rows selected.
> Query: 
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/hive/parquet/query19.sql
> SELECT i_brand_id  brand_id,
> i_brand brand,
> i_manufact_id,
> i_manufact,
> Sum(ss_ext_sales_price) ext_price
> FROM   date_dim,
> store_sales,
> item,
> customer,
> customer_address,
> store
> WHERE  d_date_sk = ss_sold_date_sk
> AND ss_item_sk = i_item_sk
> AND i_manager_id = 38
> AND d_moy = 12
> AND d_year = 1998
> AND ss_customer_sk = c_customer_sk
> AND c_current_addr_sk = ca_address_sk
> AND Substr(ca_zip, 1, 5) <> Substr(s_zip, 1, 5)
> AND ss_store_sk = s_store_sk
> GROUP  BY i_brand,
> i_brand_id,
> i_manufact_id,
> i_manufact
> ORDER  BY ext_price DESC,
> i_brand,
> i_brand_id,
> i_manufact_id,
> i_manufact
> LIMIT 100;
> Here is the stack trace:
> 2018-06-29 07:00:32 INFO  DrillTestLogger:348 - 
> Exception:
> java.sql.SQLException: INTERNAL_ERROR ERROR: Can not read value at 2 in block 
> 0 in file 
> maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet
> Fragment 4:26
> [Error Id: 6401a71e-7a5d-4a10-a17c-16873fc3239b on atsqa6c88.qa.lab:31010]
>   (hive.org.apache.parquet.io.ParquetDecodingException) Can not read value at 
> 2 in block 0 in file 
> maprfs:///drill/testdata/tpcds_sf100/parquet/store_sales/1_13_1.parquet
> 
> hive.org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue():243
> hive.org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue():227
> 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next():199
> 
> org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.next():57
> 
> org.apache.drill.exec.store.hive.readers.HiveAbstractReader.hasNextValue():417
> org.apache.drill.exec.store.hive.readers.HiveParquetReader.next():54
> org.apache.drill.exec.physical.impl.ScanBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.sniffNonEmptyBatch():276
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides():238
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema():218
> org.apache.drill.exec.record.AbstractRecordBatch.next():152
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordB

[jira] [Updated] (DRILL-6517) IllegalStateException: Record count not set for this vector container

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6517:
-
Fix Version/s: 1.14.0

> IllegalStateException: Record count not set for this vector container
> -
>
> Key: DRILL-6517
> URL: https://issues.apache.org/jira/browse/DRILL-6517
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Critical
> Fix For: 1.14.0
>
> Attachments: 24d7b377-7589-7928-f34f-57d02061acef.sys.drill
>
>
> TPC-DS query is Canceled after 2 hrs and 47 mins and we see an 
> IllegalStateException: Record count not set for this vector container, in 
> drillbit.log
> Steps to reproduce the problem, query profile 
> (24d7b377-7589-7928-f34f-57d02061acef) is attached here.
> {noformat}
> In drill-env.sh set max direct memory to 12G on all 4 nodes in cluster
> export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}
> and set these options from sqlline,
> alter system set `planner.memory.max_query_memory_per_node` = 10737418240;
> alter system set `drill.exec.hashagg.fallback.enabled` = true;
> To run the query (replace IP-ADDRESS with your foreman node's IP address)
> cd /opt/mapr/drill/drill-1.14.0/bin
> ./sqlline -u 
> "jdbc:drill:schema=dfs.tpcds_sf1_parquet_views;drillbit=" -f 
> /root/query72.sql
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2018-06-18 20:08:51,912 [24d7b377-7589-7928-f34f-57d02061acef:frag:4:49] 
> ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
> IllegalStateException: Record count not set for this vector container
> Fragment 4:49
> [Error Id: 73177a1c-f7aa-4c9e-99e1-d6e1280e3f27 on qa102-45.qa.lab:31010]
>  at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327)
>  [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>  at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.IllegalStateException: Record count not set for this 
> vector container
>  at com.google.common.base.Preconditions.checkState(Preconditions.java:173) 
> ~[guava-18.0.jar:na]
>  at 
> org.apache.drill.exec.record.VectorContainer.getRecordCount(VectorContainer.java:394)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.getRecordCount(RemovingRecordBatch.java:49)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.(RecordBatchSizer.java:690)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.RecordBatchSizer.(RecordBatchSizer.java:662)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:73)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.JoinBatchMemoryManager.update(JoinBatchMemoryManager.java:79)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.prefetchFirstBatchFromBothSides(HashJoinBatch.java:242)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.buildSchema(HashJoinBatch.java:218)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:152)
>  ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
>  at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
>  ~[drill-java-exec-1.14.0-SNAPSHOT

[jira] [Updated] (DRILL-6473) Upgrade Drill 1.14 with Hive 2.3 for mapr profile

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6473:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Upgrade Drill 1.14 with Hive 2.3 for mapr profile
> -
>
> Key: DRILL-6473
> URL: https://issues.apache.org/jira/browse/DRILL-6473
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6422) Update guava to 23.0 and shade it

2018-07-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6422:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Update guava to 23.0 and shade it
> -
>
> Key: DRILL-6422
> URL: https://issues.apache.org/jira/browse/DRILL-6422
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Some Hadoop libraries use old versions of Guava, and most of them are 
> incompatible with Guava 23.0.
> To allow usage of the new Guava version, it should be shaded, and the shaded 
> version should be used in the project.
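A small diagnostic sketch of the classpath problem being solved, assuming Guava is on the classpath: print which jar a Guava class was actually loaded from. Without shading, whichever Guava jar Hadoop drags in may win; with shading, Drill's relocated copy lives under a Drill-owned package and cannot clash.

{code:java}
public class GuavaConflictCheck {
  public static void main(String[] args) {
    // prints the jar that supplied Preconditions at runtime
    System.out.println(com.google.common.base.Preconditions.class
        .getProtectionDomain().getCodeSource().getLocation());
  }
}
{code}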



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6553) Fix TopN for unnest operator

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531440#comment-16531440
 ] 

ASF GitHub Bot commented on DRILL-6553:
---

priteshm commented on issue #1353: DRILL-6553: Fix TopN for unnest operator
URL: https://github.com/apache/drill/pull/1353#issuecomment-402165472
 
 
   @HanumathRao can you review this?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix TopN for unnest operator
> 
>
> Key: DRILL-6553
> URL: https://issues.apache.org/jira/browse/DRILL-6553
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.14.0
>
>
> The plan for the query with unnest is chosen non-optimally:
> {code:sql}
> select customer.c_custkey, customer.c_name, t.o.o_orderkey,t.o.o_totalprice
> from dfs.`lateraljoin/multipleFiles` customer,
> unnest(customer.c_orders) t(o)
> order by customer.c_custkey, t.o.o_orderkey, t.o.o_totalprice
> limit 50
> {code}
> Plan:
> {noformat}
> 00-00Screen
> 00-01  ProjectAllowDup(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], 
> EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  Sort(sort0=[$0], sort1=[$2], sort2=[$3], dir0=[ASC], 
> dir1=[ASC], dir2=[ASC])
> 00-06Project(c_custkey=[$2], c_name=[$3], EXPR$2=[ITEM($4, 
> 'o_orderkey')], EXPR$3=[ITEM($4, 'o_totalprice')])
> 00-07  LateralJoin(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{1}])
> 00-09Project(T0¦¦**=[$0], c_orders=[$1], c_custkey=[$2], 
> c_name=[$3])
> 00-11  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`**`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> 00-08Project(c_orders0=[$0])
> 00-10  Unnest [srcOp=00-07] 
> {noformat}
> A similar query, but with flatten:
> {code:sql}
> select f.c_custkey, f.c_name, f.o.o_orderkey, f.o.o_totalprice from (select 
> c_custkey, c_name, flatten(c_orders) as o from 
> dfs.`lateraljoin/multipleFiles` customer) f order by f.c_custkey, 
> f.o.o_orderkey, f.o.o_totalprice limit 50
> {code}
> has plan:
> {noformat}
> 00-00Screen
> 00-01  Project(c_custkey=[$0], c_name=[$1], EXPR$2=[$2], EXPR$3=[$3])
> 00-02SelectionVectorRemover
> 00-03  Limit(fetch=[50])
> 00-04SelectionVectorRemover
> 00-05  TopN(limit=[50])
> 00-06Project(c_custkey=[$0], c_name=[$1], EXPR$2=[ITEM($2, 
> 'o_orderkey')], EXPR$3=[ITEM($2, 'o_totalprice')])
> 00-07  Flatten(flattenField=[$2])
> 00-08Project(c_custkey=[$0], c_name=[$1], o=[$2])
> 00-09  Scan(groupscan=[EasyGroupScan 
> [selectionRoot=file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles,
>  numFiles=2, columns=[`c_custkey`, `c_name`, `c_orders`], 
> files=[file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_2.json,
>  
> file:/home/mapr/drill/exec/java-exec/target/org.apache.drill.exec.physical.impl.lateraljoin.TestE2EUnnestAndLateral/root/lateraljoin/multipleFiles/cust_order_10_1.json]]])
> {noformat}
> The main difference is that in the unnest case, a project wasn't pushed 
> to the scan, and Limit with Sort weren't converted to TopN. 
> The first problem is tracked by DRILL-6545; this Jira aims to fix the 
> problem with TopN.
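Conceptually, TopN keeps only a bounded heap of the best N rows instead of fully sorting the input: O(n log N) time and O(N) memory, versus O(n log n) time and O(n) memory for Sort followed by Limit. A minimal sketch, with plain integers standing in for Drill's record batches:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class TopNSketch {

  static List<Integer> topN(Iterable<Integer> input, int n) {
    PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap of the N largest seen
    for (int v : input) {
      heap.offer(v);
      if (heap.size() > n) {
        heap.poll(); // evict the smallest of the kept candidates
      }
    }
    List<Integer> result = new ArrayList<>(heap);
    result.sort(Collections.reverseOrder());
    return result;
  }

  public static void main(String[] args) {
    System.out.println(topN(Arrays.asList(5, 1, 9, 3, 7, 8), 3)); // [9, 8, 7]
  }
}
{code}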



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531370#comment-16531370
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199800857
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
 
 Review comment:
   Please move class to `org.apache.drill.exec.util`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531367#comment-16531367
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199800215
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file being performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+@Override
+void action(URL url) {
+  // nothing to do
+}
+  },
+  RENAME {
+@Override
+void action(URL url) {
+  File pluginsOverrideFile = new File(url.getPath());
+  String oldName = CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF;
 
 Review comment:
   No need to use the constant; the file name can be taken from the URL, and thus 
`ActionOnFile` can be used for other files, not only for the storage plugins 
override conf.
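A minimal sketch of that suggestion, deriving the name from the URL itself; the path below is illustrative:

{code:java}
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public class FileNameFromUrl {

  // the enum no longer needs to know which file it is acting on
  static String fileName(URL url) {
    return new File(url.getPath()).getName();
  }

  public static void main(String[] args) throws MalformedURLException {
    URL url = new URL("file:///opt/drill/conf/storage-plugins-override.conf");
    System.out.println(fileName(url)); // storage-plugins-override.conf
  }
}
{code}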


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> The new service of updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531369#comment-16531369
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199798981
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.IOException;
+import java.net.URL;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static 
org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and service of 
creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence lpPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+this.context = context;
+this.lpPersistence = new LogicalPlanPersistence(context.getConfig(), 
context.getClasspathScan(),
+new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+  @Nullable StoragePlugins bootstrapPlugins) {
+// if bootstrapPlugins is not null -- fresh Drill set up
+StoragePlugins pluginsForPersistentStore;
+
+StoragePlugins newPlugins = getNewStoragePlugins();
+
+if (newPlugins != null) {
+  pluginsForPersistentStore = new StoragePlugins(new HashMap<>());
+  Optional.ofNullable(bootstrapPlugins)
+  .ifPresent(pluginsForPersistentStore::putAll);
+
+  for (Map.Entry newPlugin : newPlugins) {
+String pluginName = newPlugin.getKey();
+StoragePluginConfig oldPluginConfig = 
Optional.ofNullable(bootstrapPlugins)
+.map(plugins -> plugins.getConfig(pluginName))
+.orElse(persistentStore.get(pluginName));
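+        // prefer the bootstrap config when present; otherwise fall back to the config already in the persistent store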
+        StoragePluginConfig updatedStatusPluginConfig = updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+        pluginsForPersistentStore.put(pluginName, updatedStatusPluginConfig);
+      }
+    } else {
+      pluginsForPersistentStore = bootstrapPlugins;
+    }
+
+    // load pluginsForPersistentStore to the Persistent Store
+    Optional.ofNullable(pluginsForPersistentStore)
+        .ifPresent(plugins -> plugins.forEach(plugin -> persistentStore.put(plugin.getKey(), plugin.getValue())));
+
+    if (newPlugins != null) {
+      String fileAction = context.getConfig().getString(ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE).toUpperCase();
+      Optional<ActionOnFile> actionOnFile = Arrays.stream(ActionOnFile.values())
+          .filter(action -> action.name().equals(fileA
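
For context (an editor's illustration, not part of the PR): the override file read here is plain HOCON. A minimal storage-plugins-override.conf enabling one plugin could look like the sketch below (the plugin name and connection are hypothetical):

"storage": {
  cp: {
    type: "file",
    connection: "classpath:///",
    enabled: true
  }
}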

[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531364#comment-16531364
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199801281
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file to be performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+    @Override
+    void action(URL url) {
+      // nothing to do
+    }
+  },
+  RENAME {
+    @Override
+    void action(URL url) {
+      File pluginsOverrideFile = new File(url.getPath());
+      String oldName = CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF;
+      String currentDateTime = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
+      String newFileName = new StringBuilder(oldName)
+          .insert(oldName.lastIndexOf("."), "-" + currentDateTime)
+          .toString();
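+      // e.g. "storage-plugins-override.conf" becomes "storage-plugins-override-20180703101530.conf" (illustrative timestamp)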
+      Path pluginsOverrideFilePath = pluginsOverrideFile.toPath();
+      try {
+        Files.move(pluginsOverrideFilePath, pluginsOverrideFilePath.resolveSibling(newFileName));
+      } catch (IOException e) {
+        logger.error("%s file is not renamed after its use", CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);
+      }
+    }
+  },
+  REMOVE {
+    @Override
+    void action(URL url) {
+      File pluginsOverrideFile = new File(url.getPath());
+      try {
+        Files.delete(pluginsOverrideFile.toPath());
+      } catch (IOException e) {
+        logger.error("%s file is not deleted after its use", CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);
+      }
+    }
+  };
+
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(ActionOnFile.class);
+
+  /**
+   * This is the action to be performed on the
+   * {@link org.apache.drill.common.config.CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file after successful updating
+   * of storage plugins configs with the Storage Plugins Handler.
+   * <ul>
+   *   <li>{@link #NONE}: no action will be performed</li>
 
 Review comment:
   Consider adding Javadoc above each enum element, rather than here.
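
   For illustration, a stripped-down sketch of that suggestion (the Javadoc wording is hypothetical; the enum bodies are omitted so the snippet stays compilable):

   public enum ActionOnFile {

     /** No action is performed on the file. */
     NONE,

     /** The file is renamed with a timestamp suffix after its configs are applied. */
     RENAME,

     /** The file is deleted after its configs are applied. */
     REMOVE
   }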


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> A new service for updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531368#comment-16531368
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199801644
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,150 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.IOException;
+import java.net.URL;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and a service for creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence lpPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+    this.context = context;
+    this.lpPersistence = new LogicalPlanPersistence(context.getConfig(), context.getClasspathScan(),
+        new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+      @Nullable StoragePlugins bootstrapPlugins) {
+    // if bootstrapPlugins is not null -- this is a fresh Drill setup
+    StoragePlugins pluginsForPersistentStore;
+
+    StoragePlugins newPlugins = getNewStoragePlugins();
+
+    if (newPlugins != null) {
+      pluginsForPersistentStore = new StoragePlugins(new HashMap<>());
+      Optional.ofNullable(bootstrapPlugins)
+          .ifPresent(pluginsForPersistentStore::putAll);
+
+      for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+        String pluginName = newPlugin.getKey();
+        StoragePluginConfig oldPluginConfig = Optional.ofNullable(bootstrapPlugins)
+            .map(plugins -> plugins.getConfig(pluginName))
+            .orElse(persistentStore.get(pluginName));
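+        // prefer the bootstrap config when present; otherwise fall back to the config already in the persistent store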
+        StoragePluginConfig updatedStatusPluginConfig = updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+        pluginsForPersistentStore.put(pluginName, updatedStatusPluginConfig);
+      }
+    } else {
+      pluginsForPersistentStore = bootstrapPlugins;
+    }
+
+    // load pluginsForPersistentStore to the Persistent Store
+    Optional.ofNullable(pluginsForPersistentStore)
+        .ifPresent(plugins -> plugins.forEach(plugin -> persistentStore.put(plugin.getKey(), plugin.getValue())));
+
+    if (newPlugins != null) {
+      String fileAction = context.getConfig().getString(ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE).toUpperCase();
+      Optional<ActionOnFile> actionOnFile = Arrays.stream(ActionOnFile.values())
+          .filter(action -> action.name().equals(fileA

[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531366#comment-16531366
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199801152
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file to be performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+    @Override
+    void action(URL url) {
+      // nothing to do
+    }
+  },
+  RENAME {
+    @Override
+    void action(URL url) {
+      File pluginsOverrideFile = new File(url.getPath());
+      String oldName = CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF;
+      String currentDateTime = new SimpleDateFormat("yyyyMMddHHmmss").format(new Date());
+      String newFileName = new StringBuilder(oldName)
+          .insert(oldName.lastIndexOf("."), "-" + currentDateTime)
+          .toString();
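+      // e.g. "storage-plugins-override.conf" becomes "storage-plugins-override-20180703101530.conf" (illustrative timestamp)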
+      Path pluginsOverrideFilePath = pluginsOverrideFile.toPath();
+      try {
+        Files.move(pluginsOverrideFilePath, pluginsOverrideFilePath.resolveSibling(newFileName));
+      } catch (IOException e) {
+        logger.error("%s file is not renamed after its use", CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);
 
 Review comment:
   Re-phrase, for example: `There was an error during file %s rename.` Here and 
below.
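
   A side note on the re-phrasing: slf4j interpolates `{}` placeholders, not `%s`, so the current format string never substitutes the file name. A variant that would interpolate, as a sketch using the suggested wording:

   logger.error("There was an error during the rename of file {}.", CommonConstants.STORAGE_PLUGINS_OVERRIDE_CONF, e);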


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> A new service for updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531365#comment-16531365
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

arina-ielchiieva commented on a change in pull request #1345: DRILL-6494: Drill 
Plugins Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199800616
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ActionOnFile.java
 ##
 @@ -0,0 +1,68 @@
+package org.apache.drill.exec.store;
+
+import org.apache.drill.common.config.CommonConstants;
+
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+
+/**
+ * The action on the {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} file to be performed after its use
+ */
+public enum ActionOnFile {
+
+  NONE {
+    @Override
+    void action(URL url) {
+      // nothing to do
+    }
+  },
+  RENAME {
+    @Override
+    void action(URL url) {
+      File pluginsOverrideFile = new File(url.getPath());
 
 Review comment:
   Make variable names generic, not related to plugins.
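
   For example, one possible renaming (a sketch; the names are hypothetical):

   File file = new File(url.getPath());
   Path filePath = file.toPath();
   Files.move(filePath, filePath.resolveSibling(newFileName));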


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill Plugins Handler
> -
>
> Key: DRILL-6494
> URL: https://issues.apache.org/jira/browse/DRILL-6494
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Tools, Build & Test
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.14.0
>
> Attachments: storage-plugins.conf
>
>
> A new service for updating Drill's plugins configs could be implemented.
> Please find details in the design overview document:
> https://docs.google.com/document/d/14JKb2TA8dGnOIE5YT2RImkJ7R0IAYSGjJg8xItL5yMI/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531327#comment-16531327
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199750865
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and a service for creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence hoconLogicalPlanPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+    this.context = context;
+    this.hoconLogicalPlanPersistence = new LogicalPlanPersistence(context.getConfig(), context.getClasspathScan(),
+        new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+      @Nullable StoragePlugins bootstrapPlugins) {
+    // if bootstrapPlugins is not null -- this is a fresh Drill setup
+    StoragePlugins pluginsToBeWrittenToPersistentStore;
+
+    StoragePlugins newPlugins = getNewStoragePlugins();
+
+    if (newPlugins != null) {
+      pluginsToBeWrittenToPersistentStore = new StoragePlugins(new HashMap<>());
+      Optional.ofNullable(bootstrapPlugins)
+          .ifPresent(pluginsToBeWrittenToPersistentStore::putAll);
+
+      for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+        String pluginName = newPlugin.getKey();
+        StoragePluginConfig oldPluginConfig = Optional.ofNullable(bootstrapPlugins)
+            .map(plugins -> plugins.getConfig(pluginName))
+            .orElse(persistentStore.get(pluginName));
+        StoragePluginConfig updatedStatusPluginConfig = updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+        pluginsToBeWrittenToPersistentStore.put(pluginName, updatedStatusPluginConfig);
+      }
+    } else {
+      pluginsToBeWrittenToPersistentStore = bootstrapPlugins;
+    }
+
+    // load pluginsToBeWrittenToPersistentStore to the Persistent Store
+    Optional.ofNullable(pluginsToBeWrittenToPersistentStore)
+        .ifPresent(plugins -> plugins.forEach(plugin -> persistentStore.put(plugin.getKey(), plugin.getValue())));
+
+    if (newPlugins != null) {
+      PluginsOverrideFileAction 

[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531330#comment-16531330
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199790967
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and a service for creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence hoconLogicalPlanPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+    this.context = context;
+    this.hoconLogicalPlanPersistence = new LogicalPlanPersistence(context.getConfig(), context.getClasspathScan(),
+        new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+      @Nullable StoragePlugins bootstrapPlugins) {
+    // if bootstrapPlugins is not null -- this is a fresh Drill setup
+    StoragePlugins pluginsToBeWrittenToPersistentStore;
+
+    StoragePlugins newPlugins = getNewStoragePlugins();
+
+    if (newPlugins != null) {
+      pluginsToBeWrittenToPersistentStore = new StoragePlugins(new HashMap<>());
+      Optional.ofNullable(bootstrapPlugins)
+          .ifPresent(pluginsToBeWrittenToPersistentStore::putAll);
+
+      for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+        String pluginName = newPlugin.getKey();
+        StoragePluginConfig oldPluginConfig = Optional.ofNullable(bootstrapPlugins)
+            .map(plugins -> plugins.getConfig(pluginName))
+            .orElse(persistentStore.get(pluginName));
+        StoragePluginConfig updatedStatusPluginConfig = updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+        pluginsToBeWrittenToPersistentStore.put(pluginName, updatedStatusPluginConfig);
+      }
+    } else {
+      pluginsToBeWrittenToPersistentStore = bootstrapPlugins;
+    }
+
+    // load pluginsToBeWrittenToPersistentStore to the Persistent Store
+    Optional.ofNullable(pluginsToBeWrittenToPersistentStore)
+        .ifPresent(plugins -> plugins.forEach(plugin -> persistentStore.put(plugin.getKey(), plugin.getValue())));
+
+    if (newPlugins != null) {
+      PluginsOverrideFileAction 

[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531329#comment-16531329
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199789010
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and a service for creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence hoconLogicalPlanPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+    this.context = context;
+    this.hoconLogicalPlanPersistence = new LogicalPlanPersistence(context.getConfig(), context.getClasspathScan(),
+        new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+      @Nullable StoragePlugins bootstrapPlugins) {
+    // if bootstrapPlugins is not null -- this is a fresh Drill setup
+    StoragePlugins pluginsToBeWrittenToPersistentStore;
+
+    StoragePlugins newPlugins = getNewStoragePlugins();
+
+    if (newPlugins != null) {
+      pluginsToBeWrittenToPersistentStore = new StoragePlugins(new HashMap<>());
+      Optional.ofNullable(bootstrapPlugins)
+          .ifPresent(pluginsToBeWrittenToPersistentStore::putAll);
+
+      for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+        String pluginName = newPlugin.getKey();
+        StoragePluginConfig oldPluginConfig = Optional.ofNullable(bootstrapPlugins)
+            .map(plugins -> plugins.getConfig(pluginName))
+            .orElse(persistentStore.get(pluginName));
+        StoragePluginConfig updatedStatusPluginConfig = updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+        pluginsToBeWrittenToPersistentStore.put(pluginName, updatedStatusPluginConfig);
+      }
+    } else {
+      pluginsToBeWrittenToPersistentStore = bootstrapPlugins;
+    }
+
+    // load pluginsToBeWrittenToPersistentStore to the Persistent Store
+    Optional.ofNullable(pluginsToBeWrittenToPersistentStore)
+        .ifPresent(plugins -> plugins.forEach(plugin -> persistentStore.put(plugin.getKey(), plugin.getValue())));
+
+    if (newPlugins != null) {
+      PluginsOverrideFileAction 

[jira] [Commented] (DRILL-6494) Drill Plugins Handler

2018-07-03 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531328#comment-16531328
 ] 

ASF GitHub Bot commented on DRILL-6494:
---

vdiravka commented on a change in pull request #1345: DRILL-6494: Drill Plugins 
Handler
URL: https://github.com/apache/drill/pull/1345#discussion_r199790185
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePluginsHandlerService.java
 ##
 @@ -0,0 +1,197 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.google.common.base.Charsets;
+import com.google.common.io.Resources;
+import com.jasonclawson.jackson.dataformat.hocon.HoconFactory;
+import org.apache.drill.common.config.CommonConstants;
+import org.apache.drill.common.config.LogicalPlanPersistence;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.scanner.ClassPathScanner;
+import org.apache.drill.exec.planner.logical.StoragePlugins;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStore;
+
+import javax.annotation.Nullable;
+import javax.validation.constraints.NotNull;
+import java.io.File;
+import java.io.IOException;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.text.SimpleDateFormat;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
+
+import static org.apache.drill.exec.store.StoragePluginRegistry.ACTION_ON_STORAGE_PLUGINS_OVERRIDE_FILE;
+
+/**
+ * Drill plugins handler, which allows updating storage plugins configs from the
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF} conf file
+ *
+ * TODO: DRILL-6564: It can be improved with configs versioning and a service for creating
+ * {@link CommonConstants#STORAGE_PLUGINS_OVERRIDE_CONF}
+ */
+public class StoragePluginsHandlerService implements StoragePluginsHandler {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(StoragePluginsHandlerService.class);
+
+  private final LogicalPlanPersistence hoconLogicalPlanPersistence;
+  private final DrillbitContext context;
+  private URL pluginsOverrideFileUrl;
+
+  public StoragePluginsHandlerService(DrillbitContext context) {
+    this.context = context;
+    this.hoconLogicalPlanPersistence = new LogicalPlanPersistence(context.getConfig(), context.getClasspathScan(),
+        new ObjectMapper(new HoconFactory()));
+  }
+
+  @Override
+  public void loadPlugins(@NotNull PersistentStore<StoragePluginConfig> persistentStore,
+      @Nullable StoragePlugins bootstrapPlugins) {
+    // if bootstrapPlugins is not null -- this is a fresh Drill setup
+    StoragePlugins pluginsToBeWrittenToPersistentStore;
+
+    StoragePlugins newPlugins = getNewStoragePlugins();
+
+    if (newPlugins != null) {
+      pluginsToBeWrittenToPersistentStore = new StoragePlugins(new HashMap<>());
+      Optional.ofNullable(bootstrapPlugins)
+          .ifPresent(pluginsToBeWrittenToPersistentStore::putAll);
+
+      for (Map.Entry<String, StoragePluginConfig> newPlugin : newPlugins) {
+        String pluginName = newPlugin.getKey();
+        StoragePluginConfig oldPluginConfig = Optional.ofNullable(bootstrapPlugins)
+            .map(plugins -> plugins.getConfig(pluginName))
+            .orElse(persistentStore.get(pluginName));
+        StoragePluginConfig updatedStatusPluginConfig = updatePluginStatus(oldPluginConfig, newPlugin.getValue());
+        pluginsToBeWrittenToPersistentStore.put(pluginName, updatedStatusPluginConfig);
+      }
+    } else {
+      pluginsToBeWrittenToPersistentStore = bootstrapPlugins;
+    }
+
+    // load pluginsToBeWrittenToPersistentStore to the Persistent Store
+    Optional.ofNullable(pluginsToBeWrittenToPersistentStore)
+        .ifPresent(plugins -> plugins.forEach(plugin -> persistentStore.put(plugin.getKey(), plugin.getValue())));
+
+    if (newPlugins != null) {
+      PluginsOverrideFileAction 
