[jira] [Commented] (DRILL-5079) PreparedStatement dynamic parameters to avoid SQL Injection test

2017-03-16 Thread Tobias (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15927668#comment-15927668
 ] 

Tobias commented on DRILL-5079:
---

We would also like this for the reason above. In addition, we would like the 
plan for the statement, or at least parts of the physical plan, to be cached, 
since planning makes up a significant part of the total time for short-running 
queries.
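
For reference, the kind of parameterized usage the issue below asks for, as a 
minimal JDBC sketch (the connection URL and the bound values here are 
illustrative, not taken from the issue):

{noformat}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ParameterizedQuery {
    public static void main(String[] args) throws Exception {
        // Illustrative Drill JDBC URL; point it at your own ZooKeeper quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=zookeeper-1:2181");
             PreparedStatement ps = conn.prepareStatement(
                     "select * from PEOPLE where FIRST_NAME = ? and LAST_NAME = ? limit 100")) {
            // Values are bound as parameters rather than concatenated into the SQL,
            // so they cannot alter the structure of the statement.
            ps.setString(1, "Jane");
            ps.setString(2, "O'Brien");
            // Against Drill 1.8 this currently fails with the PLAN ERROR
            // quoted in the issue below.
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("FIRST_NAME") + " " + rs.getString("LAST_NAME"));
                }
            }
        }
    }
}
{noformat}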



> PreparedStatement dynamic parameters to avoid SQL Injection test
> 
>
> Key: DRILL-5079
> URL: https://issues.apache.org/jira/browse/DRILL-5079
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - JDBC
>Affects Versions: 1.8.0
>Reporter: Wahyu Sudrajat
>Priority: Critical
>  Labels: security
>
> Capability to use PreparedStatement with dynamic parameters to prevent SQL 
> Injection.
> For example:
> select  * from PEOPLE where FIRST_NAME = ? and LAST_NAME = ? limit 100
> As for now, Drill will return:
> Error Message:PreparedStatementCallback; uncategorized SQLException for SQL 
> []; SQL state [null]; error code [0]; Failed to create prepared statement: 
> PLAN ERROR: Cannot convert RexNode to equivalent Drill expression. RexNode 
> Class: org.apache.calcite.rex.RexDynamicParam, RexNode Digest: ?0





[jira] [Created] (DRILL-5358) Error if Parquet file changes during query

2017-03-16 Thread Tobias (JIRA)
Tobias created DRILL-5358:
-

 Summary: Error if Parquet file changes during query
 Key: DRILL-5358
 URL: https://issues.apache.org/jira/browse/DRILL-5358
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata, Storage - Parquet
Affects Versions: 1.9.0
Reporter: Tobias


We have a scenario where we generate our own Parquet files every X seconds.
These files are laid out in a date-based directory structure, and only the file 
for the current day gets updated.

The process is as follows (step 2 is sketched below):

1. Generate the Parquet file in a temp directory.
2. When generation finishes, mv the file into a Drill workspace 
(data/2017/03/10/data.parquet, ...).
3. Restart the process.

We have noticed that if the file is moved in while a query is already running, 
Drill throws an error that the Parquet magic number is incorrect.
This appears to be because the file length is cached and reused, so what seems 
to happen is:

1. Drill plans the query.
2. The file gets changed under Drill's feet.
3. Drill executes the query and reads an incorrect offset of the changed file.

Is there any way to fix this or to avoid this scenario?
Another side effect of constantly generating a new file is that the metadata 
cache gets discarded for the whole workspace even though only one file changes. 
Is there a way to avoid that?
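
For reference, a minimal sketch of step 2 of the process above, assuming plain 
Java NIO and illustrative paths. The rename itself is atomic on a single 
filesystem, but a query that was planned against the previous file still holds 
the old, now stale, file length:

{noformat}
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class PublishParquetFile {
    public static void main(String[] args) throws Exception {
        // Step 1 has finished writing the file in a temp directory.
        Path temp = Paths.get("/tmp/generated/data.parquet");
        // Today's partition inside the Drill workspace (step 2).
        Path target = Paths.get("/data/2017/03/10/data.parquet");

        // Atomic rename on the same filesystem: readers never see a half-written
        // file, but a query that was already planned against the old version keeps
        // using its cached file length and then hits the magic-number error.
        Files.move(temp, target,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
    }
}
{noformat}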







[jira] [Commented] (DRILL-4517) Reading empty Parquet file fails with java.lang.IllegalArgumentException

2016-03-19 Thread Tobias (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199442#comment-15199442
 ] 

Tobias commented on DRILL-4517:
---

Is this fixed on head (1.7), as mentioned in DRILL-2223? If so, we can build our 
own version.

> Reading empty Parquet file fails with java.lang.IllegalArgumentException
> -
>
> Key: DRILL-4517
> URL: https://issues.apache.org/jira/browse/DRILL-4517
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server
>Reporter: Tobias
>
> When querying a Parquet file that has a schema but no rows, the Drill server 
> fails with the error below.
> This looks similar to DRILL-3557.
> {noformat}
> {{ParquetMetaData{FileMetaData{schema: message TRANSACTION_REPORT {
>   required int64 MEMBER_ACCOUNT_ID;
>   required int64 TIMESTAMP_IN_HOUR;
>   optional int64 APPLICATION_ID;
> }
> , metadata: {}}}, blocks: []}
> {noformat}
> {noformat}
> Caused by: java.lang.IllegalArgumentException: MinorFragmentId 0 has no read 
> entries assigned
> at 
> com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:707)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:105)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:68)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractGroupScan.accept(AbstractGroupScan.java:60)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:102)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject(AbstractPhysicalVisitor.java:77)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Project.accept(Project.java:51) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:82)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:35)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen(AbstractPhysicalVisitor.java:195)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.physical.config.Screen.accept(Screen.java:97) 
> ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit(SimpleParallelizer.java:355)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments(SimpleParallelizer.java:134)
>  ~[drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit(Foreman.java:518) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:405) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> at 
> org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:926) 
> [drill-java-exec-1.5.0.jar:1.5.0]
> {noformat}





[jira] [Created] (DRILL-4517) Reading empty Parquet file fails with java.lang.IllegalArgumentException

2016-03-19 Thread Tobias (JIRA)
Tobias created DRILL-4517:
-

 Summary: Reading empty Parquet file fails with java.lang.IllegalArgumentException
 Key: DRILL-4517
 URL: https://issues.apache.org/jira/browse/DRILL-4517
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Reporter: Tobias


When querying a Parquet file that has a schema but no rows, the Drill server 
fails with the error below.
This looks similar to DRILL-3557.
{noformat}
{{ParquetMetaData{FileMetaData{schema: message TRANSACTION_REPORT {
  required int64 MEMBER_ACCOUNT_ID;
  required int64 TIMESTAMP_IN_HOUR;
  optional int64 APPLICATION_ID;
}
, metadata: {}}}, blocks: []}
{noformat}
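
For reproduction, a minimal sketch of how such a schema-only file can be 
produced, assuming the parquet-avro writer (the output path is illustrative):

{noformat}
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class WriteEmptyParquet {
    public static void main(String[] args) throws Exception {
        // Avro schema mirroring the TRANSACTION_REPORT message above:
        // required fields map to required int64, the optional field to optional int64.
        Schema schema = SchemaBuilder.record("TRANSACTION_REPORT").fields()
                .requiredLong("MEMBER_ACCOUNT_ID")
                .requiredLong("TIMESTAMP_IN_HOUR")
                .optionalLong("APPLICATION_ID")
                .endRecord();

        // Open and close the writer without writing any records: the resulting
        // file has footer metadata (the schema) but no row groups and no rows.
        try (ParquetWriter<GenericRecord> writer =
                     AvroParquetWriter.<GenericRecord>builder(new Path("/tmp/empty_transaction_report.parquet"))
                             .withSchema(schema)
                             .build()) {
            // intentionally write nothing
        }
    }
}
{noformat}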

{noformat}
Caused by: java.lang.IllegalArgumentException: MinorFragmentId 0 has no read 
entries assigned
at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:92) 
~[guava-14.0.1.jar:na]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:707)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.getSpecificScan(ParquetGroupScan.java:105)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:68)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.Materializer.visitGroupScan(Materializer.java:35)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.physical.base.AbstractGroupScan.accept(AbstractGroupScan.java:60)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:102)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.Materializer.visitOp(Materializer.java:35)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitProject(AbstractPhysicalVisitor.java:77)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.physical.config.Project.accept(Project.java:51) 
~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:82)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.Materializer.visitStore(Materializer.java:35)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitScreen(AbstractPhysicalVisitor.java:195)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at org.apache.drill.exec.physical.config.Screen.accept(Screen.java:97) 
~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.SimpleParallelizer.generateWorkUnit(SimpleParallelizer.java:355)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.planner.fragment.SimpleParallelizer.getFragments(SimpleParallelizer.java:134)
 ~[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.work.foreman.Foreman.getQueryWorkUnit(Foreman.java:518) 
[drill-java-exec-1.5.0.jar:1.5.0]
at 
org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:405) 
[drill-java-exec-1.5.0.jar:1.5.0]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:926) 
[drill-java-exec-1.5.0.jar:1.5.0]
{noformat}





[jira] [Created] (DRILL-4505) Can't group by or sort across files with different schema

2016-03-13 Thread Tobias (JIRA)
Tobias created DRILL-4505:
-

 Summary: Can't group by or sort across files with different schema
 Key: DRILL-4505
 URL: https://issues.apache.org/jira/browse/DRILL-4505
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.5.0
 Environment: Java 1.8
Reporter: Tobias


We are currently trying out the support for querying across Parquet files with 
different schemas.
Simple selects work well, but when we want to do a sort or group by, Drill returns 
"UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with 
changing schemas Fragment 0:0 [Error Id: ff490670-64c1-4fb8-990e-a02aa44ac010 
on zookeeper-1:31010]".

This happens even though the new columns are not included in the query.
The expected result would be to treat columns that do not exist in some of the 
files as either null or a default value and to allow them to be grouped and sorted.

Example:
SELECT APPLICATION_ID, dir0 AS year_ FROM dfs.`/PRO/UTC/1` WHERE dir2 >= '2016-01-01' 
AND dir2 < '2016-04-02' works with the changing schema,

but SELECT MAX(APPLICATION_ID), dir0 AS year_ FROM dfs.`/PRO/UTC/1` WHERE dir2 >= '2016-01-01' 
AND dir2 < '2016-04-02' GROUP BY dir0 does not.

For us this blocks any possibility of having an evolving schema with moderately 
complex queries.


