[jira] [Created] (DRILL-3342) like query is not working properly

2015-06-23 Thread Devender Yadav (JIRA)
Devender Yadav  created DRILL-3342:
--

 Summary: like query is not working properly 
 Key: DRILL-3342
 URL: https://issues.apache.org/jira/browse/DRILL-3342
 Project: Apache Drill
  Issue Type: Bug
 Environment: I tried on Ubuntu 14.04
Reporter: Devender Yadav 


I am using Drill with MongoDB. Running "show databases" lists the MongoDB 
databases as expected. I then tried a LIKE query on a collection named 
"testCollection":

select * from testCollection where name like 'dev*';

No error is shown and nothing is returned.

I also tried:

select * from testCollection where name like 'dev_';

This works fine.
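
For reference, standard SQL LIKE patterns use '%' (any sequence of characters) 
and '_' (exactly one character); '*' has no wildcard meaning in LIKE and is 
matched literally. A minimal sketch against the same collection:

select * from testCollection where name like 'dev%';  -- '%' matches any sequence of characters
select * from testCollection where name like 'dev_';  -- '_' matches exactly one character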



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Apache Drill 1.0 JDBC issue

2015-06-23 Thread Ilangovan Sathasivan
Hi Team,

We are facing the issue below when we try to use the REST API + JDBC in Drill
1.0. The same setup works in Drill 0.8.

Please help to resolve this issue.

15:16:57.144 [Client-1] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.maxCapacity.default: 262144
java.sql.SQLException: Failure while attempting to connect to Drill.
    at org.apache.drill.jdbc.DrillConnectionImpl.<init>(DrillConnectionImpl.java:101)
    at org.apache.drill.jdbc.DrillJdbc41Factory$DrillJdbc41Connection.<init>(DrillJdbc41Factory.java:94)
    at org.apache.drill.jdbc.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:57)
    at org.apache.drill.jdbc.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:43)
    at org.apache.drill.jdbc.DrillFactory.newConnection(DrillFactory.java:54)
    at net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at com.cdg.analytics.aaf.dbconnection.ConnectionManager.getConnection(ConnectionManager.java:57)
    at com.cdg.analytics.aaf.service.DataLakeExplorerService.getListOfDir(DataLakeExplorerService.java:135)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
15:16:57.305 [Client-1] INFO  o.a.drill.exec.rpc.user.UserClient - Channel closed between local 0.0.0.0/0.0.0.0:41995 and remote ip-10-10-21-76.ec2.internal/10.10.21.76:31010
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:164)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:181)
    at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:203)
    at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:101)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347)
    at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102)
    at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:305)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271)
    at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:297)
    at org.glassfish.jersey.internal.Errors.process(Errors.java:267)
    at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317)
    at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:288)
    at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1110)
    at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:401)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:386)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:335)
    at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:222)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)
    at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:617)
    at org.apache.catalina.core.Stan

Re: Apache Drill 1.0 JDBC issue

2015-06-23 Thread Rajkumar Singh
I can see an RPC version mismatch in the logs. Please try connecting with the 
updated Drill JDBC driver, which can be found at the following location: 
apache-drill-1.0.0/jars/jdbc-driver/drill-jdbc-all-1.0.0.jar
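
A minimal connection sketch, assuming that jar is the Drill JDBC driver on the 
client classpath and that the drillbit from the log above listens on the 
default port 31010 (the driver class and URL form are the standard Drill 1.0 
ones; the host name is simply taken from the earlier log):

import java.sql.Connection;
import java.sql.DriverManager;

public class DrillConnectTest {
  public static void main(String[] args) throws Exception {
    // Registers the driver packaged in drill-jdbc-all-1.0.0.jar
    Class.forName("org.apache.drill.jdbc.Driver");
    // Direct-to-drillbit URL; a jdbc:drill:zk=<zk-hosts> URL works as well
    try (Connection conn = DriverManager.getConnection(
        "jdbc:drill:drillbit=ip-10-10-21-76.ec2.internal:31010")) {
      System.out.println("connected: " + !conn.isClosed());
    }
  }
}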

Rajkumar Singh
MapR Technologies


> On Jun 23, 2015, at 8:52 PM, Ilangovan Sathasivan  
> wrote:
> 
> org.apache.drill.exec.rpc.RpcException: Invalid rpc version



Unit test failing on master

2015-06-23 Thread Abdel Hakim Deneche
TestWindowFunctions#testWindowWithJoin is failing consistently on today's
master. It happens if you run the test from maven or from an IDE. Here is
the error I am seeing:

org.apache.calcite.sql.validate.SqlValidatorException 
> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Column
> 'r_regionKey' is ambiguous
> Jun 23, 2015 9:36:16 AM org.apache.calcite.runtime.CalciteException 
> SEVERE: org.apache.calcite.runtime.CalciteContextException: At line 0,
> column 0: Column 'r_regionKey' is ambiguous
> Exception (no rows returned):
> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: At
> line 0, column 0: Column 'r_regionKey' is ambiguous
> [Error Id: bd256fba-2bb2-4256-9993-f44af247f3f8 on 172.30.1.107:31010].
> Returned in 129ms.
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 40.119 sec
> <<< FAILURE! - in org.apache.drill.exec.TestWindowFunctions



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Re: Unit test failing on master

2015-06-23 Thread Abdel Hakim Deneche
hold on, I may have corrupted my local maven repository with a custom
calcite build. I will clean the local repository and try again

On Tue, Jun 23, 2015 at 9:41 AM, Abdel Hakim Deneche 
wrote:

> TestWindowFunctions#testWindowWithJoin is failing consistently on today's
> master. It happens if you run the test from maven or from an IDE. Here is
> the error I am seeing:
>
> org.apache.calcite.sql.validate.SqlValidatorException 
>> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Column
>> 'r_regionKey' is ambiguous
>> Jun 23, 2015 9:36:16 AM org.apache.calcite.runtime.CalciteException 
>> SEVERE: org.apache.calcite.runtime.CalciteContextException: At line 0,
>> column 0: Column 'r_regionKey' is ambiguous
>> Exception (no rows returned):
>> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: At
>> line 0, column 0: Column 'r_regionKey' is ambiguous
>> [Error Id: bd256fba-2bb2-4256-9993-f44af247f3f8 on 172.30.1.107:31010].
>> Returned in 129ms.
>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 40.119
>> sec <<< FAILURE! - in org.apache.drill.exec.TestWindowFunctions
>
>
>



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Re: Unit test failing on master

2015-06-23 Thread Hsuan Yi Chu
Make sure you are using drill-calcite-r8.

In IntelliJ, you need to reimport the Maven projects.

On Tue, Jun 23, 2015 at 9:44 AM, Abdel Hakim Deneche 
wrote:

> hold on, I may have corrupted my local maven repository with a custom
> calcite build. I will clean the local repository and try again
>
> On Tue, Jun 23, 2015 at 9:41 AM, Abdel Hakim Deneche <
> adene...@maprtech.com>
> wrote:
>
> > TestWindowFunctions#testWindowWithJoin is failing consistently on today's
> > master. It happens if you run the test from maven or from an IDE. Here is
> > the error I am seeing:
> >
> > org.apache.calcite.sql.validate.SqlValidatorException 
> >> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Column
> >> 'r_regionKey' is ambiguous
> >> Jun 23, 2015 9:36:16 AM org.apache.calcite.runtime.CalciteException
> 
> >> SEVERE: org.apache.calcite.runtime.CalciteContextException: At line 0,
> >> column 0: Column 'r_regionKey' is ambiguous
> >> Exception (no rows returned):
> >> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: At
> >> line 0, column 0: Column 'r_regionKey' is ambiguous
> >> [Error Id: bd256fba-2bb2-4256-9993-f44af247f3f8 on 172.30.1.107:31010].
> >> Returned in 129ms.
> >> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 40.119
> >> sec <<< FAILURE! - in org.apache.drill.exec.TestWindowFunctions
> >
> >
> >


Hangout starting a little late this morning, should start around 10:10

2015-06-23 Thread Jason Altekruse
Join us at our weekly hangout to discuss what has been happening in the
Drill community!

https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc


Re: Hangout starting a little late this morning, should start around 10:10

2015-06-23 Thread Jason Altekruse
That is Pacific time; the meeting will start in about 10 minutes.

On Tue, Jun 23, 2015 at 9:59 AM, Jason Altekruse 
wrote:

> Join us at our weekly hangout to discuss what has been happening in the
> Drill community!
>
> https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
>


Re: Unit test failing on master

2015-06-23 Thread Abdel Hakim Deneche
my bad, I had a custom calcite r8 in my maven cache. The test is no longer
failing now.

Thanks!

On Tue, Jun 23, 2015 at 9:45 AM, Hsuan Yi Chu  wrote:

> Make sure you are using drill-calcite-r8.
>
> In intellij, you need to reimport maven projects
>
> On Tue, Jun 23, 2015 at 9:44 AM, Abdel Hakim Deneche <
> adene...@maprtech.com>
> wrote:
>
> > hold on, I may have corrupted my local maven repository with a custom
> > calcite build. I will clean the local repository and try again
> >
> > On Tue, Jun 23, 2015 at 9:41 AM, Abdel Hakim Deneche <
> > adene...@maprtech.com>
> > wrote:
> >
> > > TestWindowFunctions#testWindowWithJoin is failing consistently on
> today's
> > > master. It happens if you run the test from maven or from an IDE. Here
> is
> > > the error I am seeing:
> > >
> > > org.apache.calcite.sql.validate.SqlValidatorException 
> > >> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Column
> > >> 'r_regionKey' is ambiguous
> > >> Jun 23, 2015 9:36:16 AM org.apache.calcite.runtime.CalciteException
> > 
> > >> SEVERE: org.apache.calcite.runtime.CalciteContextException: At line 0,
> > >> column 0: Column 'r_regionKey' is ambiguous
> > >> Exception (no rows returned):
> > >> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR:
> At
> > >> line 0, column 0: Column 'r_regionKey' is ambiguous
> > >> [Error Id: bd256fba-2bb2-4256-9993-f44af247f3f8 on 172.30.1.107:31010
> ].
> > >> Returned in 129ms.
> > >> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 40.119
> > >> sec <<< FAILURE! - in org.apache.drill.exec.TestWindowFunctions



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



Hangout happening now

2015-06-23 Thread Parth Chandra
Come join the Drill community as we discuss what has been happening lately
and what is in the pipeline. All are welcome, whether you know about Drill,
want to know more, or just want to listen in.

Link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Thanks


[jira] [Created] (DRILL-3343) Seemingly incorrect result with SUM window functions and float data type

2015-06-23 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3343:
---

 Summary: Seemingly incorrect result with SUM window functions and 
float data type
 Key: DRILL-3343
 URL: https://issues.apache.org/jira/browse/DRILL-3343
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.0.0
Reporter: Victoria Markman


While running the query below against voter_hive (a Drill table), where the 
contributions field is defined as "float" (a 4-byte floating point number in 
Drill), I get a SUM value that is different from the result generated by 
Postgres.

{code}
select 
  registration, 
  age, 
  name, 
  sum(contributions) over w  
from   voter_hive 
window w AS (partition by registration order by age rows unbounded preceding) 
order by 
  registration, 
  age, 
  name;
{code}
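
As a hedged diagnostic sketch (not a fix), the same windowed SUM with the 
column widened to double shows whether single-precision accumulation accounts 
for the difference; table, column and window names are taken from the query 
above:

{code}
select 
  registration, 
  age, 
  name, 
  sum(cast(contributions as double)) over w  
from   voter_hive 
window w AS (partition by registration order by age rows unbounded preceding) 
order by 
  registration, 
  age, 
  name;
{code}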

Find attached:
1. Query + result generated by Postgres (queries.tar)
2. voter_hive parquet file
3. create_table.tar - contains CTAS statement + csv file (if you want to create 
table yourself)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-959) drill fails to display binary in hive correctly

2015-06-23 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti resolved DRILL-959.
---
   Resolution: Fixed
Fix Version/s: (was: Future)
   1.1.0

No longer reproes.

> drill fails to display binary in hive correctly
> ---
>
> Key: DRILL-959
> URL: https://issues.apache.org/jira/browse/DRILL-959
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Hive
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Mehant Baid
> Fix For: 1.1.0
>
>
> Hive table ddl
> create table alldrilltypes 
>  (c1 int, c2 boolean, c3 double, c4 string,
>  c9 tinyint, c10 smallint, c11 float, c12 bigint,
>  c19 binary);
> Doing a select from Drill works, but c19 shows up as raw binary:
> 0: jdbc:drill:schema=hive> SELECT c1,c2,c3,c4,c9,c10,c11,c12,c19 from alldrilltypes;
> +------+--------+-------+------+------+------+-------+------+--------------+
> | c1   | c2     | c3    | c4   | c9   | c10  | c11   | c12  | c19          |
> +------+--------+-------+------+------+------+-------+------+--------------+
> | null | null   | null  | null | null | null | null  | null | null         |
> | -1   | false  | -1.1  |      | -1   | -1   | -1.0  | -1   | null         |
> | 1    | true   | 1.1   | 1    | 1    | 1    | 1.0   | 1    | [B@661725c1  |
> +------+--------+-------+------+------+------+-------+------+--------------+
> A cast does not work either:
> SELECT c1,c2,c3,c4,c9,c10,c11,c12,cast(c19 as varchar) from alldrilltypes;
> message: "Failure while parsing sql. < ValidationException:[ 
> org.eigenbase.util.EigenbaseContextException: From line 1, column 35 to line 
> 1, column 54 ] < EigenbaseContextException:[ From line 1, column 35 to line 
> 1, column 54 ] < SqlValidatorException:[ Cast function cannot convert value 
> of type BINARY(1) to type VARCHAR(1) ]"
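
A hedged workaround sketch for builds where this still reproduces, assuming 
c19 holds UTF-8 text (CONVERT_FROM is Drill's usual binary-to-text conversion; 
column and table names are from the report above):

SELECT c1,c2,c3,c4,c9,c10,c11,c12,CONVERT_FROM(c19, 'UTF8') from alldrilltypes;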



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35634: DRILL-3319: UserExceptions should be logged from the right class

2015-06-23 Thread Sudheesh Katkam

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35634/
---

(Updated June 23, 2015, 6:50 p.m.)


Review request for drill, abdelhakim deneche and Parth Chandra.


Bugs: DRILL-3319
https://issues.apache.org/jira/browse/DRILL-3319


Repository: drill-git


Description
---

DRILL-3319: Replaced UserException#build() method with #build(Logger) method to 
log from the correct class

+ Fixed docs in UserException
+ Created loggers, and changed logger visibility to private
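
A hedged sketch of the resulting call-site pattern (the class, builder and 
message below are illustrative, not an excerpt from the patch):

import org.apache.drill.common.exceptions.UserException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class ExampleReader {
  private static final Logger logger = LoggerFactory.getLogger(ExampleReader.class);

  void fail(Exception e) {
    // build(Logger) logs the user exception with ExampleReader's own logger
    // before returning it, so the log entry names the real call site rather
    // than UserException itself.
    throw UserException.dataReadError(e)
        .message("Failure while reading a record batch")
        .build(logger);
  }
}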


Diffs (updated)
-

  common/src/main/java/org/apache/drill/common/exceptions/UserException.java 
6f28a2b 
  
common/src/test/java/org/apache/drill/common/exceptions/TestUserException.java 
151b762 
  
contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java
 9f63e05 
  exec/java-exec/src/main/codegen/templates/ListWriters.java ab78603 
  
exec/java-exec/src/main/java/org/apache/drill/exec/client/PrintingResultsListener.java
 f5a119d 
  
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AggregateErrorFunctions.java
 8161a43 
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 
1cbe886 
  
exec/java-exec/src/main/java/org/apache/drill/exec/ops/ViewExpansionContext.java
 157d550 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/ScanBatch.java 
da73185 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java
 e1b5909 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/StreamingAggBatch.java
 b252971 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java
 9991404 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 5ce63fb 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillOptiq.java
 8b95f0b 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 73aeec6 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/SchemaUtilites.java
 655e135 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java
 920b284 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DefaultSqlHandler.java
 a2858b8 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DescribeTableHandler.java
 676dcba 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ExplainHandler.java
 efc4b36 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ShowFileHandler.java
 c96dc73 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ShowTablesHandler.java
 055b761 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/SqlHandlerUtil.java
 9e7be7f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/ViewHandler.java
 36287a4 
  
exec/java-exec/src/main/java/org/apache/drill/exec/record/AbstractRecordBatch.java
 ff53052 
  exec/java-exec/src/main/java/org/apache/drill/exec/rpc/BasicServer.java 
2ebd353 
  exec/java-exec/src/main/java/org/apache/drill/exec/rpc/RpcBus.java 9ca09a1 
  
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/QueryResultHandler.java
 8443948 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/AbstractSchema.java 
524fe26 
  exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedRunnable.java 
5a35aff 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
 8e0432a 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 0df6227 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/TextReader.java
 fec0ab4 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetReaderUtility.java
 da480d7 
  
exec/java-exec/src/main/java/org/apache/drill/exec/vector/complex/fn/JsonReader.java
 260ebde 
  exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 
78c438b 
  
exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java
 a9c2b6d 
  
exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetResultListener.java
 df74f7a 

Diff: https://reviews.apache.org/r/35634/diff/


Testing (updated)
---

Passes unit and regression tests


Thanks,

Sudheesh Katkam



Re: Review Request 35573: DRILL-3304: improve org.apache.drill.exec.expr.TypeHelper error messages when UnsupportedOprationException is thrown

2015-06-23 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35573/#review89028
---

Ship it!


Ship It!

- Hanifi Gunes


On June 18, 2015, 3:56 p.m., abdelhakim deneche wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35573/
> ---
> 
> (Updated June 18, 2015, 3:56 p.m.)
> 
> 
> Review request for drill and Hanifi Gunes.
> 
> 
> Bugs: DRILL-3304
> https://issues.apache.org/jira/browse/DRILL-3304
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Made some changes to TypeHelper template to display the failed "operation" 
> and minor-type + data-mode when an UnsupportedOperationException is thrown
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/drill/common/exceptions/ErrorHelper.java 
> 5dd9b67 
>   exec/java-exec/src/main/codegen/templates/TypeHelper.java ad818bd 
> 
> Diff: https://reviews.apache.org/r/35573/diff/
> 
> 
> Testing
> ---
> 
> all unit tests are passing along with functional/tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>



Re: Review Request 35636: DRILL-2447: Add already-closed checks to remaining ResultSet methods.

2015-06-23 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35636/#review89033
---

Ship it!


Looks good except for one minor comment below.


exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillResultSet.java (line 382)


Do we really need to have these comments checked-in? These look extensive.


- Hanifi Gunes


On June 19, 2015, 8:23 p.m., Daniel Barclay wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35636/
> ---
> 
> (Updated June 19, 2015, 8:23 p.m.)
> 
> 
> Review request for drill, Hanifi Gunes and Mehant Baid.
> 
> 
> Bugs: DRILL-2447
> https://issues.apache.org/jira/browse/DRILL-2447
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Extended coverage from just selected methods to all methods.  Added wrapper
> methods checking state before delegating.  (Couldn't implement at just a few
> choke points because Avatica makes them private and doesn't provide hooks.)
> [DrillResultSetImpl]
> 
> Defined DrillResultSet.getQueryId() to throw SQLException as other methods do.
> [DrillResultSet]
> 
> Re-enabled ResultSet test methods.  (Also re-enabled other test methods that
> pass now with DRILL-2782 changes.  
> [Drill2489CallsAfterCloseThrowExceptionsTest]
> 
> 
> Diffs
> -
> 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillResultSet.java e0a7763 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java 
> d7fafe9 
>   
> exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Drill2489CallsAfterCloseThrowExceptionsTest.java
>  0e37efa 
> 
> Diff: https://reviews.apache.org/r/35636/diff/
> 
> 
> Testing
> ---
> 
> Enabled pre-written specific unit tests.
> 
> Ran existing tests; no new failures.
> 
> 
> Thanks,
> 
> Daniel Barclay
> 
>



Re: Review Request 35623: DRILL-2494: Have PreparedStmt. set-param. methods throw "unsupported".

2015-06-23 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35623/#review89035
---

Ship it!


Ship It!

- Hanifi Gunes


On June 19, 2015, 7 p.m., Daniel Barclay wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35623/
> ---
> 
> (Updated June 19, 2015, 7 p.m.)
> 
> 
> Review request for drill, Hanifi Gunes and Mehant Baid.
> 
> 
> Bugs: DRILL-2494
> https://issues.apache.org/jira/browse/DRILL-2494
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Added (integration-level) unit test.
> 
> Modified set-parameter methods to throw SQLFeatureNotSupportedException.
> (Intercepted common getParameter method.)
> 
> Inserted DrillPreparedStatement into hierarchy for place for documentation.
> 
> Documented that parameter-setting methods are not supported.
> 
> 
> Diffs
> -
> 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillPreparedStatement.java 
> PRE-CREATION 
>   
> exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
>  5e9ec93 
>   exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35623/diff/
> 
> 
> Testing
> ---
> 
> Added specific unit tests.
> 
> Ran existing tests; no new failures.
> 
> 
> Thanks,
> 
> Daniel Barclay
> 
>



Re: Review Request 35609: DRILL-3243: Need a better error message - Use of alias in window function definition

2015-06-23 Thread Hanifi Gunes

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35609/#review89036
---

Ship it!


Ship It!

- Hanifi Gunes


On June 18, 2015, 3:14 p.m., abdelhakim deneche wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35609/
> ---
> 
> (Updated June 18, 2015, 3:14 p.m.)
> 
> 
> Review request for drill and Hanifi Gunes.
> 
> 
> Bugs: DRILL-3243
> https://issues.apache.org/jira/browse/DRILL-3243
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> changed RepeatedVarCharOutput to display the column name as part of the 
> exception's message
> changed CompliantTestRecordReader to throw a DATA_READ user exception
> added new unit test to TestNewTextReader
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/drill/common/exceptions/UserException.java 
> 6f28a2b 
>   
> common/src/main/java/org/apache/drill/common/exceptions/UserRemoteException.java
>  1b3fa42 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/CompliantTextRecordReader.java
>  254e0d8 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/text/compliant/RepeatedVarCharOutput.java
>  40276f4 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/store/text/TestNewTextReader.java
>  76674f9 
> 
> Diff: https://reviews.apache.org/r/35609/diff/
> 
> 
> Testing
> ---
> 
> all unit tests are passing along with functional and tpch100
> 
> 
> Thanks,
> 
> abdelhakim deneche
> 
>



Drill should validate column names within window functions

2015-06-23 Thread Abhishek Girish
Hey all,

I observed an issue while working with window functions: a case where wrong 
results are returned from Drill.

In case of a weak schema such as Parquet, Drill does not validate column 
names. That is understandable when the column is only part of the projection 
list in a query. But when it is part of a window function, the results 
displayed are wrong, and at times it is hard to identify the cause.

Two examples below:

> SELECT PERCENT_RANK() OVER (PARTITION BY s.store_sk, s.ss_customer_sk
ORDER BY s.store_sk, s.ss_customer_sk) FROM store_sales s LIMIT 2;
+-+
| EXPR$0  |
+-+
| 0.0 |
| 0.0 |
+-+
2 rows selected (7.116 seconds)

SELECT CUME_DIST() OVER (PARTITION BY s.ss_store_sk ORDER BY s.ss_stoe_sk,
s.s_customr_sk) FROM store_sales s LIMIT 2;
+-+
| EXPR$0  |
+-+
| 1.0 |
| 1.0 |
+-+
2 rows selected (8.361 seconds)

In both cases above, some columns do not exist.

With normal aggregate functions, it is similar to having a non-existent 
column in the projection list: Drill prints a column of null rows. This could 
still be documented so that users expect "null" columns in results when 
non-existent columns are part of a projection list.

> SELECT s.ss_store_sk, avg (ssdfd), ssdfd FROM store_sales s GROUP BY
s.ss_store_sk, ssdfd LIMIT 2;
+--+-++
| ss_store_sk  | EXPR$1  | ssdfd  |
+--+-++
| 10   | null| null   |
| 4| null| null   |
+--+-++
2 rows selected (1.252 seconds)

But in the case of window functions (and maybe other functions and 
expressions), the results might look more plausible, and it is hence more 
difficult to identify that the query had typos. Worse, users may trust the 
data returned from Drill when they shouldn't.

Postgres:

# SELECT CUME_DIST() OVER (PARTITION BY s.ss_store_sk ORDER BY
s.ss_store_sk, s.ss_customer_sk) FROM store_sales s LIMIT 2;
  cume_dist
--
 3.06415464350749e-05
 3.06415464350749e-05
(2 rows)

# SELECT PERCENT_RANK() OVER (PARTITION BY s.store_sk, s.ss_customer_sk
ORDER BY s.store_sk, s.ss_customer_sk) FROM store_sales s LIMIT 2;

ERROR:  column s.store_sk does not exist

LINE 1: ...ARTITION BY s.store_sk, s.ss_customer_sk ORDER BY s.store_sk...
   ^

I think Drill should, at a minimum, issue a warning when it encounters a 
non-existent column. Ideally, queries should fail when non-existent columns 
are part of any function/expression.

I'll file a JIRA if it is agreed to be an issue.

Regards,
Abhishek


[jira] [Created] (DRILL-3344) Empty OVER clause + Group By : AssertionError

2015-06-23 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3344:
-

 Summary: Empty OVER clause + Group By : AssertionError
 Key: DRILL-3344
 URL: https://issues.apache.org/jira/browse/DRILL-3344
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.1.0
 Environment: 6ebfbb9d0fc0b87b032f5e5d5cb0825f5464426e
Reporter: Khurram Faraaz
Assignee: Chris Westin



CTAS

0: jdbc:drill:schema=dfs.tmp> create table tblForView(col_int, col_bigint, 
col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as 
select cast(columns[0] as INT), cast(columns[1] as BIGINT),cast(columns[2] as 
CHAR(2)), cast(columns[3] as VARCHAR(52)), cast(columns[4] as TIMESTAMP), 
cast(columns[5] as DATE), cast(columns[6] as BOOLEAN),cast(columns[7] as 
DOUBLE),cast(columns[8] as TIME) from `forPrqView.csv`;
+---++
| Fragment  | Number of records written  |
+---++
| 0_0   | 30 |
+---++
1 row selected (0.586 seconds)


Failing query

0: jdbc:drill:schema=dfs.tmp> select max(col_tm) over(), col_char_2 from 
tblForView group by col_char_2;
Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while converting 
MAX(`tblForView`.`col_tm`)


[Error Id: 11afbdc9-d47a-4a52-aa77-40c20ffd2bc6 on centos-03.qa.lab:31010] 
(state=,code=0)

Stack trace

[Error Id: 11afbdc9-d47a-4a52-aa77-40c20ffd2bc6 on centos-03.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
java.lang.AssertionError: Internal error: while converting 
MAX(`tblForView`.`col_tm`)


[Error Id: 11afbdc9-d47a-4a52-aa77-40c20ffd2bc6 on centos-03.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:522)
 ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:738)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:840)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:782)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:784)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:893) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: Internal error: while converting 
MAX(`tblForView`.`col_tm`)
... 4 common frames omitted
Caused by: java.lang.AssertionError: Internal error: while converting 
MAX(`tblForView`.`col_tm`)
at org.apache.calcite.util.Util.newInternal(Util.java:790) 
~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.ReflectiveConvertletTable$2.convertCall(ReflectiveConvertletTable.java:152)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlNodeToRexConverterImpl.convertCall(SqlNodeToRexConverterImpl.java:60)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1762)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1000(SqlToRelConverter.java:180)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3938)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.createAggImpl(SqlToRelConverter.java:2521)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertAgg(SqlToRelConverter.java:2342)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:604)
 ~[calcite-core-1.1.0-drill-r8

[jira] [Created] (DRILL-3345) TestWindowFrame fails to properly check cases involving multiple batches

2015-06-23 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3345:
---

 Summary: TestWindowFrame fails to properly check cases involving 
multiple batches
 Key: DRILL-3345
 URL: https://issues.apache.org/jira/browse/DRILL-3345
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Deneche A. Hakim
Assignee: Deneche A. Hakim
Priority: Critical
 Fix For: 1.1.0


As part of adding unit tests to TestWindowFrame, I added a debug option to 
MSorter to force the batches passed downstream to be of a specific size. This 
was supposed to help test edge cases involving multiple data batches while 
controlling precisely when a partition and/or window frame ends in a batch.

It turns out the change to MSorter was incomplete, and all cases end up 
processing one single big batch of data.

The purpose of this JIRA is to fix that, add a check to detect when the 
batches are no longer split correctly, and make sure all unit tests pass.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35636: DRILL-2447: Add already-closed checks to remaining ResultSet methods.

2015-06-23 Thread Daniel Barclay


> On June 23, 2015, 7:24 p.m., Hanifi Gunes wrote:
> > exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillResultSet.java, line 400
> > 
> >
> > Do we really need to have these comments checked-in? These look 
> > extensive.

The plan is to keep methods in DrillResultSet in the same order as that in 
java.sql.ResultSet, so that DrillResultSet's Javadoc-generated documentation is 
in the same order as java.sql.ResultSet's (so it'll be easier to see 
correspondences).

Those comments are placeholders (so I don't have to dig into ResultSet again to 
determine where an addition to DrillResultSet goes) and reminders (to others 
who might add methods).


- Daniel


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35636/#review89033
---


On June 19, 2015, 8:23 p.m., Daniel Barclay wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35636/
> ---
> 
> (Updated June 19, 2015, 8:23 p.m.)
> 
> 
> Review request for drill, Hanifi Gunes and Mehant Baid.
> 
> 
> Bugs: DRILL-2447
> https://issues.apache.org/jira/browse/DRILL-2447
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Extended coverage from just selected methods to all methods.  Added wrapper
> methods checking state before delegating.  (Couldn't implement at just a few
> choke points because Avatica makes them private and doesn't provide hooks.)
> [DrillResultSetImpl]
> 
> Defined DrillResultSet.getQueryId() to throw SQLException as other methods do.
> [DrillResultSet]
> 
> Re-enabled ResultSet test methods.  (Also re-enabled other test methods that
> pass now with DRILL-2782 changes.  
> [Drill2489CallsAfterCloseThrowExceptionsTest]
> 
> 
> Diffs
> -
> 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/DrillResultSet.java e0a7763 
>   exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java 
> d7fafe9 
>   
> exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Drill2489CallsAfterCloseThrowExceptionsTest.java
>  0e37efa 
> 
> Diff: https://reviews.apache.org/r/35636/diff/
> 
> 
> Testing
> ---
> 
> Enabled pre-written specific unit tests.
> 
> Ran existing tests; no new failures.
> 
> 
> Thanks,
> 
> Daniel Barclay
> 
>



Re: Review Request 34004: DRILL-1942: Improved memory allocator

2015-06-23 Thread Sudheesh Katkam


> On June 18, 2015, 7:09 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/RootFragmentManager.java,
> >  line 74
> > 
> >
> > cancel() and isCancelled() need to be synchronized methods
> 
> Chris Westin wrote:
> Not necessary, because the variable is volatile. cancel() is already 
> synchronized internally, but I changed it to be function-level to make it 
> more obvious.

I guess you want to use it as a way to check if cancel() was called; in that 
case, what you are doing is sufficient.


> On June 18, 2015, 7:09 p.m., Sudheesh Katkam wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/NonRootFragmentManager.java,
> >  line 112
> > 
> >
> > synchronized?

I am assuming same as below?


- Sudheesh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34004/#review87844
---


On June 17, 2015, 1:07 a.m., Chris Westin wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34004/
> ---
> 
> (Updated June 17, 2015, 1:07 a.m.)
> 
> 
> Review request for drill, Jacques Nadeau and Jason Altekruse.
> 
> 
> Bugs: DRILL-1942
> https://issues.apache.org/jira/browse/DRILL-1942
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> Rewritten direct memory allocator. Simplified interface, and use, along with 
> a means to support additional allocation policies in the future. There are 
> features in the allocator and in DrillBuf that make finding leaks easier, as 
> well as better enforcement of limits. New features include transfer of 
> buffers, and better slicing support.
> 
> This is a preliminary patch to get the review started because it touches a 
> lot of files (readers and record batches were made AutoCloseable in order to 
> cover cleanup). Subsequent reviews can use the differential view to just see 
> additional changes. The new allocator is in BaseAllocator.java (along with 
> derived classes RootAllocator and ChildAllocator); DrillBuf also has 
> significant changes. Most other changes in other files are just to use newer 
> interfaces, or to change cleanup() to close(), or to close subordinate 
> objects that are newly (Auto)Closeable. There are still a couple of things 
> to do:
> * Some TODO(cwestin)s to clean up tracing and debugging code, as well as 
> adding javadoc
> * Using the AllocatorOwner interface to replace the reallocation mechanism 
> for FragmentContext and OperatorContext so that the allocator doesn't know 
> anything about those objects.
> 
> 
> Diffs
> -
> 
>   common/src/main/java/org/apache/drill/common/AutoCloseablePointer.java 
> PRE-CREATION 
>   common/src/main/java/org/apache/drill/common/DrillAutoCloseables.java 
> PRE-CREATION 
>   common/src/main/java/org/apache/drill/common/DrillCloseables.java 
> PRE-CREATION 
>   common/src/main/java/org/apache/drill/common/HistoricalLog.java 
> PRE-CREATION 
>   common/src/main/java/org/apache/drill/common/StackTrace.java 54068ec 
>   common/src/main/java/org/apache/drill/common/config/DrillConfig.java 
> 522303f 
>   common/src/main/java/org/apache/drill/common/config/NestedConfig.java 
> 3fd885f 
>   
> contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
>  9458db2 
>   
> contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveRecordReader.java
>  3c8b9ba 
>   
> contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
>  182f5a4 
>   exec/java-exec/src/main/codegen/templates/AbstractFieldWriter.java 1b5dad1 
>   exec/java-exec/src/main/codegen/templates/BaseWriter.java ada410d 
>   exec/java-exec/src/main/codegen/templates/ComplexWriters.java 980f9ac 
>   exec/java-exec/src/main/codegen/templates/FixedValueVectors.java 0dffa0b 
>   exec/java-exec/src/main/codegen/templates/JsonOutputRecordWriter.java 
> ea643f0 
>   exec/java-exec/src/main/codegen/templates/ListWriters.java ab78603 
>   exec/java-exec/src/main/codegen/templates/MapWriters.java 06a6813 
>   exec/java-exec/src/main/codegen/templates/NullableValueVectors.java ce6a3a7 
>   exec/java-exec/src/main/codegen/templates/ParquetOutputRecordWriter.java 
> 35777b0 
>   exec/java-exec/src/main/codegen/templates/RepeatedValueVectors.java 37b8fac 
>   exec/java-exec/src/main/codegen/templates/StringOutputRecordWriter.java 
> f704cca 
>   exec/java-exec/src/main/codegen/templates/VariableLengthVectors.java 
> 529f21b 
>   exec/java-exec/src/main/java/io/netty/buffer/DrillBuf.java 3ec6b3e 
>   exec/java-exec/src/main/java/io/netty/buffer

Re: [DISCUSS] Allowing the option to use github pull requests in place of reviewboard

2015-06-23 Thread Parth Chandra
+1 on trying this. RB has been pretty painful to us.



On Mon, Jun 22, 2015 at 9:45 PM, Matthew Burgess 
wrote:

> Is Travis   a viable option for the GitHub route?
> I
> use it for my own projects to build pull requests (with additional code
> quality targets like CheckStyle, PMD, etc.). Perhaps that would take some
> of
> the burden off the reviewers and let them focus on the proposed
> implementations, rather than some of the more tedious aspects of each
> review.
>
> From:  Jacques Nadeau 
> Reply-To:  
> Date:  Monday, June 22, 2015 at 10:22 PM
> To:  "dev@drill.apache.org" 
> Subject:  Re: [DISCUSS] Allowing the option to use github pull requests in
> place of reviewboard
>
> I'm up for this if we deprecate the old way.  Having two different
> processes seems like overkill.  In general, I find the review interface of
> GitHub less expressive/clear but everything else is way better.
>
> On Mon, Jun 22, 2015 at 6:59 PM, Steven Phillips 
> wrote:
>
> >  +1
> >
> >  I am in favor of giving this a try.
> >
> >  If I remember correctly, the reason we abandoned pull requests
> originally
> >  was because we couldn't close the pull requests through Github. A
> solution
> >  could be for whoever pushes the commit to the apache git repo to add the
> >  Line "Closes ". Github would then automatically close
> the
> >  pull request.
> >
> >  On Mon, Jun 22, 2015 at 1:02 PM, Jason Altekruse <
> altekruseja...@gmail.com
> >>  >
> >  wrote:
> >
> >>  > Hello Drill developers,
> >>  >
> >>  > I am writing this message today to propose allowing the use of github
> >  pull
> >>  > requests to perform reviews in place of the apache reviewboard
> instance.
> >>  >
> >>  > Reviewboard has caused a number of headaches in the past few months,
> and
> >  I
> >>  > think its time to evaluate the benefits of the apache infrastructure
> >>  > relative to the actual cost of using it in practice.
> >>  >
> >>  > For clarity of the discussion, we cannot use the complete github
> >  workflow.
> >>  > Comitters will still need to use patch files, or check out the branch
> >  used
> >>  > in the review request and push to apache master manually. I am not
> >>  > advocating for using a merging strategy with git, just for using the
> >  github
> >>  > web UI for reviews. I expect anyone generating a chain of commits as
> >>  > described below to use the rebasing workflow we do today.
> Additionally
> >  devs
> >>  > should only be breaking up work to make it easier to review, we will
> not
> >  be
> >>  > reviewing branches that contain a bunch of useless WIP commits.
> >>  >
> >>  > A few examples of problems I have experienced with reviewboard
> include:
> >>  > corruption of patches when they are downloaded, the web interface
> showing
> >>  > inconsistent content from the raw diff, and random rejection of
> patches
> >>  > that are based directly on the head of apache master.
> >>  >
> >>  > These are all serious blockers for getting code reviewed and
> integrated
> >>  > into the master branch in a timely manner.
> >>  >
> >>  > In addition to serious bugs in reviewboard, there are a number of
> >>  > difficulties with the combination of our typical dev workflow and how
> >>  > reviewboard works with patches. As we are still adding features to
> Drill,
> >>  > we often have several weeks of work to submit in response to a JIRA
> or
> >>  > series of related JIRAs. Sometimes this work can be broken up into
> >>  > independent reviewable units, and other times it cannot. When a
> series of
> >>  > changes requires a mixture of refactoring and additions, the process
> is
> >>  > currently quite painful. Either reviewers need to look through a giant
> >  messy
> >>  > diff, or the submitters need to do a lot of extra work. This
> involves not
> >>  > only organizing their work into a reviewable series of commits, but
> also
> >>  > generating redundant squashed versions of the intermediate work to
> make
> >>  > reviewboard happy.
> >>  >
> >>  > For a relatively simple 3 part change, this involves creating 3
> >  reviewboard
> >>  > pages. The first will contain the first commit by itself. The second
> will
> >>  > have the first commits patch as a parent patch with the next change
> in
> >  the
> >>  > series uploaded as the core change to review. For the third change, a
> >>  > squashed version of the first two commits must be generated to serve
> as a
> >>  > parent patch and then the third changeset uploaded as the reviewable
> >>  > change. Frequently a change to the first commit requires
> regenerating all
> >>  > of these patches and uploading them to the individual review pages.
> >>  >
> >>  > This gets even worse with larger chains of commits.
> >>  >
> >>  > It would be great if all of our changes could be small units of
> work, but
> >>  > very frequently we want to make sure we are ready to merge a complete
> >>  > feature before starting the review process. We need to have a better
> way
> >  to

Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Parth Chandra
I'd be in favor of doing this in the 1.2 release cycle.

On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:

> Few of the storage plugins like HBase and Hive have matured enough to be
> moved out of contrib and into the mainline, probably under "exec/storage".
>
> If people think that it is time to do that, I can take this up.
>
> aditya...
>


Re: [DISCUSS] Allowing the option to use github pull requests in place of reviewboard

2015-06-23 Thread Hanifi Gunes
+1

At the very least GitHub will be UP.

On Tue, Jun 23, 2015 at 2:18 PM, Parth Chandra 
wrote:

> +1 on trying this. RB has been pretty painful to us.
>
>
>
> On Mon, Jun 22, 2015 at 9:45 PM, Matthew Burgess 
> wrote:
>
> > Is Travis   a viable option for the GitHub
> route?
> > I
> > use it for my own projects to build pull requests (with additional code
> > quality targets like CheckStyle, PMD, etc.). Perhaps that would take some
> > of
> > the burden off the reviewers and let them focus on the proposed
> > implementations, rather than some of the more tedious aspects of each
> > review.
> >
> > From:  Jacques Nadeau 
> > Reply-To:  
> > Date:  Monday, June 22, 2015 at 10:22 PM
> > To:  "dev@drill.apache.org" 
> > Subject:  Re: [DISCUSS] Allowing the option to use github pull requests
> in
> > place of reviewboard
> >
> > I'm up for this if we deprecate the old way.  Having two different
> > processes seems like overkill.  In general, I find the review interface
> of
> > GitHub less expressive/clear but everything else is way better.
> >
> > On Mon, Jun 22, 2015 at 6:59 PM, Steven Phillips  >
> > wrote:
> >
> > >  +1
> > >
> > >  I am in favor of giving this a try.
> > >
> > >  If I remember correctly, the reason we abandoned pull requests
> > originally
> > >  was because we couldn't close the pull requests through Github. A
> > solution
> > >  could be for whoever pushes the commit to the apache git repo to add
> the
> > >  Line "Closes ". Github would then automatically close
> > the
> > >  pull request.
> > >
> > >  On Mon, Jun 22, 2015 at 1:02 PM, Jason Altekruse <
> > altekruseja...@gmail.com
> > >>  >
> > >  wrote:
> > >
> > >>  > Hello Drill developers,
> > >>  >
> > >>  > I am writing this message today to propose allowing the use of
> github
> > >  pull
> > >>  > requests to perform reviews in place of the apache reviewboard
> > instance.
> > >>  >
> > >>  > Reviewboard has caused a number of headaches in the past few
> months,
> > and
> > >  I
> > >>  > think its time to evaluate the benefits of the apache
> infrastructure
> > >>  > relative to the actual cost of using it in practice.
> > >>  >
> > >>  > For clarity of the discussion, we cannot use the complete github
> > >  workflow.
> > >>  > Comitters will still need to use patch files, or check out the
> branch
> > >  used
> > >>  > in the review request and push to apache master manually. I am not
> > >>  > advocating for using a merging strategy with git, just for using
> the
> > >  github
> > >>  > web UI for reviews. I expect anyone generating a chain of commits
> as
> > >>  > described below to use the rebasing workflow we do today.
> > Additionally
> > >  devs
> > >>  > should only be breaking up work to make it easier to review, we
> will
> > not
> > >  be
> > >>  > reviewing branches that contain a bunch of useless WIP commits.
> > >>  >
> > >>  > A few examples of problems I have experienced with reviewboard
> > include:
> > >>  > corruption of patches when they are downloaded, the web interface
> > showing
> > >>  > inconsistent content from the raw diff, and random rejection of
> > patches
> > >>  > that are based directly on the head of apache master.
> > >>  >
> > >>  > These are all serious blockers for getting code reviewed and
> > integrated
> > >>  > into the master branch in a timely manner.
> > >>  >
> > >>  > In addition to serious bugs in reviewboard, there are a number of
> > >>  > difficulties with the combination of our typical dev workflow and
> how
> > >>  > reviewboard works with patches. As we are still adding features to
> > Drill,
> > >>  > we often have several weeks of work to submit in response to a JIRA
> > or
> > >>  > series of related JIRAs. Sometimes this work can be broken up into
> > >>  > independent reviewable units, and other times it cannot. When a
> > series of
> > >>  > changes requires a mixture of refactoring and additions, the
> process
> > is
> > >>  > currently quite painful. Either reviewers need to look through a
> giant
> > >  messy
> > >>  > diff, or the submitters need to do a lot of extra work. This
> > involves not
> > >>  > only organizing their work into a reviewable series of commits, but
> > also
> > >>  > generating redundant squashed versions of the intermediate work to
> > make
> > >>  > reviewboard happy.
> > >>  >
> > >>  > For a relatively simple 3 part change, this involves creating 3
> > >  reviewboard
> > >>  > pages. The first will contain the first commit by itself. The
> second
> > will
> > >>  > have the first commits patch as a parent patch with the next change
> > in
> > >  the
> > >>  > series uploaded as the core change to review. For the third
> change, a
> > >>  > squashed version of the first two commits must be generated to
> serve
> > as a
> > >>  > parent patch and then the third changeset uploaded as the
> reviewable
> > >>  > change. Frequently a change to the first commit requires
> > regenerating all
> > >>  > of these patches 

Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Jacques Nadeau
Unless we move contrib out of the build (and to a different repo), I'm not
sure what the change really means.  Thoughts on this?

On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra 
wrote:

> I'd be in favor of doing this in the 1.2 release cycle.
>
> On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:
>
> > Few of the storage plugins like HBase and Hive have matured enough to be
> > moved out of contrib and into the mainline, probably under
> "exec/storage".
> >
> > If people think that it is time to do that, I can take this up.
> >
> > aditya...
> >
>


Re: Drill should validate column names within window functions

2015-06-23 Thread Jason Altekruse
You can mistype a column that you are sorting on or joining on and get the
same problem. Vicky has filed a bug for a subquery that looked to be
returning wrong results, but one of the columns she was trying to refer to
was not available within that scope, and we happily filled it in with all
null values. This unfortunately is not a trivial problem to solve, but it
is also not impossible to provide a better user experience here.

There are a couple of considerations from a dev perspective that I think
need to be discussed here. We currently lack a way to record a list of
warnings as a query is executing that can be shown alongside a result set.
Even if we decide that this situation is not best solved with warnings, we
will eventually need such a feature if we ever implement features like
tunable exception suppression. This feature has been brought up on the list
a few times, and would allow for ignoring a small percentage of malformed
data. In these cases I would expect Drill to not only return results based
on the properly formed data, but also report how much was ignored or
discarded.

Once we have a way to report warnings to the user, we also need to define
when we need to issue the warning. The major justification for the current
behavior is that we optimistically assume that we may run into the field in
a later record in the case of formats that can change schema like JSON, or
in static schema formats like parquet where we may be reading files with
different schemas.

A simple "solution" would be to turn off our behavior to materialize nulls
into columns that we have not yet read out of any files. This
materialization currently happens at a project or filter as we are
materializing a expressions to evaluate against the column. Making such a
simple change however would remove the legitimate case where the schema is
reasonably evolving, in particular the case of adding a new column. The old
files will lack it, and batches missing a column may arrive for expression
evaluation first, with other batches that have the column arriving later.
This case should continue to work. There are other discussions happening
right now about how this isn't working in all cases, but these are bugs
that should be fixed soon because they do not function as we expect them to
today.

The more correct solution would be to hold state for each column we
materialize these phantom-nulls into, keeping track of whether we ever see data
coming back from the scan in these columns. For those that never see real
data coming back, we could give a warning message to say that none of the
input files had any data in columns x,y and z. This is gets a little more
complicated with other operators. For example, if we had a union all over a
scan of two separate directories, what would be the expected behavior if
one of the streams always lacked a column but the other one did have it?
Should we issue a warning in this case?
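
A minimal, self-contained sketch of the per-column tracking idea above (the class
and method names are hypothetical, not Drill APIs; it only shows the bookkeeping of
whether a materialized column ever received real data):

{code}
// Hypothetical sketch, not Drill code: tracks whether a column that was materialized
// with nulls ever produced real data from the scan, so a warning can be issued at the
// end of the query for columns that never did.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PhantomColumnTracker {
  // column name -> true once real (scanned) data has been seen for it
  private final Map<String, Boolean> seenRealData = new HashMap<>();

  /** Called when an expression forces a missing column to be materialized as nulls. */
  public void markMaterialized(String column) {
    seenRealData.putIfAbsent(column, false);
  }

  /** Called when the scan actually returns data for the column. */
  public void markRealData(String column) {
    seenRealData.put(column, true);
  }

  /** Columns that were only ever filled with nulls; candidates for a user warning. */
  public List<String> phantomColumns() {
    List<String> result = new ArrayList<>();
    for (Map.Entry<String, Boolean> entry : seenRealData.entrySet()) {
      if (!entry.getValue()) {
        result.add(entry.getKey());
      }
    }
    return result;
  }
}
{code}

An operator could consult phantomColumns() once the query finishes and attach one
warning per name to the result set.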

There is also the consideration of how this interacts with the behavior of
the file readers. In JSON, we consider a non-existent field to be the same
as a field with an explicit null value. If a user were to read a JSON file
that contained only nulls in a few of the columns, they would receive a
warning that a particular column was not found in the file. This could also
be confusing, but I think it is a much smaller problem than the cases where
we look to return correct results but the user made a simple mistake or
misunderstood how Drill works and we didn't try to make them aware of it.


On Tue, Jun 23, 2015 at 12:57 PM, Abhishek Girish  wrote:

> Hey all,
>
> I observed an issue while working with Window Functions. I observed a case
> where wrong results are returned from Drill.
>
> In case of a weak schema such as parquet, Drill does not validate column
> names. That is understandable when a column is only part of the projection list in a
> query. But when it is part of a Window Function, the results displayed are wrong,
> and at times the cause is hard to identify.
>
> Two examples below:
>
> > SELECT PERCENT_RANK() OVER (PARTITION BY s.store_sk, s.ss_customer_sk
> ORDER BY s.store_sk, s.ss_customer_sk) FROM store_sales s LIMIT 2;
> +---------+
> | EXPR$0  |
> +---------+
> | 0.0     |
> | 0.0     |
> +---------+
> 2 rows selected (7.116 seconds)
>
> SELECT CUME_DIST() OVER (PARTITION BY s.ss_store_sk ORDER BY s.ss_stoe_sk,
> s.s_customr_sk) FROM store_sales s LIMIT 2;
> +---------+
> | EXPR$0  |
> +---------+
> | 1.0     |
> | 1.0     |
> +---------+
> 2 rows selected (8.361 seconds)
>
> In both cases above, some columns do not exist.
>
> With normal aggregate functions, it is similar to having a non-existent
> column in the projection list. Drill prints a column of null rows. This could
> still be documented for users to expect "null" columns in results when
> non-existent columns are part of a projection list.
>
> > SELECT s.ss_store_sk, avg (ssdfd), ssdfd FROM store_sales s GROUP BY
> s.ss_store_sk, ssdfd LIMIT 2;
> +--+-++
> |

[jira] [Resolved] (DRILL-3094) TPCH query 15 returns non-deterministic result

2015-06-23 Thread Aman Sinha (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aman Sinha resolved DRILL-3094.
---
Resolution: Won't Fix

Pls see previous comments and try the round() function.  I am closing this for 
now. 

> TPCH query 15 returns non-deterministic result
> --
>
> Key: DRILL-3094
> URL: https://issues.apache.org/jira/browse/DRILL-3094
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.0.0
>Reporter: Abhishek Girish
>Assignee: Aman Sinha
>
> Query 15:
> {code:sql}
> create or replace view revenue0 (supplier_no, total_revenue) as
>   select
> l_suppkey,
> sum(l_extendedprice * (1 - l_discount))
>   from
> lineitem
>   where
> l_shipdate >= date '1993-05-01'
> and l_shipdate < date '1993-05-01' + interval '3' month
>   group by
> l_suppkey;
> select
>   s.s_suppkey,
>   s.s_name,
>   s.s_address,
>   s.s_phone,
>   r.total_revenue
> from
>   supplier s,
>   revenue0 r
> where
>   s.s_suppkey = r.supplier_no
>   and r.total_revenue = (
> select
>   max(total_revenue)
> from
>   revenue0
>   )
> order by
>   s.s_suppkey;
> {code}
> Drill sometimes returns 0 rows and other times 1 row. Postgres always returns 
> 1 row. 
> This is possibly due to the non-deterministic comparison of floating point 
> values. 
> {code}total_revenue (calculated as sum(l_extendedprice * (1 - 
> l_discount))){code} is compared with {code}max(total_revenue){code}
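
A sketch of the round() workaround suggested in the comment above, rounding both
sides of the comparison before testing equality. The JDBC URL and the rounding scale
of 2 are assumptions for illustration, not part of this issue:

{code}
// Sketch only: run TPCH query 15 with round() applied to both sides of the
// total_revenue comparison so the floating point equality becomes deterministic.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Tpch15RoundedComparison {
  public static void main(String[] args) throws Exception {
    String query =
        "select s.s_suppkey, s.s_name, s.s_address, s.s_phone, r.total_revenue "
      + "from supplier s, revenue0 r "
      + "where s.s_suppkey = r.supplier_no "
      + "  and round(r.total_revenue, 2) = "
      + "      (select round(max(total_revenue), 2) from revenue0) "
      + "order by s.s_suppkey";
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(query)) {
      while (rs.next()) {
        System.out.println(rs.getInt(1) + " | " + rs.getString(2) + " | " + rs.getObject(5));
      }
    }
  }
}
{code}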



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3346) Windowing query over View fails

2015-06-23 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3346:
-

 Summary: Windowing query over View fails
 Key: DRILL-3346
 URL: https://issues.apache.org/jira/browse/DRILL-3346
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.1.0
 Environment: 6ebfbb9d0fc0b87b032f5e5d5cb0825f5464426e 
Reporter: Khurram Faraaz
Assignee: Chris Westin


The below window function query over a view fails. The view was defined over 
parquet data. The test was run on a 4-node cluster.

DDL for create view

{code}
0: jdbc:drill:schema=dfs.tmp> create view vwOnParq (col_int, col_bigint, 
col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as 
select col_int, col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, 
col_booln, col_dbl, col_tm from `tblForView/0_0_0.parquet`;
+---+---+
|  ok   |  summary  |
+---+---+
| true  | View 'vwOnParq' created successfully in 'dfs.tmp' schema  |
+---+---+
1 row selected (0.181 seconds)
{code}

Failing query that uses a group by

{code}
0: jdbc:drill:schema=dfs.tmp> select sum(col_int) over(), col_vchar_52 from 
vwOnParq group by col_vchar_52;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
java.lang.IllegalArgumentException: Invalid value for boolean: 
axcb

Fragment 0:0

[Error Id: 29be6826-4c45-4b0a-9922-3d2509316c01 on centos-03.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:85)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:116)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

Without the group by, the query returns results

{code}
0: jdbc:drill:schema=dfs.tmp> select sum(col_int) over(), col_vchar_52 from 
vwOnParq;
+---------+---------------+
| EXPR$0  | col_vchar_52  |
+---------+---------------+
| 91      | AXCB          |
| 91      | DXEF          |
| 91      | GXHI          |
| 91      | HXIJ          |
| 91      | DXEF          |
| 91      | AXCB          |
| 91      | GXHI          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | HXIJ          |
| 91      | GXHI          |
| 91      | HXIJ          |
| 91      | AXCB          |
| 91      | HXIJ          |
| 91      | DXEF          |
| 91      | GXHI          |
| 91      | DXEF          |
| 91      | AXCB          |
| 91      | HXIJ          |
| 91      | DXEF          |
| 91      | GXHI          |
| 91      | GXHI          |
| 91      | GXHI          |
| 91      | GXHI          |
| 91      | HXIJ          |
+---------+---------------+
30 rows selected (0.246 seconds)
{code}

Stack trace from drillbit.log

{code}
[Error Id: 29be6826-4c45-4b0a-9922-3d2509316c01 on centos-03.qa.lab:31010]
org.apache.drill.common.exceptions.

[jira] [Created] (DRILL-3347) Resolve: ResultSet.getObject(...) for VARCHAR returns ...hadoop.io.Text, not String

2015-06-23 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3347:
-

 Summary: Resolve: ResultSet.getObject(...) for VARCHAR returns 
...hadoop.io.Text, not String
 Key: DRILL-3347
 URL: https://issues.apache.org/jira/browse/DRILL-3347
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


For SQL type CHARACTER VARYING, ResultSet methods 
[getObject(int)|http://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getObject-int-]
 and 
[getObject(String)|http://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getObject-String-]
 return an org.apache.hadoop.io.Text object, not a java.lang.String object, as 
specified by JDBC.
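
A minimal client-side sketch of the behavior described here. The JDBC URL and the
query against sys.version are assumptions for illustration, and the defensive
toString() conversion is a client-side workaround rather than a documented mapping:

{code}
// Sketch only: getObject() for a VARCHAR column may hand back
// org.apache.hadoop.io.Text rather than java.lang.String, per this issue.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class VarcharGetObjectCheck {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery("select version from sys.version")) {
      while (rs.next()) {
        Object raw = rs.getObject(1);   // per this issue, may be org.apache.hadoop.io.Text
        String value = (raw == null) ? null : raw.toString();  // works for Text or String
        System.out.println((raw == null ? "null" : raw.getClass().getName()) + " -> " + value);
      }
    }
  }
}
{code}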

If this behavior is intentional (presumably for optimization), we should 
resolve how it fits into the JDBC type-mapping rules.
 

JDBC does allow custom mapping from SQL types to Java types.  However, Drill's 
getObject behaves as above by default, although JDBC says that by default there 
are no custom mappings.

Additionally, the JDBC driver does not implement Connection.getTypeMap() or 
Connection.setTypeMap(...), which means that the client can neither ask the 
driver what class getObject(...) will use for CHARACTER VARYING, nor tell it to 
revert to the standard mapping.

(Also, JDBC's wording about custom mappings seems to allow it only for 
user-defined types, not predefined types (such as CHARACTER VARYING), but it's 
not clear whether that was intended, and much of the JDBC specification is 
ambiguous--and some is clearly wrong.)
 

Choices regarding the return values/type of  getObject(int)/getObject(String) 
include:

1.  Not changing the behavior (but documenting it).  This is non-compliant with 
JDBC in several ways (non-standard mapping, no reporting of mapping via getMap, 
no changing back to standard via setMap).

2.  Implementing getMap enough to support at least reporting of the actual 
mapping.

3.  Implementing setMap as well as getMap, enough* to support switching to the 
standard mapping (and back to the custom mapping).  (*This option does not 
require fully implementing the general custom mapping allowed by JDBC (e.g., 
for arbitrary user-defined SQL and Java types).)

4.  Defaulting to the standard mapping (in addition to adding setMap and 
getMap), to support starting off in the standard default state.  (This choice 
is compliant with the JDBC specification.)


It is not clear whether there are any other cases that need to be resolved.
 

(For most SQL types, Drill's getObject already returns the expected types.  For 
interval types, getObject's returning of org.joda.time.Period objects doesn't 
conflict with any expected type because there is no expected type.  (JDBC (as 
of 4.2) doesn't address interval types.))





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35673: DRILL-3326: Query with unsupported windows function containing "AS" blocks correct error message

2015-06-23 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35673/#review89084
---



exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
(line 203)


It wasn't clear why you were using non-existent columns 'age' and 
'contribution' in this test that is querying the nation table.  I thought 
DRILL-3326 is about unsupported window functions that are getting masked by the 
aliasing.


- Aman Sinha


On June 22, 2015, 5:10 p.m., Sean Hsuan-Yi Chu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35673/
> ---
> 
> (Updated June 22, 2015, 5:10 p.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Bugs: DRILL-3326
> https://issues.apache.org/jira/browse/DRILL-3326
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-3326: When inspecting SELECT-LIST, UnsupportedOperatorsVisitor will dig 
> into AS clause
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
>  544a838 
>   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
> 1c8b0db 
> 
> Diff: https://reviews.apache.org/r/35673/diff/
> 
> 
> Testing
> ---
> 
> Unit, others are on the way
> 
> 
> Thanks,
> 
> Sean Hsuan-Yi Chu
> 
>



Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Aditya
I have always interpreted "contrib" as experimental code which is
provided "as is"[1], not fully validated by the main project and which possibly
has not reached the same maturity as the rest of the project.

It will help us communicate the maturity of a few plugins like hbase, hive
and mongo, while accepting more user contributions, say cassandra or
couchbase, into Drill's code base.

[1]
http://stackoverflow.com/questions/7328852/what-are-apache-contrib-modules

On Tue, Jun 23, 2015 at 2:33 PM, Jacques Nadeau  wrote:

> Unless we move contrib out of the build (and to a different repo), I'm not
> sure what the change really means.  Thoughts on this?
>
> On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra 
> wrote:
>
> > I'd be in favor of doing this in the 1.2 release cycle.
> >
> > On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:
> >
> > > Few of the storage plugins like HBase and Hive have matured enough to
> be
> > > moved out of contrib and into the mainline, probably under
> > "exec/storage".
> > >
> > > If people think that it is time to do that, I can take this up.
> > >
> > > aditya...
> > >
> >
>


Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Jason Altekruse
In light of Aditya's explanation I think this is a good way to clarify the
relationship between Drill, Hive and Hbase. These are core plugins that we
are planning to support completely, just as we do with the HDFS API in the
FileSystemPlugin, which is currently a part of the exec module.

+1

On Tue, Jun 23, 2015 at 4:31 PM, Aditya  wrote:

> I have always interpreted "contrib" as an experimental code which are
> provided "as is"[1], not fully validated by the main project and possibly
> has not reached the same maturity as the rest of the project.
>
> It will help us in communicate the maturity of few plugins like hbase, hive
> and mongo as such while accepting more user contributions say cassandra,
> couchbase into Drill's code base.
>
> [1]
> http://stackoverflow.com/questions/7328852/what-are-apache-contrib-modules
>
> On Tue, Jun 23, 2015 at 2:33 PM, Jacques Nadeau 
> wrote:
>
> > Unless we move contrib out of the build (and to a different repo), I'm
> not
> > sure what the change really means.  Thoughts on this?
> >
> > On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra 
> > wrote:
> >
> > > I'd be in favor of doing this in the 1.2 release cycle.
> > >
> > > On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:
> > >
> > > > Few of the storage plugins like HBase and Hive have matured enough to
> > be
> > > > moved out of contrib and into the mainline, probably under
> > > "exec/storage".
> > > >
> > > > If people think that it is time to do that, I can take this up.
> > > >
> > > > aditya...
> > > >
> > >
> >
>


Re: Review Request 35484: DRILL-2851: set an upper-bound on # of bytes to re-allocate to prevent overflows

2015-06-23 Thread Jason Altekruse

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35484/#review89076
---

Ship it!



exec/java-exec/src/main/codegen/templates/NullableValueVectors.java (line 134)


If the allocation of the bit vector fails, will this cause issues calling 
zeroVector on an unallocated vector? I think there is a risk that we will have 
a null buffer that the old code would defend against; the zeroVector() method 
calls a method on the data buffer directly. We need to add a return false after 
the clear in the finally block.
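
A self-contained sketch of the allocation pattern being suggested, with hypothetical
names (this is not the generated vector code): on any failure, clear the partially
allocated state and return false before anything touches the data buffer.

{code}
// Sketch only: illustrates "clear, then return false" before any call that assumes
// a successful allocation (the zero() call stands in for zeroVector()).
public class SafeAllocationSketch {
  interface Part {
    boolean allocateNewSafe();
    void clear();
    void zero();            // requires a successful allocation, like zeroVector()
  }

  private final Part bits;
  private final Part values;

  SafeAllocationSketch(Part bits, Part values) {
    this.bits = bits;
    this.values = values;
  }

  public boolean allocateNewSafe() {
    try {
      if (!bits.allocateNewSafe() || !values.allocateNewSafe()) {
        clear();
        return false;        // the "return false after the clear" called out above
      }
    } catch (OutOfMemoryError e) {
      clear();
      return false;
    }
    bits.zero();             // only reached once both allocations succeeded
    return true;
  }

  private void clear() {
    bits.clear();
    values.clear();
  }
}
{code}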



exec/java-exec/src/main/codegen/templates/NullableValueVectors.java (line 216)


Same as above


- Jason Altekruse


On June 23, 2015, 10:59 p.m., Hanifi Gunes wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35484/
> ---
> 
> (Updated June 23, 2015, 10:59 p.m.)
> 
> 
> Review request for drill, Jason Altekruse and Mehant Baid.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-2851: set an upper-bound on # of bytes to re-allocate to prevent 
> overflows
> Vectors
> - set an upper bound on # of bytes to allocate
> - 
> TestValueVector.java  
> - Add unit tests
> 
> 
> Diffs
> -
> 
>   exec/java-exec/src/main/codegen/includes/vv_imports.ftl 
> 92c80072cfcde4deb0bbb34bc3b688707541f2f6 
>   exec/java-exec/src/main/codegen/templates/FixedValueVectors.java 
> 7103a17108693d47839212c418d11d13fbb8f6f4 
>   exec/java-exec/src/main/codegen/templates/NullableValueVectors.java 
> 7f835424b68a9d68b0a6c60749677a83ac486590 
>   exec/java-exec/src/main/codegen/templates/VariableLengthVectors.java 
> 50ae770f24aff1e8eed1dfa800878ce92308c644 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/exception/OversizedAllocationException.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenRecordBatch.java
>  999140498ab303d3f5ecf20695755bdfe943cb46 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/FlattenTemplate.java
>  de67b62248a68c1f483808c4b575e0afa7854aca 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/flatten/Flattener.java
>  92cf79d37da89864ab7702830fe078479773a73e 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/vector/BaseDataValueVector.java
>  0e38f3cad3792e936ff918ae970f4b40e478d516 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/vector/BaseValueVector.java
>  8129668b6ff5dc674e30dca6947bd93c87fb4d3d 
>   exec/java-exec/src/main/java/org/apache/drill/exec/vector/BitVector.java 
> 10bdf0752632c7577b9a6eb445c7101ec1a24730 
>   
> exec/java-exec/src/test/java/org/apache/drill/exec/record/vector/TestValueVector.java
>  037c8c6d3da94acf5c2ca300ce617338cacb0fb0 
> 
> Diff: https://reviews.apache.org/r/35484/diff/
> 
> 
> Testing
> ---
> 
> all
> 
> 
> Thanks,
> 
> Hanifi Gunes
> 
>



Review Request 35808: DRILL-2862: Convert_to/Convert_From throw assertion when an incorrect encoding type is specified

2015-06-23 Thread Parth Chandra

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35808/
---

Review request for drill, Jinfeng Ni and Sudheesh Katkam.


Repository: drill-git


Description
---

DRILL-2862: Convert_to/Convert_From throw assertion when an incorrect encoding 
type is specified or if the encoding type is not a string literal.

Instead of an assertion when user input is wrong, we now throw an exception 
with the appropriate error message. 
For the case where the user types in a type name incorrectly, the error message 
also provides a helpful suggestion. The suggested name is selected from the 
list of available functions.

For example: 

  select convert_from(foo, 'UTF') from dfs.`/table_foo`

will print the following error:

  Error: UNSUPPORTED_OPERATION ERROR: CONVERT_FROM does not support conversion 
from type 'UTF'.
  Did you mean UTF8?
  [Error Id: 87ed2941-f9c2-4c35-8ff2-a3f21eae1104 on localhost:31010] 
(state=,code=0)
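
The "did you mean" suggestion could work roughly as in the sketch below. This is only
an illustration using plain edit distance with a hypothetical candidate list; it is
not the ApproximateStringMatcher from this patch.

{code}
// Sketch only: pick the registered encoding name closest to the user's misspelling.
import java.util.Arrays;
import java.util.List;

public class NearestNameSketch {
  static int editDistance(String a, String b) {
    int[][] d = new int[a.length() + 1][b.length() + 1];
    for (int i = 0; i <= a.length(); i++) d[i][0] = i;
    for (int j = 0; j <= b.length(); j++) d[0][j] = j;
    for (int i = 1; i <= a.length(); i++) {
      for (int j = 1; j <= b.length(); j++) {
        int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
        d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1), d[i - 1][j - 1] + cost);
      }
    }
    return d[a.length()][b.length()];
  }

  /** Returns the candidate with the smallest edit distance to the (misspelled) input. */
  static String nearest(String input, List<String> candidates) {
    String best = null;
    int bestDist = Integer.MAX_VALUE;
    for (String c : candidates) {
      int dist = editDistance(input.toUpperCase(), c.toUpperCase());
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    // Candidate list is hypothetical; in Drill it would come from the registered functions.
    System.out.println(nearest("UTF", Arrays.asList("UTF8", "UTF16", "JSON", "INT_BE")));  // UTF8
  }
}
{code}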


Diffs
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PreProcessLogicalRel.java
 0f8e45afd88fd2c4f82df87a8216f4505bfa03fe 
  
exec/java-exec/src/main/java/org/apache/drill/exec/util/ApproximateStringMatcher.java
 PRE-CREATION 

Diff: https://reviews.apache.org/r/35808/diff/


Testing
---

All regression tests


Thanks,

Parth Chandra



[jira] [Resolved] (DRILL-1616) Add support for count() on maps and arrays

2015-06-23 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1616.

Resolution: Fixed

> Add support for count() on maps and arrays
> --
>
> Key: DRILL-1616
> URL: https://issues.apache.org/jira/browse/DRILL-1616
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
>Priority: Minor
> Fix For: 1.1.0
>
>
> Count(field) throws error on fields which are objects or arrays and these are 
> not clean. They do not indicate an error in usage. Also, count on 
> objects/arrays should be supported. 
> > select * from `abc.json`;
> ++++++
> |  field_1   |  field_2   |  field_3   |  field_4   |  field_5   |
> ++++++
> | ["1"]  | null   | {"inner_3":[]} | {"inner_1":[],"inner_3":{}} | [] 
> |
> | ["5"]  | 2  | {"inner_1":"2","inner_3":[]} | 
> {"inner_1":["1","2","3"],"inner_2":"3","inner_3":{"inner_object_field_1":"2"}}
>  | [{"inner_list":["1","null","6"],"inner_ |
> | ["5","10","15"] | A wild string appears! | 
> {"inner_1":"5","inner_2":"3","inner_3":[{},{"inner_object_field_1":"10"}]} | 
> {"inner_1":["4","5","6"],"inner_2":"3","inner_3":{}} | [{ |
> ++++++
> 3 rows selected (0.081 seconds)
> > select count(field_1) from `abc.json`;
> Query failed: Failure while running fragment., Schema is currently null.  You 
> must call buildSchema(SelectionVectorMode) before this container can return a 
> schema. [ b6f021f9-213e-475e-83f4-a6facf6fd76d on abhi7.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> Error is seen on fields 1,3,4,5. 
> The issue is not seen when array index is specified. 
> > select count(field_1[0]) from `abc.json`;
> ++
> |   EXPR$0   |
> ++
> | 3  |
> ++
> 1 row selected (0.152 seconds)
> Or when the element in the object is specified:
> > select count(t.field_3.inner_3) from `textmode.json` as t;
> ++
> |   EXPR$0   |
> ++
> | 3  |
> ++
> 1 row selected (0.155 seconds)
> LOG:
> 2014-10-30 13:28:20,286 [a90cc246-e60b-452b-ba96-7f79709f5ffa:frag:0:0] ERROR 
> o.a.d.e.w.f.AbstractStatusReporter - Error 
> bc438332-0828-4a86-8063-9dc8c5a703d9: Failure while running fragment.
> java.lang.NullPointerException: Schema is currently null.  You must call 
> buildSchema(SelectionVectorMode) before this container can return a schema.
> at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:273)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.getSchema(AbstractRecordBatch.java:116)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.getSchema(IteratorValidatorBatchIterator.java:75)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema(ScreenCreator.java:100)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:103)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:249)
>  
> [drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1616) Add support for count() on maps and arrays

2015-06-23 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1616.

Resolution: Duplicate

> Add support for count() on maps and arrays
> --
>
> Key: DRILL-1616
> URL: https://issues.apache.org/jira/browse/DRILL-1616
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Reporter: Abhishek Girish
>Assignee: Jason Altekruse
>Priority: Minor
>
> Count(field) throws error on fields which are objects or arrays and these are 
> not clean. They do not indicate an error in usage. Also, count on 
> objects/arrays should be supported. 
> > select * from `abc.json`;
> ++++++
> |  field_1   |  field_2   |  field_3   |  field_4   |  field_5   |
> ++++++
> | ["1"]  | null   | {"inner_3":[]} | {"inner_1":[],"inner_3":{}} | [] 
> |
> | ["5"]  | 2  | {"inner_1":"2","inner_3":[]} | 
> {"inner_1":["1","2","3"],"inner_2":"3","inner_3":{"inner_object_field_1":"2"}}
>  | [{"inner_list":["1","null","6"],"inner_ |
> | ["5","10","15"] | A wild string appears! | 
> {"inner_1":"5","inner_2":"3","inner_3":[{},{"inner_object_field_1":"10"}]} | 
> {"inner_1":["4","5","6"],"inner_2":"3","inner_3":{}} | [{ |
> ++++++
> 3 rows selected (0.081 seconds)
> > select count(field_1) from `abc.json`;
> Query failed: Failure while running fragment., Schema is currently null.  You 
> must call buildSchema(SelectionVectorMode) before this container can return a 
> schema. [ b6f021f9-213e-475e-83f4-a6facf6fd76d on abhi7.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> Error is seen on fields 1,3,4,5. 
> The issue is not seen when array index is specified. 
> > select count(field_1[0]) from `abc.json`;
> ++
> |   EXPR$0   |
> ++
> | 3  |
> ++
> 1 row selected (0.152 seconds)
> Or when the element in the object is specified:
> > select count(t.field_3.inner_3) from `textmode.json` as t;
> ++
> |   EXPR$0   |
> ++
> | 3  |
> ++
> 1 row selected (0.155 seconds)
> LOG:
> 2014-10-30 13:28:20,286 [a90cc246-e60b-452b-ba96-7f79709f5ffa:frag:0:0] ERROR 
> o.a.d.e.w.f.AbstractStatusReporter - Error 
> bc438332-0828-4a86-8063-9dc8c5a703d9: Failure while running fragment.
> java.lang.NullPointerException: Schema is currently null.  You must call 
> buildSchema(SelectionVectorMode) before this container can return a schema.
> at 
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) 
> ~[guava-14.0.1.jar:na]
> at 
> org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:273)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.getSchema(AbstractRecordBatch.java:116)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.getSchema(IteratorValidatorBatchIterator.java:75)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema(ScreenCreator.java:100)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:103)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:249)
>  
> [drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_65]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_65]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3348) NPE when two different window functions are used in projection list and order by clauses

2015-06-23 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3348:
---

 Summary: NPE when two different window functions are used in 
projection list and order by clauses
 Key: DRILL-3348
 URL: https://issues.apache.org/jira/browse/DRILL-3348
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


{code:sql}
select 
a1, 
rank() over(partition by b1 order by a1) 
from 
t1 
order by 
row_number() over(partition by b1 order by a1);
{code}

{code}
0: jdbc:drill:schema=dfs> select a1, rank() over(partition by b1 order by a1) 
from t1 order by row_number() over(partition by b1 order by a1);
Error: SYSTEM ERROR: org.apache.drill.exec.work.foreman.ForemanException: 
Unexpected exception during fragment initialization: null
[Error Id: ba3e0fda-cc78-4650-a49b-51e4fd7d625d on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

drillbit.log
{code}
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
during fragment initialization: null


[Error Id: ba3e0fda-cc78-4650-a49b-51e4fd7d625d on atsqa4-133.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:523)
 ~[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:738)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:840)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:782)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
[drill-common-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:784)
 [drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:893) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:253) 
[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_71]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_71]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: null
... 4 common frames omitted
Caused by: java.lang.NullPointerException: null
at org.apache.calcite.rex.RexBuilder.makeCast(RexBuilder.java:465) 
~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at org.apache.calcite.rex.RexBuilder.ensureType(RexBuilder.java:955) 
~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertOver(SqlToRelConverter.java:1763)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.access$1000(SqlToRelConverter.java:180)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter$Blackboard.convertExpression(SqlToRelConverter.java:3938)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectList(SqlToRelConverter.java:3327)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelectImpl(SqlToRelConverter.java:609)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertSelect(SqlToRelConverter.java:564)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQueryRecursive(SqlToRelConverter.java:2741)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.calcite.sql2rel.SqlToRelConverter.convertQuery(SqlToRelConverter.java:522)
 ~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at org.apache.calcite.prepare.PlannerImpl.convert(PlannerImpl.java:198) 
~[calcite-core-1.1.0-drill-r8.jar:1.1.0-drill-r8]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRel(DefaultSqlHandler.java:448)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebuffed.jar:1.1.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:191)
 ~[drill-java-exec-1.1.0-SNAPSHOT-rebu

Re: Review Request 35673: DRILL-3326: Query with unsupported windows function containing "AS" blocks correct error message

2015-06-23 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35673/
---

(Updated June 24, 2015, 12:12 a.m.)


Review request for drill and Aman Sinha.


Changes
---

addressed review comment


Bugs: DRILL-3326
https://issues.apache.org/jira/browse/DRILL-3326


Repository: drill-git


Description
---

DRILL-3326: When inspecting SELECT-LIST, UnsupportedOperatorsVisitor will dig 
into AS clause


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 544a838 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
1c8b0db 

Diff: https://reviews.apache.org/r/35673/diff/


Testing
---

Unit, others are on the way


Thanks,

Sean Hsuan-Yi Chu



Re: Review Request 35673: DRILL-3326: Query with unsupported windows function containing "AS" blocks correct error message

2015-06-23 Thread Sean Hsuan-Yi Chu


> On June 23, 2015, 11:27 p.m., Aman Sinha wrote:
> > exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java,
> >  line 203
> > 
> >
> > It wasn't clear why you were using non-existent columns 'age' and 
> > 'contribution' in this test that is querying the nation table.  I thought 
> > DRILL-3326 is about unsupported window functions that are getting masked by 
> > the aliasing.

That was a copy-paste mistake. A new patch has addressed it.


- Sean Hsuan-Yi


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35673/#review89084
---


On June 24, 2015, 12:12 a.m., Sean Hsuan-Yi Chu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35673/
> ---
> 
> (Updated June 24, 2015, 12:12 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Bugs: DRILL-3326
> https://issues.apache.org/jira/browse/DRILL-3326
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-3326: When inspecting SELECT-LIST, UnsupportedOperatorsVisitor will dig 
> into AS clause
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
>  544a838 
>   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
> 1c8b0db 
> 
> Diff: https://reviews.apache.org/r/35673/diff/
> 
> 
> Testing
> ---
> 
> Unit, others are on the way
> 
> 
> Thanks,
> 
> Sean Hsuan-Yi Chu
> 
>



Re: Review Request 35808: DRILL-2862: Convert_to/Convert_From throw assertion when an incorrect encoding type is specified

2015-06-23 Thread Venki Korukanti

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35808/#review89103
---


How about the case where the function is not found for the specific input col 
type and the encoding is correct? 
(https://issues.apache.org/jira/browse/DRILL-2648)?

- Venki Korukanti


On June 23, 2015, 4:44 p.m., Parth Chandra wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35808/
> ---
> 
> (Updated June 23, 2015, 4:44 p.m.)
> 
> 
> Review request for drill, Jinfeng Ni and Sudheesh Katkam.
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-2862: Convert_to/Convert_From throw assertion when an incorrect 
> encoding type is specified or if the encoding type is not a string literal.
> 
> Instead of an assertion when user input is wrong, we now throw an exception 
> with the appropriate error message. 
> For the case where the user types in a type name incorrectly, the error 
> message also provides a helpful suggestion. The suggested name is selected 
> from the list of available functions.
> 
> For example: 
> 
>   select convert_from(foo, 'UTF') from dfs.`/table_foo`
> 
> will print the following error:
> 
>   Error: UNSUPPORTED_OPERATION ERROR: CONVERT_FROM does not support 
> conversion from type 'UTF'.
>   Did you mean UTF8?
>   [Error Id: 87ed2941-f9c2-4c35-8ff2-a3f21eae1104 on localhost:31010] 
> (state=,code=0)
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/PreProcessLogicalRel.java
>  0f8e45afd88fd2c4f82df87a8216f4505bfa03fe 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/util/ApproximateStringMatcher.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/35808/diff/
> 
> 
> Testing
> ---
> 
> All regression tests
> 
> 
> Thanks,
> 
> Parth Chandra
> 
>



Re: Review Request 35673: DRILL-3326: Query with unsupported windows function containing "AS" blocks correct error message

2015-06-23 Thread Sean Hsuan-Yi Chu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35673/
---

(Updated June 24, 2015, 12:15 a.m.)


Review request for drill and Aman Sinha.


Changes
---

Addressed the comment


Bugs: DRILL-3326
https://issues.apache.org/jira/browse/DRILL-3326


Repository: drill-git


Description
---

DRILL-3326: When inspecting SELECT-LIST, UnsupportedOperatorsVisitor will dig 
into AS clause


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 544a838 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
1c8b0db 

Diff: https://reviews.apache.org/r/35673/diff/


Testing
---

Unit, others are on the way


Thanks,

Sean Hsuan-Yi Chu



Re: Review Request 35739: Patch for DRILL-3333

2015-06-23 Thread Steven Phillips


> On June 22, 2015, 10:25 p.m., Jacques Nadeau wrote:
> > exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java,
> >  line 278
> > 
> >
> > Is creating the metadata converter repeatedly expensive?

It's not expensive, but I will go ahead and reuse it anyway, as it looks 
cleaner.


- Steven


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35739/#review88842
---


On June 22, 2015, 10:22 p.m., Steven Phillips wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35739/
> ---
> 
> (Updated June 22, 2015, 10:22 p.m.)
> 
> 
> Review request for drill.
> 
> 
> Bugs: DRILL-
> https://issues.apache.org/jira/browse/DRILL-
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-: Parquet writer auto-partitioning and partition pruning
> 
> Conflicts:
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/WriterPrel.java
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java
>   exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java
> 
> 
> Diffs
> -
> 
>   exec/java-exec/src/main/codegen/templates/AbstractRecordWriter.java 
> 6b6065f6b6c8469aa548acf194e0621b9f4ffea8 
>   exec/java-exec/src/main/codegen/templates/EventBasedRecordWriter.java 
> 797f3cb8c83a89821ee46ce0b093f81406fa6067 
>   exec/java-exec/src/main/codegen/templates/NewValueFunctions.java 
> PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/RecordWriter.java 
> c6325fd0a5c7d7cb5f3628df1ecf9c01c264ed52 
>   exec/java-exec/src/main/codegen/templates/StringOutputRecordWriter.java 
> f704cca0e4d62ca1435df84d9eb1b07b32ea8b39 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScan.java
>  5c4ee4da9e0542244b0f71a520cea1c3a2d49a66 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/GroupScan.java
>  2d16cd01b94ed8a5463c0e2fb896f019133f7f03 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java
>  d5d64a722ed6d9b5d97158046e6838f07c0d5381 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
>  d9b1354492454dcd2630c72f5dbc1c3badf958c7 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/ParquetPruneScanRule.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java
>  920b2848d8edb62667b880e81f5aee12b459d63a 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/AutoPartitioner.java 
> PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/NewValueFunction.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonRecordWriter.java
>  a43a4a0f21bf11f29b6385e36db4d25003ffa98f 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java
>  cf39518b2a8b4564504a3971d1f89c268aee4b30 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java
>  621f05c4d50ecf83071a5df414be88e7471f0490 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/text/DrillTextRecordWriter.java
>  31b1fbe9e03282161ee125cb7a4b2f53c8a8da63 
> 
> Diff: https://reviews.apache.org/r/35739/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steven Phillips
> 
>



Wrong results - windowing query over view

2015-06-23 Thread Khurram Faraaz
Windowing query over a view returns wrong results when used with and
without a group by clause. Please let me know if this is a planning bug?
Postgres does not support the query where we use a group by.

DDL used for view creation was,

create view vwOnParq (col_int, col_bigint, col_char_2, col_vchar_52,
col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as select col_int,
col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln,
col_dbl, col_tm from `tblForView/0_0_0.parquet`;


The two queries are,


0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq;

+---------+
| EXPR$0  |
+---------+
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
| -19     |
+---------+

30 rows selected (0.26 seconds)


Explain plan for the above query


| 00-00    Screen

00-01  Project(EXPR$0=[$0])

00-02Project($0=[$9])

00-03  Window(window#0=[window(partition {} order by [] range
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])

00-04Project(col_int=[$4], col_bigint=[$7], col_char_2=[$2],
col_vchar_52=[$1], col_tmstmp=[$0], col_dt=[$3], col_booln=[$6],
col_dbl=[$8], col_tm=[$5])

00-05  Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
columns=[`col_int`, `col_bigint`, `col_char_2`, `col_vchar_52`,
`col_tmstmp`, `col_dt`, `col_booln`, `col_dbl`, `col_tm`]]])


0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq
group by col_char_2;

+---------+
| EXPR$0  |
+---------+
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
| AZ      |
+---------+

18 rows selected (0.27 seconds)


Explain plan for the above query that uses group by


| 00-00    Screen

00-01  Project(EXPR$0=[$0])

00-02Project($0=[$2])

00-03  Window(window#0=[window(partition {} order by [] range
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])

00-04HashAgg(group=[{0}], agg#0=[MIN($1)])

00-05  Scan(groupscan=[ParquetGroupScan
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
columns=[`col_char_2`, `col_int`]]])


Thanks,

Khurram


Re: Wrong results - windowing query over view

2015-06-23 Thread Abdel Hakim Deneche
What happens if you run the queries on the original parquet files and not
the views ?

On Tue, Jun 23, 2015 at 5:28 PM, Khurram Faraaz 
wrote:

> Windowing query over a view returns wrong results when used with and
> without a group by clause. Please let me know if this is a planning bug ?
> Postgres does not support the query where we use a group by.
>
> DDL used for view creation was,
>
> create view vwOnParq (col_int, col_bigint, col_char_2, col_vchar_52,
> col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as select col_int,
> col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln,
> col_dbl, col_tm from `tblForView/0_0_0.parquet`;
>
>
> The two queries are,
>
>
> 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq;
>
> *+-+*
>
> *| **EXPR$0 ** |*
>
> *+-+*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *| *-19* |*
>
> *+-+*
>
> 30 rows selected (0.26 seconds)
>
>
> Explain plan for the above query
>
>
> *| *00-00Screen
>
> 00-01  Project(EXPR$0=[$0])
>
> 00-02Project($0=[$9])
>
> 00-03  Window(window#0=[window(partition {} order by [] range
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
>
> 00-04Project(col_int=[$4], col_bigint=[$7], col_char_2=[$2],
> col_vchar_52=[$1], col_tmstmp=[$0], col_dt=[$3], col_booln=[$6],
> col_dbl=[$8], col_tm=[$5])
>
> 00-05  Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> columns=[`col_int`, `col_bigint`, `col_char_2`, `col_vchar_52`,
> `col_tmstmp`, `col_dt`, `col_booln`, `col_dbl`, `col_tm`]]])
>
>
> 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq
> group by col_char_2;
>
> *+-+*
>
> *| **EXPR$0 ** |*
>
> *+-+*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *| *AZ * |*
>
> *+-+*
>
> 18 rows selected (0.27 seconds)
>
>
> Explain plan for the above query that uses group by
>
>
> *| *00-00Screen
>
> 00-01  Project(EXPR$0=[$0])
>
> 00-02Project($0=[$2])
>
> 00-03  Window(window#0=[window(partition {} order by [] range
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
>
> 00-04HashAgg(group=[{0}], agg#0=[MIN($1)])
>
> 00-05  Scan(groupscan=[ParquetGroupScan
> [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> columns=[`col_char_2`, `col_int`]]])
>
>
> Thanks,
>
> Khurram
>



-- 

Abdelhakim Deneche

Software Engineer

  


Now Available - Free Hadoop On-Demand Training



[jira] [Created] (DRILL-3349) Resolve whether BoundCheckingAccessor still needs to exist

2015-06-23 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3349:
-

 Summary: Resolve whether BoundCheckingAccessor still needs to exist
 Key: DRILL-3349
 URL: https://issues.apache.org/jira/browse/DRILL-3349
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


BoundCheckingAccessor's getObject(int rowOffset) method suppresses index errors 
that would normally occur for bad values of rowOffset.

It's not clear why it does that (its other get methods don't) or whether it 
still needs to exist.

If it no longer needs to exist, it should be excised.
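
A hypothetical illustration (not Drill's BoundCheckingAccessor) of the two behaviors in
question: an accessor that quietly suppresses a bad rowOffset versus one that surfaces
it, as the other get methods do.

{code}
// Sketch only, with made-up names: contrasts suppressing versus propagating a bad offset.
public class RowOffsetCheckSketch {
  private final Object[] rows;

  RowOffsetCheckSketch(Object[] rows) {
    this.rows = rows;
  }

  /** Suppressing variant: out-of-range offsets quietly become null. */
  Object getObjectLenient(int rowOffset) {
    return (rowOffset < 0 || rowOffset >= rows.length) ? null : rows[rowOffset];
  }

  /** Checking variant: out-of-range offsets fail fast, like the other get methods. */
  Object getObjectStrict(int rowOffset) {
    if (rowOffset < 0 || rowOffset >= rows.length) {
      throw new IndexOutOfBoundsException(
          "rowOffset " + rowOffset + " outside 0.." + (rows.length - 1));
    }
    return rows[rowOffset];
  }
}
{code}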





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3350) query of csv with 65k columns causes Drill Explorer exception

2015-06-23 Thread Nick Amato (JIRA)
Nick Amato created DRILL-3350:
-

 Summary: query of csv with 65k columns causes Drill Explorer 
exception 
 Key: DRILL-3350
 URL: https://issues.apache.org/jira/browse/DRILL-3350
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Nick Amato
Assignee: Chris Westin
Priority: Minor


The error appears in a dialog window: "Unexpected end of content while loading 
JArray. Path '[9506]', line 1, position 65536."

I will attach the file.  Steps to reproduce:
- Open Drill Explorer -- I used Drill 1.0.0 (both on ODBC side and cluster)
running on a 5.0.0 build from about two weeks ago.
- Browse under dfs.root to a csv file with 65k columns.  I was able to 
reproduce it
with these size files:
- 5000 columns OK
- 20k, 30k, 50k, 65k failed
- Selecting the file causes the following exception:

See the end of this message for details on invoking 
just-in-time (JIT) debugging instead of this dialog box.

** Exception Text **
Newtonsoft.Json.JsonReaderException: Unexpected end of content while loading
JArray. Path '[9506]', line 1, position 65536.
   at Newtonsoft.Json.Linq.JContainer.ReadTokenFrom(JsonReader reader)
   at Newtonsoft.Json.Linq.JArray.Load(JsonReader reader)
   at Newtonsoft.Json.Linq.JArray.Parse(String json)
   at DrillExplorer.DRExploreTablesDialog.ParseCSVJsonToColumns(String
jsonString)
   at DrillExplorer.DRExploreTablesDialog.RetrieveCSVMetadata(String
schemaName, String path)
   at DrillExplorer.DRExploreTablesDialog.BrowseDfsNode(DRTreeNode dfsNode)
   at DrillExplorer.DRExploreTablesDialog.browseTreeView_AfterSelect(Object
sender, TreeViewEventArgs e)
   at System.Windows.Forms.TreeView.TvnSelected(NMTREEVIEW* nmtv)
   at System.Windows.Forms.TreeView.WmNotify(Message& m)
   at System.Windows.Forms.TreeView.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr
wparam, IntPtr lparam)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 35739: Patch for DRILL-3333

2015-06-23 Thread Steven Phillips

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35739/
---

(Updated June 24, 2015, 12:39 a.m.)


Review request for drill.


Bugs: DRILL-
https://issues.apache.org/jira/browse/DRILL-


Repository: drill-git


Description (updated)
---

DRILL-: Parquet writer auto-partitioning and partition pruning


Diffs (updated)
-

  exec/java-exec/src/main/codegen/templates/AbstractRecordWriter.java 
6b6065f6b6c8469aa548acf194e0621b9f4ffea8 
  exec/java-exec/src/main/codegen/templates/EventBasedRecordWriter.java 
797f3cb8c83a89821ee46ce0b093f81406fa6067 
  exec/java-exec/src/main/codegen/templates/NewValueFunctions.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/RecordWriter.java 
c6325fd0a5c7d7cb5f3628df1ecf9c01c264ed52 
  exec/java-exec/src/main/codegen/templates/StringOutputRecordWriter.java 
f704cca0e4d62ca1435df84d9eb1b07b32ea8b39 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScan.java
 5c4ee4da9e0542244b0f71a520cea1c3a2d49a66 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/GroupScan.java 
2d16cd01b94ed8a5463c0e2fb896f019133f7f03 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java
 d5d64a722ed6d9b5d97158046e6838f07c0d5381 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
 d9b1354492454dcd2630c72f5dbc1c3badf958c7 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/PruneScanRule.java
 2544d34bd4e531e48df66c8efba3cf6b6a6e6142 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java
 1e63748fe459760348501ad1fbe29553bee7264c 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/NewValueFunction.java 
PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonRecordWriter.java
 a43a4a0f21bf11f29b6385e36db4d25003ffa98f 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java
 cf39518b2a8b4564504a3971d1f89c268aee4b30 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java
 621f05c4d50ecf83071a5df414be88e7471f0490 
  
exec/java-exec/src/main/java/org/apache/drill/exec/store/text/DrillTextRecordWriter.java
 31b1fbe9e03282161ee125cb7a4b2f53c8a8da63 
  exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java 
6d03d8176d3686740fd8fbcc158e664b3a3eeed9 

Diff: https://reviews.apache.org/r/35739/diff/


Testing
---


Thanks,

Steven Phillips



Re: Wrong results - windowing query over view

2015-06-23 Thread Khurram Faraaz
The second query (below) that uses group by is not supported by Postgres; I
will file a JIRA to block that query.

SELECT MIN(col_int) OVER() FROM vwOnParq group by col_char_2;

Output from Postgres

postgres=# select min(col_int) over() from all_typs_tbl group by col_char_2;

ERROR:  column "all_typs_tbl.col_int" must appear in the GROUP BY clause or
be used in an aggregate function

LINE 1: select min(col_int) over() from all_typs_tbl group by col_ch...


Querying the original parquet file from which the view was created, we see an
assertion error


0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM
`tblForView/0_0_0.parquet` group by col_char_2;

*Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while
converting MIN(`tblForView/0_0_0.parquet`.`col_int`)*



*[Error Id: e8ed279d-aa8c-4db1-9906-5dd7fdecaac2 on centos-02.qa.lab:31010]
(state=,code=0)*

On Tue, Jun 23, 2015 at 5:31 PM, Abdel Hakim Deneche 
wrote:

> What happens if you run the queries on the original parquet files and not
> the views ?
>
> On Tue, Jun 23, 2015 at 5:28 PM, Khurram Faraaz 
> wrote:
>
> > Windowing query over a view returns wrong results when used with and
> > without a group by clause. Please let me know if this is a planning bug ?
> > Postgres does not support the query where we use a group by.
> >
> > DDL used for view creation was,
> >
> > create view vwOnParq (col_int, col_bigint, col_char_2, col_vchar_52,
> > col_tmstmp, col_dt, col_booln, col_dbl, col_tm) as select col_int,
> > col_bigint, col_char_2, col_vchar_52, col_tmstmp, col_dt, col_booln,
> > col_dbl, col_tm from `tblForView/0_0_0.parquet`;
> >
> >
> > The two queries are,
> >
> >
> > 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq;
> >
> > *+-+*
> >
> > *| **EXPR$0 ** |*
> >
> > *+-+*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *| *-19* |*
> >
> > *+-+*
> >
> > 30 rows selected (0.26 seconds)
> >
> >
> > Explain plan for the above query
> >
> >
> > *| *00-00Screen
> >
> > 00-01  Project(EXPR$0=[$0])
> >
> > 00-02Project($0=[$9])
> >
> > 00-03  Window(window#0=[window(partition {} order by [] range
> > between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
> >
> > 00-04Project(col_int=[$4], col_bigint=[$7], col_char_2=[$2],
> > col_vchar_52=[$1], col_tmstmp=[$0], col_dt=[$3], col_booln=[$6],
> > col_dbl=[$8], col_tm=[$5])
> >
> > 00-05  Scan(groupscan=[ParquetGroupScan
> > [entries=[ReadEntryWithPath
> [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> > selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> > columns=[`col_int`, `col_bigint`, `col_char_2`, `col_vchar_52`,
> > `col_tmstmp`, `col_dt`, `col_booln`, `col_dbl`, `col_tm`]]])
> >
> >
> > 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq
> > group by col_char_2;
> >
> > *+-+*
> >
> > *| **EXPR$0 ** |*
> >
> > *+-+*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *| *AZ * |*
> >
> > *+-+*
> >
> > 18 rows selected (0.27 seconds)
> >
> >
> > Explain plan for the above query that uses group by
> >
> >
> > *| *00-00Screen
> >
> > 00-01  Project(EXPR$0=[$0])
> >
> > 00-02Project($0=[$2])
> >
> > 00-03  Window(window#0=[window(partition {} order by [] range
> > between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
> >
> > 00-04HashAgg(group=[{0}], agg#0=[MIN($1)])
> >
> > 00-05  Scan(groupscan=[ParquetGroupScan
> > [entries=[ReadEntryWithPath
> [path=maprfs:///tmp/tblForView/0_0_0.parquet]],
> > selectionRoot=/tmp/tblForView/0_0_0.parquet, numFiles=1,
> > columns=[`col_char_2`, `col_int`]]])
> >
> >
> > Thanks,
> >
> > Khurram
> >
>
>
>
> --
>
> Abdelhakim Deneche
>
> Software Engineer
>
>   
>
>
> Now Available - Free Hadoop On-Demand Training
> <
> http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_cam

[jira] [Created] (DRILL-3351) Invalid query must be caught earlier

2015-06-23 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3351:
-

 Summary: Invalid query must be caught earlier
 Key: DRILL-3351
 URL: https://issues.apache.org/jira/browse/DRILL-3351
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.1.0
 Environment: 8815eb7d
Reporter: Khurram Faraaz
Assignee: Jinfeng Ni


The query below is not valid and we must report an error instead of returning
results. Postgres does not support this kind of query.

Drill currently returns results; we must instead report an error to the user.
{code}
0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM vwOnParq group by 
col_char_2;
+-+
| EXPR$0  |
+-+
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
| AZ  |
+-+
18 rows selected (0.27 seconds)
{code}

Output from Postgres
{code}
postgres=# select min(col_int) over() from all_typs_tbl group by col_char_2;
ERROR:  column "all_typs_tbl.col_int" must appear in the GROUP BY clause or be 
used in an aggregate function
LINE 1: select min(col_int) over() from all_typs_tbl group by col_ch...
{code}

Querying the original parquet file that was used to create the view returns an
assertion error
{code}
0: jdbc:drill:schema=dfs.tmp> SELECT MIN(col_int) OVER() FROM 
`tblForView/0_0_0.parquet` group by col_char_2;
Error: SYSTEM ERROR: java.lang.AssertionError: Internal error: while converting 
MIN(`tblForView/0_0_0.parquet`.`col_int`)


[Error Id: e8ed279d-aa8c-4db1-9906-5dd7fdecaac2 on centos-02.qa.lab:31010] 
(state=,code=0)
{code}
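
For reference, a query of this shape is accepted under standard SQL only when
the window function references grouping columns or wraps an aggregate. A
minimal sketch of valid alternatives (assuming the same vwOnParq view; shown
for illustration only):
{code}
-- Window over the per-group aggregate: the inner MIN(col_int) is evaluated
-- per col_char_2 group, and the outer MIN(...) OVER() runs over those rows.
SELECT MIN(MIN(col_int)) OVER() FROM vwOnParq GROUP BY col_char_2;

-- Or reference a grouping column directly in the window function.
SELECT MIN(col_char_2) OVER() FROM vwOnParq GROUP BY col_char_2;
{code}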



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] Allowing the option to use github pull requests in place of reviewboard

2015-06-23 Thread Chris Westin
And I'll bet GitHub is a lot faster too.

On Tue, Jun 23, 2015 at 2:23 PM, Hanifi Gunes  wrote:

> +1
>
> At the very least GitHub will be UP.
>
> On Tue, Jun 23, 2015 at 2:18 PM, Parth Chandra 
> wrote:
>
> > +1 on trying this. RB has been pretty painful to us.
> >
> >
> >
> > On Mon, Jun 22, 2015 at 9:45 PM, Matthew Burgess 
> > wrote:
> >
> > > Is Travis   a viable option for the GitHub
> > route?
> > > I
> > > use it for my own projects to build pull requests (with additional code
> > > quality targets like CheckStyle, PMD, etc.). Perhaps that would take
> some
> > > of
> > > the burden off the reviewers and let them focus on the proposed
> > > implementations, rather than some of the more tedious aspects of each
> > > review.
> > >
> > > From:  Jacques Nadeau 
> > > Reply-To:  
> > > Date:  Monday, June 22, 2015 at 10:22 PM
> > > To:  "dev@drill.apache.org" 
> > > Subject:  Re: [DISCUSS] Allowing the option to use github pull requests
> > in
> > > place of reviewboard
> > >
> > > I'm up for this if we deprecate the old way.  Having two different
> > > processes seems like overkill.  In general, I find the review interface
> > of
> > > GitHub less expressive/clear but everything else is way better.
> > >
> > > On Mon, Jun 22, 2015 at 6:59 PM, Steven Phillips <
> sphill...@maprtech.com
> > >
> > > wrote:
> > >
> > > >  +1
> > > >
> > > >  I am in favor of giving this a try.
> > > >
> > > >  If I remember correctly, the reason we abandoned pull requests
> > > originally
> > > >  was because we couldn't close the pull requests through Github. A
> > > solution
> > > >  could be for whoever pushes the commit to the apache git repo to add
> > the
> > > >  Line "Closes ". Github would then automatically
> close
> > > the
> > > >  pull request.
> > > >
> > > >  On Mon, Jun 22, 2015 at 1:02 PM, Jason Altekruse <
> > > altekruseja...@gmail.com
> > > >>  >
> > > >  wrote:
> > > >
> > > >>  > Hello Drill developers,
> > > >>  >
> > > >>  > I am writing this message today to propose allowing the use of
> > github
> > > >  pull
> > > >>  > requests to perform reviews in place of the apache reviewboard
> > > instance.
> > > >>  >
> > > >>  > Reviewboard has caused a number of headaches in the past few
> > months,
> > > and
> > > >  I
> > > >>  > think it's time to evaluate the benefits of the apache
> > infrastructure
> > > >>  > relative to the actual cost of using it in practice.
> > > >>  >
> > > >>  > For clarity of the discussion, we cannot use the complete github
> > > >  workflow.
> > > >>  > Committers will still need to use patch files, or check out the
> > branch
> > > >  used
> > > >>  > in the review request and push to apache master manually. I am
> not
> > > >>  > advocating for using a merging strategy with git, just for using
> > the
> > > >  github
> > > >>  > web UI for reviews. I expect anyone generating a chain of commits
> > as
> > > >>  > described below to use the rebasing workflow we do today.
> > > Additionally
> > > >  devs
> > > >>  > should only be breaking up work to make it easier to review, we
> > will
> > > not
> > > >  be
> > > >>  > reviewing branches that contain a bunch of useless WIP commits.
> > > >>  >
> > > >>  > A few examples of problems I have experienced with reviewboard
> > > include:
> > > >>  > corruption of patches when they are downloaded, the web interface
> > > showing
> > > >>  > inconsistent content from the raw diff, and random rejection of
> > > patches
> > > >>  > that are based directly on the head of apache master.
> > > >>  >
> > > >>  > These are all serious blockers for getting code reviewed and
> > > integrated
> > > >>  > into the master branch in a timely manner.
> > > >>  >
> > > >>  > In addition to serious bugs in reviewboard, there are a number of
> > > >>  > difficulties with the combination of our typical dev workflow and
> > how
> > > >>  > reviewboard works with patches. As we are still adding features
> to
> > > Drill,
> > > >>  > we often have several weeks of work to submit in response to a
> JIRA
> > > or
> > > >>  > series of related JIRAs. Sometimes this work can be broken up
> into
> > > >>  > independent reviewable units, and other times it cannot. When a
> > > series of
> > > >>  > changes requires a mixture of refactoring and additions, the
> > process
> > > is
> > > >>  > currently quite painful. Either reviewers need to look through a
> > giant
> > > >  messy
> > > >>  > diff, or the submitters need to do a lot of extra work. This
> > > involves not
> > > >>  > only organizing their work into a reviewable series of commits,
> but
> > > also
> > > >>  > generating redundant squashed versions of the intermediate work
> to
> > > make
> > > >>  > reviewboard happy.
> > > >>  >
> > > >>  > For a relatively simple 3 part change, this involves creating 3
> > > >  reviewboard
> > > >>  > pages. The first will contain the first commit by itself. The
> > second
> > > will
> > > >>  > have the first commits patch as a parent

Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Chris Westin
It's never been clear to me why these are in the same repo at all. If
someone wants to go off and build a plug-in, they don't really need to be
part of the project. Being part of the project brings a certain expectation
by users of support from developers, and Aditya's explanation basically
negates that.

On Tue, Jun 23, 2015 at 4:38 PM, Jason Altekruse 
wrote:

> In light of Aditya's explanation I think this is a good way to clarify the
> relationship between Drill, Hive and Hbase. These are core plugins that we
> are planning to support completely, just as we do with the HDFS API in the
> FileSystemPlugin, which is currently a part of the exec module.
>
> +1
>
> On Tue, Jun 23, 2015 at 4:31 PM, Aditya  wrote:
>
> > I have always interpreted "contrib" as experimental code which is
> > provided "as is"[1], not fully validated by the main project and possibly
> > has not reached the same maturity as the rest of the project.
> >
> > It will help us communicate the maturity of a few plugins like hbase,
> > hive and mongo while accepting more user contributions, say cassandra
> > or couchbase, into Drill's code base.
> >
> > [1]
> >
> http://stackoverflow.com/questions/7328852/what-are-apache-contrib-modules
> >
> > On Tue, Jun 23, 2015 at 2:33 PM, Jacques Nadeau 
> > wrote:
> >
> > > Unless we move contrib out of the build (and to a different repo), I'm
> > not
> > > sure what the change really means.  Thoughts on this?
> > >
> > > On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra 
> > > wrote:
> > >
> > > > I'd be in favor of doing this in the 1.2 release cycle.
> > > >
> > > > On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:
> > > >
> > > > > A few of the storage plugins like HBase and Hive have matured enough
> to
> > > be
> > > > > moved out of contrib and into the mainline, probably under
> > > > "exec/storage".
> > > > >
> > > > > If people think that it is time to do that, I can take this up.
> > > > >
> > > > > aditya...
> > > > >
> > > >
> > >
> >
>


Re: Review Request 35739: Patch for DRILL-3333

2015-06-23 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35739/#review89113
---

Ship it!



exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java (line 41)


wip ?


- Aman Sinha


On June 24, 2015, 12:39 a.m., Steven Phillips wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35739/
> ---
> 
> (Updated June 24, 2015, 12:39 a.m.)
> 
> 
> Review request for drill.
> 
> 
> Bugs: DRILL-3333
> https://issues.apache.org/jira/browse/DRILL-3333
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-3333: Parquet writer auto-partitioning and partition pruning
> 
> 
> Diffs
> -
> 
>   exec/java-exec/src/main/codegen/templates/AbstractRecordWriter.java 
> 6b6065f6b6c8469aa548acf194e0621b9f4ffea8 
>   exec/java-exec/src/main/codegen/templates/EventBasedRecordWriter.java 
> 797f3cb8c83a89821ee46ce0b093f81406fa6067 
>   exec/java-exec/src/main/codegen/templates/NewValueFunctions.java 
> PRE-CREATION 
>   exec/java-exec/src/main/codegen/templates/RecordWriter.java 
> c6325fd0a5c7d7cb5f3628df1ecf9c01c264ed52 
>   exec/java-exec/src/main/codegen/templates/StringOutputRecordWriter.java 
> f704cca0e4d62ca1435df84d9eb1b07b32ea8b39 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractGroupScan.java
>  5c4ee4da9e0542244b0f71a520cea1c3a2d49a66 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/GroupScan.java
>  2d16cd01b94ed8a5463c0e2fb896f019133f7f03 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/WriterRecordBatch.java
>  d5d64a722ed6d9b5d97158046e6838f07c0d5381 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/ParquetPartitionDescriptor.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillRuleSets.java
>  d9b1354492454dcd2630c72f5dbc1c3badf958c7 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/partition/PruneScanRule.java
>  2544d34bd4e531e48df66c8efba3cf6b6a6e6142 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java
>  1e63748fe459760348501ad1fbe29553bee7264c 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/NewValueFunction.java
>  PRE-CREATION 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JsonRecordWriter.java
>  a43a4a0f21bf11f29b6385e36db4d25003ffa98f 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetGroupScan.java
>  cf39518b2a8b4564504a3971d1f89c268aee4b30 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordWriter.java
>  621f05c4d50ecf83071a5df414be88e7471f0490 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/store/text/DrillTextRecordWriter.java
>  31b1fbe9e03282161ee125cb7a4b2f53c8a8da63 
>   exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java 
> 6d03d8176d3686740fd8fbcc158e664b3a3eeed9 
> 
> Diff: https://reviews.apache.org/r/35739/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Steven Phillips
> 
>



Re: [DISCUSS] Allowing the option to use github pull requests in place of reviewboard

2015-06-23 Thread Ted Dunning
Travis won't likely be faster for the CI part, but it is super easy to set
up so it can't hurt to try it out.

On Tue, Jun 23, 2015 at 9:10 PM, Chris Westin 
wrote:

> And I'll bet GitHub is a lot faster too.
>
> On Tue, Jun 23, 2015 at 2:23 PM, Hanifi Gunes  wrote:
>
> > +1
> >
> > At the very least GitHub will be UP.
> >
> > On Tue, Jun 23, 2015 at 2:18 PM, Parth Chandra 
> > wrote:
> >
> > > +1 on trying this. RB has been pretty painful to us.
> > >
> > >
> > >
> > > On Mon, Jun 22, 2015 at 9:45 PM, Matthew Burgess 
> > > wrote:
> > >
> > > > Is Travis   a viable option for the GitHub
> > > route?
> > > > I
> > > > use it for my own projects to build pull requests (with additional
> code
> > > > quality targets like CheckStyle, PMD, etc.). Perhaps that would take
> > some
> > > > of
> > > > the burden off the reviewers and let them focus on the proposed
> > > > implementations, rather than some of the more tedious aspects of each
> > > > review.
> > > >
> > > > From:  Jacques Nadeau 
> > > > Reply-To:  
> > > > Date:  Monday, June 22, 2015 at 10:22 PM
> > > > To:  "dev@drill.apache.org" 
> > > > Subject:  Re: [DISCUSS] Allowing the option to use github pull
> requests
> > > in
> > > > place of reviewboard
> > > >
> > > > I'm up for this if we deprecate the old way.  Having two different
> > > > processes seems like overkill.  In general, I find the review
> interface
> > > of
> > > > GitHub less expressive/clear but everything else is way better.
> > > >
> > > > On Mon, Jun 22, 2015 at 6:59 PM, Steven Phillips <
> > sphill...@maprtech.com
> > > >
> > > > wrote:
> > > >
> > > > >  +1
> > > > >
> > > > >  I am in favor of giving this a try.
> > > > >
> > > > >  If I remember correctly, the reason we abandoned pull requests
> > > > originally
> > > > >  was because we couldn't close the pull requests through Github. A
> > > > solution
> > > > >  could be for whoever pushes the commit to the apache git repo to
> add
> > > the
> > > > >  Line "Closes ". Github would then automatically
> > close
> > > > the
> > > > >  pull request.
> > > > >
> > > > >  On Mon, Jun 22, 2015 at 1:02 PM, Jason Altekruse <
> > > > altekruseja...@gmail.com
> > > > >>  >
> > > > >  wrote:
> > > > >
> > > > >>  > Hello Drill developers,
> > > > >>  >
> > > > >>  > I am writing this message today to propose allowing the use of
> > > github
> > > > >  pull
> > > > >>  > requests to perform reviews in place of the apache reviewboard
> > > > instance.
> > > > >>  >
> > > > >>  > Reviewboard has caused a number of headaches in the past few
> > > months,
> > > > and
> > > > >  I
> > > > >>  > think it's time to evaluate the benefits of the apache
> > > infrastructure
> > > > >>  > relative to the actual cost of using it in practice.
> > > > >>  >
> > > > >>  > For clarity of the discussion, we cannot use the complete
> github
> > > > >  workflow.
> > > > >>  > Committers will still need to use patch files, or check out the
> > > branch
> > > > >  used
> > > > >>  > in the review request and push to apache master manually. I am
> > not
> > > > >>  > advocating for using a merging strategy with git, just for
> using
> > > the
> > > > >  github
> > > > >>  > web UI for reviews. I expect anyone generating a chain of
> commits
> > > as
> > > > >>  > described below to use the rebasing workflow we do today.
> > > > Additionally
> > > > >  devs
> > > > >>  > should only be breaking up work to make it easier to review, we
> > > will
> > > > not
> > > > >  be
> > > > >>  > reviewing branches that contain a bunch of useless WIP commits.
> > > > >>  >
> > > > >>  > A few examples of problems I have experienced with reviewboard
> > > > include:
> > > > >>  > corruption of patches when they are downloaded, the web
> interface
> > > > showing
> > > > >>  > inconsistent content from the raw diff, and random rejection of
> > > > patches
> > > > >>  > that are based directly on the head of apache master.
> > > > >>  >
> > > > >>  > These are all serious blockers for getting code reviewed and
> > > > integrated
> > > > >>  > into the master branch in a timely manner.
> > > > >>  >
> > > > >>  > In addition to serious bugs in reviewboard, there are a number
> of
> > > > >>  > difficulties with the combination of our typical dev workflow
> and
> > > how
> > > > >>  > reviewboard works with patches. As we are still adding features
> > to
> > > > Drill,
> > > > >>  > we often have several weeks of work to submit in response to a
> > JIRA
> > > > or
> > > > >>  > series of related JIRAs. Sometimes this work can be broken up
> > into
> > > > >>  > independent reviewable units, and other times it cannot. When a
> > > > series of
> > > > >>  > changes requires a mixture of refactoring and additions, the
> > > process
> > > > is
> > > > >>  > currently quite painful. Either reviewers need to look through a
> > > giant
> > > > >  messy
> > > > >>  > diff, or the submitters need to do a lot of extra work. This
> > > > involves not
> > > > >>  > o

Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Ted Dunning
As a transitional stage, it does make sense.

Having separated builds at least for a few examples is really helpful for
the people interested in extending Drill without necessarily contributing
their code back in (at least not right away).

That was my motive behind
https://github.com/mapr-demos/simple-drill-functions



On Tue, Jun 23, 2015 at 9:17 PM, Chris Westin 
wrote:

> It's never been clear to me why these are in the same repo at all. If
> someone wants to go off and build a plug-in, they don't really need to be
> part of the project. Being part of the project brings a certain expectation
> by users of support from developers, and Aditya's explanation basically
> negates that.
>
> On Tue, Jun 23, 2015 at 4:38 PM, Jason Altekruse  >
> wrote:
>
> > In light of Aditya's explanation I think this is a good way to clarify
> the
> > relationship between Drill, Hive and Hbase. These are core plugins that
> we
> > are planning to support completely, just as we do with the HDFS API in
> the
> > FileSystemPlugin, which is currently a part of the exec module.
> >
> > +1
> >
> > On Tue, Jun 23, 2015 at 4:31 PM, Aditya  wrote:
> >
> > > I have always interpreted "contrib" as experimental code which is
> > > provided "as is"[1], not fully validated by the main project and
> > > possibly has not reached the same maturity as the rest of the project.
> > >
> > > It will help us communicate the maturity of a few plugins like hbase,
> > > hive and mongo while accepting more user contributions, say cassandra
> > > or couchbase, into Drill's code base.
> > >
> > > [1]
> > >
> >
> http://stackoverflow.com/questions/7328852/what-are-apache-contrib-modules
> > >
> > > On Tue, Jun 23, 2015 at 2:33 PM, Jacques Nadeau 
> > > wrote:
> > >
> > > > Unless we move contrib out of the build (and to a different repo),
> I'm
> > > not
> > > > sure what the change really means.  Thoughts on this?
> > > >
> > > > On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra <
> pchan...@maprtech.com>
> > > > wrote:
> > > >
> > > > > I'd be in favor of doing this in the 1.2 release cycle.
> > > > >
> > > > > On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:
> > > > >
> > > > > > A few of the storage plugins like HBase and Hive have matured
> enough
> > to
> > > > be
> > > > > > moved out of contrib and into the mainline, probably under
> > > > > "exec/storage".
> > > > > >
> > > > > > If people think that it is time to do that, I can take this up.
> > > > > >
> > > > > > aditya...
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Jacques Nadeau
Ted,

That's a great repo.  Do you have a blog entry that can drive to it for
others?

On Tue, Jun 23, 2015 at 9:34 PM, Ted Dunning  wrote:

> As a transitional stage, it does make sense.
>
> Having separated builds at least for a few examples is really helpful for
> the people interested in extending Drill without necessarily contributing
> their code back in (at least not right away).
>
> That was my motive behind
> https://github.com/mapr-demos/simple-drill-functions
>
>
>
> On Tue, Jun 23, 2015 at 9:17 PM, Chris Westin 
> wrote:
>
> > It's never been clear to me why these are in the same repo at all. If
> > someone wants to go off and build a plug-in, they don't really need to be
> > part of the project. Being part of the project brings a certain
> expectation
> > by users of support from developers, and Aditya's explanation basically
> > negates that.
> >
> > On Tue, Jun 23, 2015 at 4:38 PM, Jason Altekruse <
> altekruseja...@gmail.com
> > >
> > wrote:
> >
> > > In light of Aditya's explanation I think this is a good way to clarify
> > the
> > > relationship between Drill, Hive and Hbase. These are core plugins that
> > we
> > > are planning to support completely, just as we do with the HDFS API in
> > the
> > > FileSystemPlugin, which is currently a part of the exec module.
> > >
> > > +1
> > >
> > > On Tue, Jun 23, 2015 at 4:31 PM, Aditya 
> wrote:
> > >
> > > > I have always interpreted "contrib" as experimental code which is
> > > > provided "as is"[1], not fully validated by the main project and
> > > > possibly has not reached the same maturity as the rest of the project.
> > > >
> > > > It will help us communicate the maturity of a few plugins like hbase,
> > > > hive and mongo while accepting more user contributions, say cassandra
> > > > or couchbase, into Drill's code base.
> > > >
> > > > [1]
> > > >
> > >
> >
> http://stackoverflow.com/questions/7328852/what-are-apache-contrib-modules
> > > >
> > > > On Tue, Jun 23, 2015 at 2:33 PM, Jacques Nadeau 
> > > > wrote:
> > > >
> > > > > Unless we move contrib out of the build (and to a different repo),
> > I'm
> > > > not
> > > > > sure what the change really means.  Thoughts on this?
> > > > >
> > > > > On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra <
> > pchan...@maprtech.com>
> > > > > wrote:
> > > > >
> > > > > > I'd be in favor of doing this in the 1.2 release cycle.
> > > > > >
> > > > > > On Thu, Jun 18, 2015 at 6:30 PM, Aditya  wrote:
> > > > > >
> > > > > > > A few of the storage plugins like HBase and Hive have matured
> > enough
> > > to
> > > > > be
> > > > > > > moved out of contrib and into the mainline, probably under
> > > > > > "exec/storage".
> > > > > > >
> > > > > > > If people think that it is time to do that, I can take this up.
> > > > > > >
> > > > > > > aditya...
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Review Request 35673: DRILL-3326: Query with unsupported windows function containing "AS" blocks correct error message

2015-06-23 Thread Aman Sinha

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35673/#review89130
---

Ship it!


Ship It!

- Aman Sinha


On June 24, 2015, 12:15 a.m., Sean Hsuan-Yi Chu wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35673/
> ---
> 
> (Updated June 24, 2015, 12:15 a.m.)
> 
> 
> Review request for drill and Aman Sinha.
> 
> 
> Bugs: DRILL-3326
> https://issues.apache.org/jira/browse/DRILL-3326
> 
> 
> Repository: drill-git
> 
> 
> Description
> ---
> 
> DRILL-3326: When inspecting SELECT-LIST, UnsupportedOperatorsVisitor will dig 
> into AS clause
> 
> 
> Diffs
> -
> 
>   
> exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
>  544a838 
>   exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
> 1c8b0db 
> 
> Diff: https://reviews.apache.org/r/35673/diff/
> 
> 
> Testing
> ---
> 
> Unit, others are on the way
> 
> 
> Thanks,
> 
> Sean Hsuan-Yi Chu
> 
>



Re: Moving matured storage plugins out of contrib

2015-06-23 Thread Ted Dunning
Not yet.



On Wed, Jun 24, 2015 at 12:38 AM, Jacques Nadeau  wrote:

> Ted,
>
> That's a great repo.  Do you have a blog entry that can drive to it for
> others?
>
> On Tue, Jun 23, 2015 at 9:34 PM, Ted Dunning 
> wrote:
>
> > As a transitional stage, it does make sense.
> >
> > Having separated builds at least for a few examples is really helpful for
> > the people interested in extending Drill without necessarily contributing
> > their code back in (at least not right away).
> >
> > That was my motive behind
> > https://github.com/mapr-demos/simple-drill-functions
> >
> >
> >
> > On Tue, Jun 23, 2015 at 9:17 PM, Chris Westin 
> > wrote:
> >
> > > It's never been clear to me why these are in the same repo at all. If
> > > someone wants to go off and build a plug-in, they don't really need to
> be
> > > part of the project. Being part of the project brings a certain
> > expectation
> > > by users of support from developers, and Aditya's explanation basically
> > > negates that.
> > >
> > > On Tue, Jun 23, 2015 at 4:38 PM, Jason Altekruse <
> > altekruseja...@gmail.com
> > > >
> > > wrote:
> > >
> > > > In light of Aditya's explanation I think this is a good way to
> clarify
> > > the
> > > > relationship between Drill, Hive and Hbase. These are core plugins
> that
> > > we
> > > > are planning to support completely, just as we do with the HDFS API
> in
> > > the
> > > > FileSystemPlugin, which is currently a part of the exec module.
> > > >
> > > > +1
> > > >
> > > > On Tue, Jun 23, 2015 at 4:31 PM, Aditya 
> > wrote:
> > > >
> > > > > I have always interpreted "contrib" as experimental code which is
> > > > > provided "as is"[1], not fully validated by the main project and
> > > > > possibly has not reached the same maturity as the rest of the project.
> > > > >
> > > > > It will help us communicate the maturity of a few plugins like hbase,
> > > > > hive and mongo while accepting more user contributions, say cassandra
> > > > > or couchbase, into Drill's code base.
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> >
> http://stackoverflow.com/questions/7328852/what-are-apache-contrib-modules
> > > > >
> > > > > On Tue, Jun 23, 2015 at 2:33 PM, Jacques Nadeau <
> jacq...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Unless we move contrib out of the build (and to a different
> repo),
> > > I'm
> > > > > not
> > > > > > sure what the change really means.  Thoughts on this?
> > > > > >
> > > > > > On Tue, Jun 23, 2015 at 2:22 PM, Parth Chandra <
> > > pchan...@maprtech.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I'd be in favor of doing this in the 1.2 release cycle.
> > > > > > >
> > > > > > > On Thu, Jun 18, 2015 at 6:30 PM, Aditya 
> wrote:
> > > > > > >
> > > > > > > > A few of the storage plugins like HBase and Hive have matured
> > > enough
> > > > to
> > > > > > be
> > > > > > > > moved out of contrib and into the mainline, probably under
> > > > > > > "exec/storage".
> > > > > > > >
> > > > > > > > If people think that it is time to do that, I can take this
> up.
> > > > > > > >
> > > > > > > > aditya...
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (DRILL-3352) Extra re-distribution when evaluating window function after GROUP BY

2015-06-23 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-3352:
-

 Summary: Extra re-distribution when evaluating window function 
after GROUP BY
 Key: DRILL-3352
 URL: https://issues.apache.org/jira/browse/DRILL-3352
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Aman Sinha
Assignee: Aman Sinha


Consider the following query and plan: 
{code}
explain plan for select min(l_partkey) over (partition by l_suppkey) from 
lineitem group by l_partkey, l_suppkey limit 1;

00-00Screen
00-01  Project(EXPR$0=[$0])
00-02SelectionVectorRemover
00-03  Limit(fetch=[1])
00-04UnionExchange
01-01  Project($0=[$3])
01-02Window(window#0=[window(partition {1} order by [] range 
between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [MIN($0)])])
01-03  SelectionVectorRemover
01-04Sort(sort0=[$1], dir0=[ASC])
01-05  Project(l_partkey=[$0], l_suppkey=[$1], $f2=[$2])
01-06HashToRandomExchange(dist0=[[$1]])
02-01  UnorderedMuxExchange
03-01Project(l_partkey=[$0], l_suppkey=[$1], 
$f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1))])
03-02  HashAgg(group=[{0, 1}], agg#0=[MIN($2)])
03-03Project(l_partkey=[$0], l_suppkey=[$1], 
$f2=[$2])
03-04  HashToRandomExchange(dist0=[[$0]], 
dist1=[[$1]])
04-01UnorderedMuxExchange
05-01  Project(l_partkey=[$0], 
l_suppkey=[$1], $f2=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($1, 
hash64AsDouble($0)))])
05-02HashAgg(group=[{0, 1}], 
agg#0=[MIN($0)])
05-03  Project(l_partkey=[$1], 
l_suppkey=[$0])
05-04
Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=file:/Users/asinha/data/tpch-sf1/lineitem]], 
selectionRoot=/Users/asinha/data/tpch-sf1/lineitem, numFiles=1, 
columns=[`l_partkey`, `l_suppkey`]]])
{code}

Here, we do a distribution for the HashAgg on 2 columns: {l_partkey,
l_suppkey}. Subsequently, we re-distribute on {l_suppkey} only, since the
window function has a partition-by on l_suppkey. The second re-distribution
could be avoided if the first distribution for the HashAgg were done on
l_suppkey only. The reason we distribute on all grouping columns is to avoid
skew problems. However, in many cases, especially when a window function is
involved, it may make sense to distribute on only one column.
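
As a rough sketch of why a single exchange could suffice (this is the idea
only, not actual Drill plan output): hashing on l_suppkey alone never splits a
(l_partkey, l_suppkey) group across nodes, so one distribution on l_suppkey
would satisfy both the HashAgg and the window's partition-by.
{code}
Window(partition by l_suppkey, MIN(...))
  Sort(l_suppkey ASC)
    HashAgg(group=[{l_partkey, l_suppkey}], MIN(...))
      HashToRandomExchange(dist0=[[l_suppkey]])          <-- single re-distribution
        HashAgg(group=[{l_partkey, l_suppkey}], MIN(...))    <-- local phase
          Scan(lineitem, columns=[l_partkey, l_suppkey])
{code}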




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3320) Do away with "rebuffing" Drill jar

2015-06-23 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore resolved DRILL-3320.
---
  Resolution: Fixed
   Fix Version/s: 1.1.0
Target Version/s: 1.1.0

Resolved by 
[3aa82bc|https://fisheye6.atlassian.com/changelog/incubator-drill?cs=3aa82bc923cdcb15fdb918baa8b99a5bb85ef6cb].

> Do away with "rebuffing" Drill jar
> --
>
> Key: DRILL-3320
> URL: https://issues.apache.org/jira/browse/DRILL-3320
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Aditya Kishore
>Assignee: Aditya Kishore
> Fix For: 1.1.0
>
> Attachments: DRILL-3320-Do-away-with-rebuffing-Drill-jar.patch
>
>
> The maven build process for some modules (common, protocol, java-exec)
> generates a "-rebuffed" jar during the build; these rebuffed jars are the
> actual jars to be used.
> We should hint maven to install and deploy these jars as the primary artifact
> (jars without any classifiers) of each module.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)