[jira] [Resolved] (DRILL-3364) Prune scan range if the filter is on the leading field with byte comparable encoding

2015-08-14 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore resolved DRILL-3364.
---
Resolution: Fixed

Resolved by 
[645e43f|https://fisheye6.atlassian.com/changelog/incubator-drill?cs=645e43fd34ce3b70f14df4e3d21c9c04ca9314f1].

> Prune scan range if the filter is on the leading field with byte comparable 
> encoding
> 
>
> Key: DRILL-3364
> URL: https://issues.apache.org/jira/browse/DRILL-3364
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Storage - HBase
>Reporter: Aditya Kishore
>Assignee: Smidth Panchamia
> Fix For: 1.2.0
>
> Attachments: 
> 0001-Add-convert_from-and-convert_to-methods-for-TIMESTAM.patch, 
> 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
> 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
> 0001-DRILL-3364-Prune-scan-range-if-the-filter-is-on-the-.patch, 
> 0001-PATCH-DRILL-3364-Prune-scan-range-if-the-filter-is-o.patch, 
> composite.jun26.diff
>
>






[jira] [Created] (DRILL-3655) TIME - (minus) TIME fails with a parse error

2015-08-14 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3655:
-

 Summary: TIME - (minus) TIME fails with a parse error
 Key: DRILL-3655
 URL: https://issues.apache.org/jira/browse/DRILL-3655
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


0: jdbc:drill:> VALUES CURRENT_TIME - CURRENT_TIME;
Error: PARSE ERROR: From line 1, column 8 to line 1, column 34: Cannot apply 
'-' to arguments of type ' - '. Supported form(s): ' 
- '
' - '
' - '


[Error Id: ede6c073-ca82-4359-8adb-db413e051e29 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:> 
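
For reference, the error lists the operand combinations that '-' does accept; assuming datetime minus interval is among them, the difference could presumably be expressed through interval arithmetic instead. An untested sketch:

{code}
-- Untested sketch, assuming datetime - interval subtraction is supported;
-- the INTERVAL literal here is purely illustrative.
VALUES CURRENT_TIME - INTERVAL '1' HOUR;
{code}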






[jira] [Created] (DRILL-3654) FIRST_VALUE(/) returns IOB Exception

2015-08-14 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3654:
-

 Summary: FIRST_VALUE(/) returns IOB 
Exception
 Key: DRILL-3654
 URL: https://issues.apache.org/jira/browse/DRILL-3654
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.2.0
 Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
Reporter: Khurram Faraaz
Assignee: Chris Westin
Priority: Critical


FIRST_VALUE(/) returns an IOB Exception on the developer's private branch.

{code}
0: jdbc:drill:schema=dfs.tmp> select first_value(col8) over(partition by col7 
order by col8) first_value_col8 from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))

Fragment 0:0

[Error Id: a4799155-ba8a-4117-9381-45ec10e02aa6 on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
0: jdbc:drill:schema=dfs.tmp> select first_value(col9) over(partition by col7 
order by col9) first_value_col8 from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
IndexOutOfBoundsException: index: 0, length: 4 (expected: range(0, 0))

Fragment 0:0

[Error Id: 30bcaf2b-98ce-4ec5-a9aa-4fa8fa1f3403 on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
0: jdbc:drill:schema=dfs.tmp> 
{code}





[jira] [Created] (DRILL-3653) Assert in a query with both avg aggregate and avg window aggregate functions

2015-08-14 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3653:
---

 Summary: Assert in a query with both avg aggregate and avg window 
aggregate functions
 Key: DRILL-3653
 URL: https://issues.apache.org/jira/browse/DRILL-3653
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization
Affects Versions: 1.2.0
Reporter: Victoria Markman
Assignee: Jinfeng Ni


It seems to be a problem with just this combination, and I can't believe I did 
not find it earlier ...

{code}
0: jdbc:drill:schema=dfs> select avg(a1), avg(a1) over () from t1 group by a1;
Error: SYSTEM ERROR: AssertionError: Internal error: invariant violated: 
conversion result not null
[Error Id: 2f850005-a7f6-4215-bbc1-90da57cbb71f on atsqa4-133.qa.lab:31010] 
(state=,code=0)
{code}

Works:
{code}
0: jdbc:drill:schema=dfs> select avg(a1), sum(a1) over () from t1 group by a1;
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| 1.0     | 47      |
| 2.0     | 47      |
| 3.0     | 47      |
| 4.0     | 47      |
| 5.0     | 47      |
| 6.0     | 47      |
| 7.0     | 47      |
| 9.0     | 47      |
| 10.0    | 47      |
| null    | 47      |
+---------+---------+
10 rows selected (0.54 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs> select avg(a1), count(a1) over () from t1 group by a1;
+---------+---------+
| EXPR$0  | EXPR$1  |
+---------+---------+
| 1.0     | 9       |
| 2.0     | 9       |
| 3.0     | 9       |
| 4.0     | 9       |
| 5.0     | 9       |
| 6.0     | 9       |
| 7.0     | 9       |
| 9.0     | 9       |
| 10.0    | 9       |
| null    | 9       |
+---------+---------+
10 rows selected (0.304 seconds)
{code}

{code}
0: jdbc:drill:schema=dfs> select avg(a1), count(a1) over (), sum(a1) 
over(partition by b1) from t1 group by a1, b1;
+---------+---------+---------+
| EXPR$0  | EXPR$1  | EXPR$2  |
+---------+---------+---------+
| 1.0     | 9       | 1       |
| 2.0     | 9       | 2       |
| 3.0     | 9       | 3       |
| 5.0     | 9       | 5       |
| 6.0     | 9       | 6       |
| 7.0     | 9       | 7       |
| null    | 9       | null    |
| 9.0     | 9       | 9       |
| 10.0    | 9       | 10      |
| 4.0     | 9       | 4       |
+---------+---------+---------+
10 rows selected (0.788 seconds)
{code}
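
A possible workaround sketch, untested, based on the working queries above: express the window AVG through SUM and COUNT window aggregates (the CAST is only there to avoid integer division):

{code}
-- Untested sketch: compute the window average from SUM/COUNT window
-- aggregates, which the examples above show working with a grouped AVG.
select avg(a1),
       sum(cast(a1 as double)) over () / count(a1) over () as avg_over_all
from t1
group by a1;
{code}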






[jira] [Created] (DRILL-3652) Need to document order of operations with window functions and flatten

2015-08-14 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3652:
---

 Summary: Need to document order of operations with window 
functions and flatten
 Key: DRILL-3652
 URL: https://issues.apache.org/jira/browse/DRILL-3652
 Project: Apache Drill
  Issue Type: Improvement
  Components: Documentation
Affects Versions: 1.2.0
Reporter: Victoria Markman
Assignee: Bridget Bevens


In standard SQL, window functions are the last set of operations performed in a 
query, except for the final order by clause.
Using a window function with flatten is a bit confusing, because flatten appears 
as an operator in the query plan, and I expected flatten to run first, followed 
by the window function.

This is not what is happening:
{code}
0: jdbc:drill:schema=dfs> select * from `complex.json`;
+----+-----------+----------+
| x  | y         | z        |
+----+-----------+----------+
| 5  | a string  | [1,2,3]  |
+----+-----------+----------+
1 row selected (0.128 seconds)

0: jdbc:drill:schema=dfs> select sum(x) over(), x , y, flatten(z) from 
`complex.json`;
+---------+----+-----------+---------+
| EXPR$0  | x  | y         | EXPR$3  |
+---------+----+-----------+---------+
| 5       | 5  | a string  | 1       |
| 5       | 5  | a string  | 2       |
| 5       | 5  | a string  | 3       |
+---------+----+-----------+---------+
3 rows selected (0.152 seconds)

0: jdbc:drill:schema=dfs> explain plan for select sum(x) over(), x , y, 
flatten(z) from `complex.json`;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      ProjectAllowDup(EXPR$0=[$0], x=[$1], y=[$2], EXPR$3=[$3])
00-02        Project(w0$o0=[$3], x=[$0], y=[$1], EXPR$3=[$4])
00-03          Flatten(flattenField=[$4])
00-04            Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3], EXPR$5=[$2])
00-05              Project(x=[$1], y=[$2], z=[$3], w0$o0=[$4])
00-06                Window(window#0=[window(partition {} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1)])])
00-07                  Project(T38¦¦*=[$0], x=[$1], y=[$2], z=[$3])
00-08                    Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/drill/testdata/subqueries/complex.json, numFiles=1, columns=[`*`], files=[maprfs:///drill/testdata/subqueries/complex.json]]]
{code}

We should suggest that users put flatten in a subquery if they want to run a 
window function on top of the result set returned by flatten.

{code}
0: jdbc:drill:schema=dfs> select x, y, a, sum(x) over() from  ( select x , y, 
flatten(z) as a from `complex.json`);
+----+-----------+----+---------+
| x  | y         | a  | EXPR$3  |
+----+-----------+----+---------+
| 5  | a string  | 1  | 15      |
| 5  | a string  | 2  | 15      |
| 5  | a string  | 3  | 15      |
+----+-----------+----+---------+
3 rows selected (0.145 seconds)
{code}

I suggest we document this issue in the window function section, perhaps in 
"Usage notes".





[GitHub] drill pull request: Drill 3180: Implement JDBC Storage Plugin

2015-08-14 Thread jacques-n
GitHub user jacques-n opened a pull request:

https://github.com/apache/drill/pull/115

Drill 3180: Implement JDBC Storage Plugin



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jacques-n/drill DRILL-3180

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/115.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #115


commit 075e7950a4a59d17fee11e02b1c38361070b2656
Author: MPierre 
Date:   2015-08-02T01:07:18Z

DRILL-3180: Initial JDBC plugin implementation.

commit 0c907331f5275db17d77abcc154d36e5fb6278de
Author: Jacques Nadeau 
Date:   2015-08-02T01:11:51Z

DRILL-3180: JDBC Storage Plugin updates.

- Move to leverage Calcite's JDBC adapter capabilities for pushdowns, 
schema, etc.
- Start to simplify the RecordReader classes.
- Planning flow with JDBC rules works, pushdowns not working (cost or 
function?)

commit 330f4ec9243e942aec88b9e02fa6cfb329aaad5a
Author: Jacques Nadeau 
Date:   2015-08-14T21:47:25Z

Updates to add password and get filter and join pushdown working.
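
A hypothetical usage sketch of what the plugin enables, with made-up plugin and table names (`pg`, `public.orders`), not taken from this pull request:

    -- Hypothetical example: query an external RDBMS table through a JDBC
    -- storage plugin registered as `pg`; the WHERE clause is a candidate
    -- for the filter pushdown mentioned in the commits above.
    SELECT o.order_id, o.amount
    FROM pg.public.orders o
    WHERE o.amount > 100;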






Re: anyone seen these errors on master ?

2015-08-14 Thread Abdel Hakim Deneche
You can safely ignore the TestImpersonationMetadata error; it's most likely
caused by my change.

On Fri, Aug 14, 2015 at 12:36 PM, Abdel Hakim Deneche  wrote:

> In addition to test*VectorReallocation errors I am seeing frequently on my
> linux VM, today I saw the following error multiple times on my VM and once
> on my Macbook:
>
> Tests in error:
>>   TestImpersonationMetadata>BaseTestQuery.closeClient:239 » IllegalState
>> Failure...
>> java.lang.IllegalStateException: Failure while closing accountor.
>> Expected private and shared pools to be set to initial values.  However,
>> one or more were not.  Stats are
>> zone init allocated delta
>> private 0 0 0
>> shared 3221225472 3214166779 7058693.
>> at
>> org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:200)
>> ~[classes/:na]
>> at org.apache.drill.exec.memory.Accountor.close(Accountor.java:390)
>> ~[classes/:na]
>> at
>> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:185)
>> ~[classes/:na]
>> at
>> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
>> ~[classes/:na]
>> at com.google.common.io.Closeables.close(Closeables.java:77)
>> ~[guava-14.0.1.jar:na]
>> at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
>> ~[guava-14.0.1.jar:na]
>> at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:294)
>> ~[classes/:na]
>> at
>> org.apache.drill.exec.server.Drillbit$ShutdownThread.run(Drillbit.java:332)
>> ~[classes/:na]
>
>
> On Wed, Aug 5, 2015 at 2:52 PM, Chris Westin 
> wrote:
>
>> Given that the difference is just
>>
>> > java.lang.Exception: Unexpected exception,
>> > expected
>> but
>> > was
>>
>> The question of "what constitutes an oversized allocation?" comes to mind.
>> Is this test fragile relative to being run in different environments?
>> I haven't seen the test so how is the determination that something is
>> oversized made? It seems like that criterion sometimes fails, and we get
>> an
>> OOM because whatever the request is is still very large.
>>
>>
>> On Wed, Aug 5, 2015 at 2:26 PM, Hanifi Gunes  wrote:
>>
>> > I don't seem to be able to re-prod this. Let me look at this and update
>> you
>> > all.
>> >
>> > On Thu, Aug 6, 2015 at 12:03 AM, Abdel Hakim Deneche <
>> > adene...@maprtech.com>
>> > wrote:
>> >
>> > > I didn't make any change, I'm running 2 forks (the default). I got
>> those
>> > > errors 3 times now, 2 on a linux VM and 1 on a linux physical node
>> > >
>> > > On Wed, Aug 5, 2015 at 1:03 PM, Hanifi Gunes 
>> > wrote:
>> > >
>> > > > Did you tighten your memory settings? How many forks are you running
>> > > with?
>> > > > I bet you are truly running out of memory while executing this
>> > particular
>> > > > test case.
>> > > >
>> > > > -H+
>> > > >
>> > > > On Wed, Aug 5, 2015 at 8:56 PM, Sudheesh Katkam <
>> skat...@maprtech.com>
>> > > > wrote:
>> > > >
>> > > > > b2bbd99 committed on July 6th introduced the test.
>> > > > >
>> > > > > > On Aug 5, 2015, at 10:21 AM, Jinfeng Ni 
>> > > wrote:
>> > > > > >
>> > > > > > In that case,  we probably need do binary search to figure out
>> > which
>> > > > > recent
>> > > > > > patch is causing this problem.
>> > > > > >
>> > > > > > On Wed, Aug 5, 2015 at 10:03 AM, Abdel Hakim Deneche <
>> > > > > adene...@maprtech.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > >> Just got those errors on master too
>> > > > > >>
>> > > > > >> On Wed, Aug 5, 2015 at 9:07 AM, Abdel Hakim Deneche <
>> > > > > adene...@maprtech.com
>> > > > > >>>
>> > > > > >> wrote:
>> > > > > >>
>> > > > > >>> I'm seeing those errors intermittently when building my
>> private
>> > > > > branch, I
>> > > > > >>> don't believe I made any change that would have caused them.
>> > Anyone
>> > > > > seen
>> > > > > >>> them too ?
>> > > > > >>>
>> > > > > >>>
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> testBitVectorReallocation(org.apache.drill.exec.record.vector.TestValueVector)
>> > > > >  Time elapsed: 2.043 sec  <<< ERROR!
>> > > > >  java.lang.Exception: Unexpected exception,
>> > > > > 
>> > > >
>> expected
>> > > > > >> but
>> > > > >  was
>> > > > >  at java.nio.Bits.reserveMemory(Bits.java:658)
>> > > > >  at
>> java.nio.DirectByteBuffer.(DirectByteBuffer.java:123)
>> > > > >  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>> > > > >  at
>> > > > > 
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
>> > > > >  at
>> > > > > 
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> io.netty.buffer.UnpooledUnsafeDirectByteBuf.(UnpooledUnsafeDirectByteBuf.java:69)
>> > > > >  at
>> > > > > 
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:50)
>> > > > >  at
>> > > > > 
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> io.netty.buffer.AbstractByteBufAllocator.directBuf

[GitHub] drill pull request: DRILL-2743: Parquet file metadata caching

2015-08-14 Thread StevenMPhillips
GitHub user StevenMPhillips opened a pull request:

https://github.com/apache/drill/pull/114

DRILL-2743: Parquet file metadata caching



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/incubator-drill meta2

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/114.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #114


commit 371153d7dfb4cd0ad5b0042f6ca2df255e81c52f
Author: Steven Phillips 
Date:   2015-03-13T23:12:34Z

DRILL-2743: Parquet file metadata caching
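
A hypothetical usage sketch of the feature (the command name and path are assumptions, not taken from this pull request):

    -- Assumed usage sketch: build/refresh the Parquet metadata cache for a
    -- directory, then query it as usual.
    REFRESH TABLE METADATA dfs.`/data/parquet/orders`;
    SELECT COUNT(*) FROM dfs.`/data/parquet/orders`;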






[GitHub] drill pull request: DRILL-3641: Doc. RecordBatch.IterOutcome (enum...

2015-08-14 Thread dsbos
GitHub user dsbos opened a pull request:

https://github.com/apache/drill/pull/113

DRILL-3641: Doc. RecordBatch.IterOutcome (enumerators and possible 
sequences).

Documented RecordBatch.IterOutcome (RecordBatch.next() protocol) much more.

Also moved AbstractRecordBatch.BatchState's documentation text from
non-documentation comments to documentation comments.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dsbos/incubator-drill bugs/drill-3641

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/113.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #113


commit 5ca94981442ea0a899c690ff1471369618a5471d
Author: dbarclay 
Date:   2015-07-31T01:54:59Z

DRILL-3641: Doc. RecordBatch.IterOutcome (enumerators and possible 
sequences).

Documented RecordBatch.IterOutcome (RecordBatch.next() protocol) much more.

Also moved AbstractRecordBatch.BatchState's documentation text from
non-documentation comments to documentation comments.






Re: anyone seen these errors on master ?

2015-08-14 Thread Abdel Hakim Deneche
In addition to test*VectorReallocation errors I am seeing frequently on my
linux VM, today I saw the following error multiple times on my VM and once
on my Macbook:

Tests in error:
>   TestImpersonationMetadata>BaseTestQuery.closeClient:239 » IllegalState
> Failure...
> java.lang.IllegalStateException: Failure while closing accountor.
> Expected private and shared pools to be set to initial values.  However,
> one or more were not.  Stats are
> zone init allocated delta
> private 0 0 0
> shared 3221225472 3214166779 7058693.
> at
> org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:200)
> ~[classes/:na]
> at org.apache.drill.exec.memory.Accountor.close(Accountor.java:390)
> ~[classes/:na]
> at
> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:185)
> ~[classes/:na]
> at
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
> ~[classes/:na]
> at com.google.common.io.Closeables.close(Closeables.java:77)
> ~[guava-14.0.1.jar:na]
> at com.google.common.io.Closeables.closeQuietly(Closeables.java:108)
> ~[guava-14.0.1.jar:na]
> at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:294)
> ~[classes/:na]
> at
> org.apache.drill.exec.server.Drillbit$ShutdownThread.run(Drillbit.java:332)
> ~[classes/:na]


On Wed, Aug 5, 2015 at 2:52 PM, Chris Westin 
wrote:

> Given that the difference is just
>
> > java.lang.Exception: Unexpected exception,
> > expected
> but
> > was
>
> The question of "what constitutes an oversized allocation?" comes to mind.
> Is this test fragile relative to being run in different environments?
> I haven't seen the test so how is the determination that something is
> oversized made? It seems like that criterion sometimes fails, and we get an
> OOM because whatever the request is is still very large.
>
>
> On Wed, Aug 5, 2015 at 2:26 PM, Hanifi Gunes  wrote:
>
> > I don't seem to be able to re-prod this. Let me look at this and update
> you
> > all.
> >
> > On Thu, Aug 6, 2015 at 12:03 AM, Abdel Hakim Deneche <
> > adene...@maprtech.com>
> > wrote:
> >
> > > I didn't make any change, I'm running 2 forks (the default). I got
> those
> > > errors 3 times now, 2 on a linux VM and 1 on a linux physical node
> > >
> > > On Wed, Aug 5, 2015 at 1:03 PM, Hanifi Gunes 
> > wrote:
> > >
> > > > Did you tighten your memory settings? How many forks are you running
> > > with?
> > > > I bet you are truly running out of memory while executing this
> > particular
> > > > test case.
> > > >
> > > > -H+
> > > >
> > > > On Wed, Aug 5, 2015 at 8:56 PM, Sudheesh Katkam <
> skat...@maprtech.com>
> > > > wrote:
> > > >
> > > > > b2bbd99 committed on July 6th introduced the test.
> > > > >
> > > > > > On Aug 5, 2015, at 10:21 AM, Jinfeng Ni 
> > > wrote:
> > > > > >
> > > > > > In that case,  we probably need do binary search to figure out
> > which
> > > > > recent
> > > > > > patch is causing this problem.
> > > > > >
> > > > > > On Wed, Aug 5, 2015 at 10:03 AM, Abdel Hakim Deneche <
> > > > > adene...@maprtech.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Just got those errors on master too
> > > > > >>
> > > > > >> On Wed, Aug 5, 2015 at 9:07 AM, Abdel Hakim Deneche <
> > > > > adene...@maprtech.com
> > > > > >>>
> > > > > >> wrote:
> > > > > >>
> > > > > >>> I'm seeing those errors intermittently when building my private
> > > > > branch, I
> > > > > >>> don't believe I made any change that would have caused them.
> > Anyone
> > > > > seen
> > > > > >>> them too ?
> > > > > >>>
> > > > > >>>
> > > > > >>
> > > > >
> > > >
> > >
> >
> testBitVectorReallocation(org.apache.drill.exec.record.vector.TestValueVector)
> > > > >  Time elapsed: 2.043 sec  <<< ERROR!
> > > > >  java.lang.Exception: Unexpected exception,
> > > > > 
> > > >
> expected
> > > > > >> but
> > > > >  was
> > > > >  at java.nio.Bits.reserveMemory(Bits.java:658)
> > > > >  at java.nio.DirectByteBuffer.(DirectByteBuffer.java:123)
> > > > >  at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
> > > > >  at
> > > > > 
> > > > > >>
> > > > >
> > > >
> > >
> >
> io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
> > > > >  at
> > > > > 
> > > > > >>
> > > > >
> > > >
> > >
> >
> io.netty.buffer.UnpooledUnsafeDirectByteBuf.(UnpooledUnsafeDirectByteBuf.java:69)
> > > > >  at
> > > > > 
> > > > > >>
> > > > >
> > > >
> > >
> >
> io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:50)
> > > > >  at
> > > > > 
> > > > > >>
> > > > >
> > > >
> > >
> >
> io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:155)
> > > > >  at
> > > > > 
> > > > > >>
> > > > >
> > > >
> > >
> >
> io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:130)
> > > > >  at
> > > > > 
> > > > > >>
> > > > >
> > > >
> > >
> >
> io.netty.buffer.PooledByteBufAllocatorL.directBuffer

[jira] [Created] (DRILL-3651) Window function should not be allowed in order by clause of over clause of window function

2015-08-14 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3651:
---

 Summary: Window function should not be allowed in order by clause 
of over clause of window function
 Key: DRILL-3651
 URL: https://issues.apache.org/jira/browse/DRILL-3651
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Victoria Markman
Assignee: Jinfeng Ni
Priority: Minor


This should not parse; Drill should throw an error according to the SQL standard, 
"ISO/IEC 9075-2:2011(E) 7.11":
d) If WDX has a window ordering clause, then WDEF shall not specify  (hope I'm reading it correctly)

{code}
SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY c1)) from t1;
{code}

Instead, Drill returns a result:
{code}
0: jdbc:drill:schema=dfs> SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY 
c1)) from t1;
+---------+
| EXPR$0  |
+---------+
| 1       |
| 2       |
| 3       |
| 4       |
| 5       |
| 6       |
| 7       |
| 8       |
| 9       |
| 10      |
+---------+
10 rows selected (0.336 seconds)
{code}

Postgres throws an error in this case:
{code}
postgres=# SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY c1)) from t1;
ERROR:  window functions are not allowed in window definitions
LINE 1: SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY c1)) from...
{code}

Courtesy of the Postgres test suite.
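
A standard-compliant way to express the intended nesting, as an untested sketch, is to compute the inner ranking in a subquery:

{code}
-- Untested sketch: move the inner window function into a subquery so that
-- the outer OVER clause orders by a plain column instead of a window function.
SELECT rank() OVER (ORDER BY r)
FROM (SELECT rank() OVER (ORDER BY c1) AS r FROM t1);
{code}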





[jira] [Created] (DRILL-3650) "Fetch Parquet Footer" times out for seemingly small parquet files

2015-08-14 Thread Deneche A. Hakim (JIRA)
Deneche A. Hakim created DRILL-3650:
---

 Summary: "Fetch Parquet Footer" times out for a seemingly small 
parquet files
 Key: DRILL-3650
 URL: https://issues.apache.org/jira/browse/DRILL-3650
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Reporter: Deneche A. Hakim
Assignee: Steven Phillips


Running the following query in our internal (MapR) test cluster:
{noformat}
select c_integer, sum(c_integer) over(partition by c_varchar order by 
c_integer) from j8 order by c_integer
{noformat}

sometimes fails with the following error:
{noformat}
Failed with exception
java.sql.SQLException: RESOURCE ERROR: Waited for 15000ms, but tasks for 'Fetch 
Parquet Footers' are not complete. Total runnable size 7, parallelism 7.
{noformat}

The j8 table contains 8 parquet files that don't exceed 5 KB each.





Re: Review Request 37482: DRILL-3536: Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions

2015-08-14 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37482/
---

(Updated Aug. 14, 2015, 6:40 p.m.)


Review request for drill and Aman Sinha.


Changes
---

fixed how NTILE output is computed (DRILL-3648)


Bugs: DRILL-3536
https://issues.apache.org/jira/browse/DRILL-3536


Repository: drill-git


Description
---

- added support for the new functions in DefaultFrameTemplate
- use of an internal batch buffer to store values between batches when 
computing LAG
- added new aggregate function "holdLast" to store intermediate values between 
batches when computing FIRST_VALUE
- added unit tests for the new functions
- fixed DRILL-3604, 3605 and 3606
- GenerateTestData is an internal tool used to generate data files and their 
expected results for window function unit tests
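
As an illustration of the kind of queries this adds support for, an untested sketch over a hypothetical emp(dept, name, salary) table:

    -- Untested sketch over a hypothetical emp(dept, name, salary) table,
    -- exercising the window functions added by this change.
    SELECT dept, name, salary,
           LEAD(salary)        OVER (PARTITION BY dept ORDER BY salary) AS next_salary,
           LAG(salary)         OVER (PARTITION BY dept ORDER BY salary) AS prev_salary,
           NTILE(4)            OVER (PARTITION BY dept ORDER BY salary) AS quartile,
           FIRST_VALUE(salary) OVER (PARTITION BY dept ORDER BY salary) AS lowest_in_dept,
           LAST_VALUE(salary)  OVER (PARTITION BY dept ORDER BY salary) AS last_in_frame
    FROM emp;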


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/DefaultFrameTemplate.java
 535deaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/Partition.java
 8d6728e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowDataBatch.java
 5045cb3 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
 9c8cfc0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
 69866af 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFunction.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 04d1231 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
9e09106 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java
 PRE-CREATION 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
 553c4e8 
  exec/java-exec/src/test/resources/window/3604.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3605.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3605.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/3606.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3606.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/3648.parquet PRE-CREATION 
  exec/java-exec/src/test/resources/window/3648.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3648.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4.ntile.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.fval.pby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lag.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lag.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lead.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lead.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lval.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.oby.tsv 528f2f3 
  exec/java-exec/src/test/resources/window/b4.p4.pby.oby.tsv a5d630b 
  exec/java-exec/src/test/resources/window/b4.p4.pby.tsv b2bd5e1 
  exec/java-exec/src/test/resources/window/b4.p4.tsv 1731fe9 
  exec/java-exec/src/test/resources/window/b4.p4/0.data.json e91a75c 
  exec/java-exec/src/test/resources/window/b4.p4/1.data.json 52f375b 
  exec/java-exec/src/test/resources/window/b4.p4/2.data.json 9ecc5ed 
  exec/java-exec/src/test/resources/window/b4.p4/3.data.json 32d2ad1 
  exec/java-exec/src/test/resources/window/fewRowsAllData.parquet PRE-CREATION 
  exec/java-exec/src/test/resources/window/fval.alltypes.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/fval.pby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lag.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lag.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lead.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lead.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lval.alltypes.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lval.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/ntile.sql PRE-CREATION 

Diff: https://reviews.apache.org/r/37482/diff/


Testing
---


Thanks,

abdelhakim deneche



[jira] [Created] (DRILL-3649) LEAD , LAG , NTILE , FIRST_VALUE , LAST_VALUE report RuntimeException for missing OVER clause

2015-08-14 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3649:
-

 Summary: LEAD , LAG , NTILE , FIRST_VALUE , LAST_VALUE report 
RuntimeException for missing OVER clause
 Key: DRILL-3649
 URL: https://issues.apache.org/jira/browse/DRILL-3649
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
 Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
Reporter: Khurram Faraaz
Assignee: Chris Westin


A missing OVER clause must be caught at query planning time; instead we see a 
RuntimeException.
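
The queries below all fail today; for comparison, the intended usage with an explicit OVER clause would look like this untested sketch:

{code}
-- Untested sketch: the same function call with an explicit OVER clause,
-- which is the form these window functions require.
select NTILE(2) over(partition by col7 order by col0) from FEWRWSPQQ_101;
{code}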

{code}
0: jdbc:drill:schema=dfs.tmp> select NTILE(1) from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
SchemaChangeException: Failure while materializing expression. 
Error in expression at index -1.  Error: Missing function implementation: 
[ntile(INT-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.

Fragment 0:0

[Error Id: 5f2f6ffa-7557-447e-a015-63cc87e3e543 on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp> select FIRST_VALUE(1) from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
SchemaChangeException: Failure while materializing expression. 
Error in expression at index -1.  Error: Missing function implementation: 
[first_value(INT-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.

Fragment 0:0

[Error Id: cc52f460-bd85-4588-9f1c-bcf5c6e4729c on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp> select LAST_VALUE(1) from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
SchemaChangeException: Failure while materializing expression. 
Error in expression at index -1.  Error: Missing function implementation: 
[last_value(INT-REQUIRED)].  Full expression: --UNKNOWN EXPRESSION--.

Fragment 0:0

[Error Id: b02c7c59-f9b0-4dc5-89f8-2eb754fcd27b on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp> select LEAD(1) from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
CompileException: Line 89, Column 60: Unknown variable or type "index"

Fragment 0:0

[Error Id: c8375ab9-69ed-4f37-83a4-639d8780762e on centos-04.qa.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp> select LAG(1) from FEWRWSPQQ_101;
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: 
CompileException: Line 89, Column 60: Unknown variable or type "index"

Fragment 0:0

[Error Id: 45d65e74-680d-4332-9b91-4496e0825576 on centos-04.qa.lab:31010]
at

Re: [DISCUSS] Publishing advanced/functional tests

2015-08-14 Thread Ramana I N
So what is the status on this? It would be nice to have this out when 1.2
comes out.

Regards
Ramana



On Wed, Aug 5, 2015 at 11:08 AM, Abhishek Girish 
wrote:

> Ramana,
>
> I think the issue with licenses is mostly resolved. It was discussed that
> for TPC-*, since we shall not be redistributing the data-gen software, but
> distributing a randomized variant of the data generated by it, we should be
> okay to include it part of our framework. For other datasets, we shall
> either provide their copy of license with our framework, or simply provide
> a link for users to download data before they execute.
>
> For now we should focus on having the framework out with minimal cleanup.
> In near future we can work on setting up infrastructure and enhancing the
> framework itself.
>
> -Abhishek
>
> On Wed, Aug 5, 2015 at 10:46 AM, Ramana I N  > wrote:
>
> > @Jacques, Ted
> >
> > in the mean time, we risk patches being merged that have less than
> complete
> > > testing.
> >
> >
> > While I agree with the premise of getting the tests out as soon as
> possible
> > it does not help us achieve anything except transparency. Your statement
> > that getting the tests out will increase quality is dependent on someone
> > actually being able to run the tests once they have access to it.
> >
> > Maybe we should focus on making a jenkins job to run the tests publicly.
> > With that in place we can exclude the TPC* datasets as well as the yelp
> > data sets from the framework and avoid licensing issues.
> >
> > Regards
> > Ramana
> >
> >
> > On Tue, Aug 4, 2015 at 11:39 AM, Abhishek Girish <
> > abhishek.gir...@gmail.com
> > >
> > wrote:
> >
> > > We not only re-distribute external data-sets as-is, but also include
> > > variants for those (text -> parquet, json, ...). So the challenge here
> is
> > > not simply disabling automatic downloads via the framework, and point
> > users
> > > to manually download the files before running the framework, but also
> > about
> > > how we will handle tests which require variants of the data sets. It
> > simply
> > > isn't practical to users of the framework to (1) download data-gen
> > manually
> > > (2) use specific seed / options before generating data, (3) convert
> them
> > to
> > > parquet, etc.. (4) move them to specific locations inside their copy of
> > the
> > > framework.
> > >
> > > Something we'll need to know is how other projects are handling
> > bench-mark
> > > & other external datasets.
> > >
> > > -Abhishek
> > >
> > > On Tue, Aug 4, 2015 at 11:23 AM, rahul challapalli <
> > > challapallira...@gmail.com
> > > wrote:
> > >
> > > > Thanks for your inputs.
> > > >
> > > > One issue with just publishing the tests in their current state is
> > that,
> > > > the framework re-distributes tpch, tpcds, yelp data sets without
> > > requiring
> > > > the users to accept their relevant licenses. A good number of tests
> > uses
> > > > these data sets. Any thoughts on how to handle this?
> > > >
> > > > - Rahul
> > > >
> > > > On Wed, Jul 29, 2015 at 12:07 AM, Ted Dunning  > >
> > > > wrote:
> > > >
> > > > > +1.  Get it out there.
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jul 28, 2015 at 10:12 PM, Jacques Nadeau <
> jacq...@dremio.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Hey Rahul,
> > > > > >
> > > > > > My suggestion would be to the lower bar--do the absolute bare
> > minimum
> > > > to
> > > > > > get the tests out there.  For example, simply remove proprietary
> > > > > > information and then get it on a public github (whether your
> > personal
> > > > > > github or a corporate one).  From there, people can help by
> > > submitting
> > > > > pull
> > > > > > requests to improve the infrastructure and harness.  Making
> things
> > > > easier
> > > > > > is something that can be done over time.  For example, we've had
> > > offers
> > > > > > from a couple different Linux Admins to help on something.  I'm
> > sure
> > > > that
> > > > > > they could help with a number of the items you've identified.  In
> > the
> > > > > mean
> > > > > > time, we risk patches being merged that have less than complete
> > > > testing.
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Jacques Nadeau
> > > > > > CTO and Co-Founder, Dremio
> > > > > >
> > > > > > On Mon, Jul 27, 2015 at 2:16 PM, rahul challapalli <
> > > > > > challapallira...@gmail.com
> > > wrote:
> > > > > >
> > > > > > > Jacques,
> > > > > > >
> > > > > > > I am breaking down steps 1,2 & 3 into sub-tasks so we can
> > > > > add/prioritize
> > > > > > > these tasks
> > > > > > >
> > > > > > > Item # | Task | Sub-Task | Comments | Priority
> > > > > > > 1 | *Publish the tests* | | |
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Remove Proprietary Data & Queries
> > > > > > > 0
> > > > > > >
> > > > > > > Redact Propriety Data/Queries
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Move tests into drill repo
> > > > > > > This requires some refactoring to the framework code since the
> > test
> > > > > > > framework u

[GitHub] drill pull request: DRILL-3616: Memory leak in a cleanup code afte...

2015-08-14 Thread adeneche
GitHub user adeneche opened a pull request:

https://github.com/apache/drill/pull/112

DRILL-3616: Memory leak in a cleanup code after canceling queries wit…

…h window functions spilling to disk

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/adeneche/incubator-drill DRILL-3616

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/112.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #112


commit 4ddcfbb04cadd0e6fe777593d865e4ae1be82ea8
Author: adeneche 
Date:   2015-07-13T16:09:40Z

DRILL-3616: Memory leak in a cleanup code after canceling queries with 
window functions spilling to disk






[jira] [Created] (DRILL-3648) NTILE function returns incorrect results

2015-08-14 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3648:
-

 Summary: NTILE function returns incorrect results
 Key: DRILL-3648
 URL: https://issues.apache.org/jira/browse/DRILL-3648
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.2.0
 Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
Reporter: Khurram Faraaz
Assignee: Chris Westin


The NTILE function returns incorrect results for a larger dataset. I am working 
on reproducing the problem with a smaller dataset.

The inner query that uses NTILE should have divided the rows into two sets 
(tiles): one of (937088 + 1) rows and one of 937088 rows.

{code}
0: jdbc:drill:schema=dfs.tmp> select ntile_key2, count(ntile_key2) from (select 
ntile(2) over(partition by key2 order by key1) ntile_key2 from `twoKeyJsn.json` 
where key2 = 'm') group by ntile_key2;
+-------------+----------+
| ntile_key2  |  EXPR$1  |
+-------------+----------+
| 1           | 1        |
| 2           | 1874176  |
+-------------+----------+
2 rows selected (49.406 seconds)
{code}
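
For reference, an untested restatement of the same query with the counts the description implies should come back:

{code}
-- Untested sketch of the expected result: with 1874177 rows in the 'm'
-- partition, NTILE(2) should produce tiles of 937089 and 937088 rows, i.e.
--   ntile_key2 = 1  ->  937089
--   ntile_key2 = 2  ->  937088
select ntile_key2, count(ntile_key2)
from (select ntile(2) over(partition by key2 order by key1) ntile_key2
      from `twoKeyJsn.json`
      where key2 = 'm')
group by ntile_key2;
{code}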

Explain plan for the inner query that returns wrong results.

{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select ntile(2) over(partition 
by key2 order by key1) from `twoKeyJsn.json` where key2 = 'm';
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      UnionExchange
01-01        Project(EXPR$0=[$0])
01-02          Project($0=[$2])
01-03            Window(window#0=[window(partition {0} order by [1] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [NTILE($2)])])
01-04              SelectionVectorRemover
01-05                Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
01-06                  Project(key2=[$0], key1=[$1])
01-07                    HashToRandomExchange(dist0=[[$0]])
02-01                      UnorderedMuxExchange
03-01                        Project(key2=[$0], key1=[$1], E_X_P_R_H_A_S_H_F_I_E_L_D=[castInt(hash64AsDouble($0))])
03-02                          SelectionVectorRemover
03-03                            Filter(condition=[=($0, 'm')])
03-04                              Scan(groupscan=[EasyGroupScan [selectionRoot=maprfs:/tmp/twoKeyJsn.json, numFiles=1, columns=[`key2`, `key1`], files=[maprfs:///tmp/twoKeyJsn.json]]])
{code}

Total number of rows in the partition that has key2 = 'm':

{code}
0: jdbc:drill:schema=dfs.tmp> select count(key1) from `twoKeyJsn.json` where 
key2 = 'm';
+----------+
|  EXPR$0  |
+----------+
| 1874177  |
+----------+
1 row selected (37.581 seconds)
{code}





Re: Review Request 37482: DRILL-3536: Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions

2015-08-14 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37482/
---

(Updated Aug. 14, 2015, 4:34 p.m.)


Review request for drill and Aman Sinha.


Changes
---

- refactored window functions code generation and value vector materialization 
into separate classes that extend from WindowFunction
- removed "holdLast" aggregate function, using internal batch for FIRST_VALUE 
instead


Bugs: DRILL-3536
https://issues.apache.org/jira/browse/DRILL-3536


Repository: drill-git


Description
---

- added support for the new functions in DefaultFrameTemplate
- use of an internal batch buffer to store values between batches when 
computing LAG
- added new aggregate function "holdLast" to store intermediate values between 
batches when computing FIRST_VALUE
- added unit tests for the new functions
- fixed DRILL-3604, 3605 and 3606
- GenerateTestData is an internal tool used to generate data files and their 
expected results for window function unit tests


Diffs (updated)
-

  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/DefaultFrameTemplate.java
 535deaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/Partition.java
 8d6728e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowDataBatch.java
 5045cb3 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
 9c8cfc0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
 69866af 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFunction.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 04d1231 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
9e09106 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java
 PRE-CREATION 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
 553c4e8 
  exec/java-exec/src/test/resources/window/3604.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3605.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3605.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/3606.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3606.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4.ntile.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.fval.pby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lag.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lag.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lead.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lead.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lval.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.oby.tsv 528f2f3 
  exec/java-exec/src/test/resources/window/b4.p4.pby.oby.tsv a5d630b 
  exec/java-exec/src/test/resources/window/b4.p4.pby.tsv b2bd5e1 
  exec/java-exec/src/test/resources/window/b4.p4.tsv 1731fe9 
  exec/java-exec/src/test/resources/window/b4.p4/0.data.json e91a75c 
  exec/java-exec/src/test/resources/window/b4.p4/1.data.json 52f375b 
  exec/java-exec/src/test/resources/window/b4.p4/2.data.json 9ecc5ed 
  exec/java-exec/src/test/resources/window/b4.p4/3.data.json 32d2ad1 
  exec/java-exec/src/test/resources/window/fewRowsAllData.parquet PRE-CREATION 
  exec/java-exec/src/test/resources/window/fval.alltypes.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/fval.pby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lag.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lag.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lead.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lead.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lval.alltypes.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lval.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/ntile.sql PRE-CREATION 

Diff: https://reviews.apache.org/r/37482/diff/


Testing
---


Thanks,

abdelhakim deneche



Review Request 37482: DRILL-3536: Add support for LEAD, LAG, NTILE, FIRST_VALUE and LAST_VALUE window functions

2015-08-14 Thread abdelhakim deneche

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/37482/
---

Review request for drill and Aman Sinha.


Bugs: DRILL-3536
https://issues.apache.org/jira/browse/DRILL-3536


Repository: drill-git


Description (updated)
---

- added support for the new functions in DefaultFrameTemplate
- use of an internal batch buffer to store values between batches when 
computing LAG
- added new aggregate function "holdLast" to store intermediate values between 
batches when computing FIRST_VALUE
- added unit tests for the new functions
- fixed DRILL-3604, 3605 and 3606
- GenerateTestData is an internal tool used to generate data files and their 
expected results for window function unit tests


Diffs (updated)
-

  exec/java-exec/src/main/codegen/config.fmpp c70f6da 
  exec/java-exec/src/main/codegen/data/HoldLastTypes.tdd PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/HoldDateFunctions.java PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/HoldDecimalFunctions.java 
PRE-CREATION 
  exec/java-exec/src/main/codegen/templates/HoldVarBytesFunctions.java 
PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/DefaultFrameTemplate.java
 535deaa 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/Partition.java
 8d6728e 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowDataBatch.java
 5045cb3 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFrameRecordBatch.java
 9c8cfc0 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFramer.java
 69866af 
  
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/window/WindowFunction.java
 PRE-CREATION 
  
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/UnsupportedOperatorsVisitor.java
 04d1231 
  exec/java-exec/src/test/java/org/apache/drill/exec/TestWindowFunctions.java 
9e09106 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/GenerateTestData.java
 PRE-CREATION 
  
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/window/TestWindowFrame.java
 553c4e8 
  exec/java-exec/src/test/resources/window/3604.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3605.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3605.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/3606.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/3606.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b2.p4.ntile.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.fval.pby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lag.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lag.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lead.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lead.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.lval.pby.oby.tsv PRE-CREATION 
  exec/java-exec/src/test/resources/window/b4.p4.oby.tsv 528f2f3 
  exec/java-exec/src/test/resources/window/b4.p4.pby.oby.tsv a5d630b 
  exec/java-exec/src/test/resources/window/b4.p4.pby.tsv b2bd5e1 
  exec/java-exec/src/test/resources/window/b4.p4.tsv 1731fe9 
  exec/java-exec/src/test/resources/window/b4.p4/0.data.json e91a75c 
  exec/java-exec/src/test/resources/window/b4.p4/1.data.json 52f375b 
  exec/java-exec/src/test/resources/window/b4.p4/2.data.json 9ecc5ed 
  exec/java-exec/src/test/resources/window/b4.p4/3.data.json 32d2ad1 
  exec/java-exec/src/test/resources/window/fewRowsAllData.parquet PRE-CREATION 
  exec/java-exec/src/test/resources/window/fval.alltypes.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/fval.pby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lag.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lag.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lead.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lead.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lval.alltypes.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/lval.pby.oby.sql PRE-CREATION 
  exec/java-exec/src/test/resources/window/ntile.sql PRE-CREATION 

Diff: https://reviews.apache.org/r/37482/diff/


Testing
---


Thanks,

abdelhakim deneche



[jira] [Created] (DRILL-3647) Handle null as input to window function NTILE

2015-08-14 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3647:
-

 Summary: Handle null as input to window function NTILE 
 Key: DRILL-3647
 URL: https://issues.apache.org/jira/browse/DRILL-3647
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.2.0
 Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
Reporter: Khurram Faraaz
Assignee: Chris Westin


We need to handle null as input to window functions. The NTILE function must 
return null as output when its input is null.

{code}
0: jdbc:drill:schema=dfs.tmp> select col7 , col0 , ntile(null) over(partition 
by col7 order by col0) lead_col0 from FEWRWSPQQ_101;
Error: PARSE ERROR: From line 1, column 22 to line 1, column 37: Argument to 
function 'NTILE' must not be NULL


[Error Id: e5e69582-8502-4a99-8ba1-dffdfb8ac028 on centos-04.qa.lab:31010] 
(state=,code=0)
{code}

{code}
0: jdbc:drill:schema=dfs.tmp> select col7 , col0 , lead(null) over(partition by 
col7 order by col0) lead_col0 from FEWRWSPQQ_101;
Error: PARSE ERROR: From line 1, column 27 to line 1, column 30: Illegal use of 
'NULL'


[Error Id: 6824ca01-e3f1-4338-b4c8-5535e7a42e13 on centos-04.qa.lab:31010] 
(state=,code=0)
{code}
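
If typed nulls are accepted where a bare NULL literal is not, the case could presumably be exercised as in this untested sketch (the CAST only gives the NULL a type):

{code}
-- Untested sketch: pass a typed null instead of a bare NULL literal, which
-- the parser rejects as shown above.
select col7, col0,
       lead(cast(null as integer)) over(partition by col7 order by col0) as lead_null
from FEWRWSPQQ_101;
{code}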






[jira] [Created] (DRILL-3646) Show tables in DFS workspace

2015-08-14 Thread Hari Sekhon (JIRA)
Hari Sekhon created DRILL-3646:
--

 Summary: Show tables in DFS workspace
 Key: DRILL-3646
 URL: https://issues.apache.org/jira/browse/DRILL-3646
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata, Storage - Information Schema
Affects Versions: 1.1.0
Reporter: Hari Sekhon
Assignee: Steven Phillips


Drill does not show tables in a DFS workspace, even when I have just created a 
parquet table there using CTAS through Drill itself.

The output is blank: a zero-row table with a blank column header.
{code}
0: jdbc:drill:zk=local> show tables in dfs.hari;
+--+
|  |
+--+
+--+
No rows selected (0.137 seconds)
{code}
although I can still query the table I just created as long as I know it's 
there and query it blindly:
{code}
0: jdbc:drill:zk=local> select count(*) from dfs.hari.auditlogs_parquet_drill;
+---------+
| EXPR$0  |
+---------+
| 2579    |
+---------+
1 row selected (0.129 seconds)
{code}
I can't describe the table, so I really do have to query it blindly too (I 
previously raised a different JIRA for parquet DESCRIBE support, DRILL-3525, and 
for other formats, DRILL-3524 to DRILL-3529):
{code}
0: jdbc:drill:zk=local> select count(*) from dfs.hari.auditlogs_parquet_drill;
+---------+
| EXPR$0  |
+---------+
| 2579    |
+---------+
1 row selected (0.129 seconds)
{code}

This JIRA, though, is specifically about the inability to list tables (or 
perhaps files/dirs) in a DFS workspace.
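
As a possible interim workaround (untested here), SHOW FILES can list the contents of a workspace even when SHOW TABLES returns nothing:

{code}
-- Untested workaround sketch: list the files/directories in the workspace
-- directly, since SHOW TABLES currently returns an empty result.
show files in dfs.hari;
{code}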


