[jira] [Updated] (DRILL-3150) Error when filtering non-existent field with a string

2015-06-01 Thread Adam Gilmore (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Gilmore updated DRILL-3150:

Attachment: DRILL-3150.1.patch.txt

> Error when filtering non-existent field with a string
> -
>
> Key: DRILL-3150
> URL: https://issues.apache.org/jira/browse/DRILL-3150
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.0.0
>Reporter: Adam Gilmore
>Assignee: Adam Gilmore
>Priority: Critical
> Fix For: 1.1.0
>
> Attachments: DRILL-3150.1.patch.txt
>
>
> The following query throws an exception:
> {code}
> select count(*) from cp.`employee.json` where `blah` = 'test'
> {code}
> "blah" does not exist as a field in the JSON.  The expected behaviour would 
> be to filter out all rows as that field is not present (thus cannot equal the 
> string 'test').
> Instead, the following exception occurs:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: test
> Fragment 0:0
> [Error Id: 5d6c9a82-8f87-41b2-a496-67b360302b76 on 
> ip-10-1-50-208.ec2.internal:31010]
> {code}
> Apart from the fact that the real error message is hidden, the issue is that 
> we're trying to cast the varchar to an int ('test' to an int).  This seems to 
> be because the projection out of the scan becomes INT:OPTIONAL when a field 
> is not found.
> The filter should not fail on this - if the varchar fails to convert to an 
> int, the filter should simply not allow any records through.
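A possible workaround, until the projection type is fixed, might be to make the 
comparison explicitly varchar-to-varchar so the string literal is never cast to 
an int (a sketch, not taken from the report; unverified):

{code}
-- Hypothetical workaround: cast the possibly-missing column to varchar so the
-- filter compares varchar to varchar instead of casting 'test' to an int.
select count(*) from cp.`employee.json`
where cast(`blah` as varchar(20)) = 'test';
{code}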



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3150) Error when filtering non-existent field with a string

2015-06-01 Thread Adam Gilmore (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam Gilmore updated DRILL-3150:

Attachment: (was: DRILL-3150.1.patch.txt)

> Error when filtering non-existent field with a string
> -
>
> Key: DRILL-3150
> URL: https://issues.apache.org/jira/browse/DRILL-3150
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.0.0
>Reporter: Adam Gilmore
>Assignee: Adam Gilmore
>Priority: Critical
> Fix For: 1.1.0
>
>
> The following query throws an exception:
> {code}
> select count(*) from cp.`employee.json` where `blah` = 'test'
> {code}
> "blah" does not exist as a field in the JSON.  The expected behaviour would 
> be to filter out all rows as that field is not present (thus cannot equal the 
> string 'test').
> Instead, the following exception occurs:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: test
> Fragment 0:0
> [Error Id: 5d6c9a82-8f87-41b2-a496-67b360302b76 on 
> ip-10-1-50-208.ec2.internal:31010]
> {code}
> Apart from the fact that the real error message is hidden, the issue is that 
> we're trying to cast the varchar to an int ('test' to an int).  This seems to 
> be because the projection out of the scan becomes INT:OPTIONAL when a field 
> is not found.
> The filter should not fail on this - if the varchar fails to convert to an 
> int, the filter should simply not allow any records through.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3216) Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3216:
--
Description: 
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").
- "CHAR" -> "CHARACTER"
- "VARCHAR" -> "CHARACTER VARYING" 
- "VARBINARY" -> "BINARY VARYING"
- "... ARRAY" -> "ARRAY"
- "(...) MAP" -> "MAP"
- "STRUCT (...)" -> "STRUCT"

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change NUMERIC_SCALE from logical null to zero for integer types.
- Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
numeric types.

Update the implementation of JDBC's {{DatabaseMetaData.getColumns()}} (at least 
enough not to break; maybe also to use the newly available data to fix some 
partial implementations).
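
Once the columns listed above exist, a query along these lines might be used to 
eyeball the corrected metadata (a sketch; the table name is hypothetical):

{code}
-- Sketch: inspect the corrected metadata for one table.  ORDINAL_POSITION is
-- one-based after the change, and the logical nulls are real NULLs, not -1.
SELECT ORDINAL_POSITION, COLUMN_NAME, DATA_TYPE, INTERVAL_TYPE,
       CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH,
       NUMERIC_PRECISION, NUMERIC_PRECISION_RADIX, NUMERIC_SCALE,
       DATETIME_PRECISION, INTERVAL_PRECISION
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'my_table'
ORDER BY ORDINAL_POSITION;
{code}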


  was:
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").
- "CHAR" -> "CHARACTER"
- "VARCHAR" -> "CHARACTER VARYING" 
- "VARBINARY" -> "BINARY VARYING"
- "... ARRAY" -> "ARRAY"
- "(...) MAP" -> "MAP"
- "STRUCT (...)" -> "STRUCT"

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change NUMERIC_SCALE from logical null to zero for integer types.
- Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
numeric types.



> Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns
> --
>
> Key: DRILL-3216
> URL: https://issues.apache.org/jira/browse/DRILL-3216

[jira] [Created] (DRILL-3243) Need a better error message - Use of alias in window function definition

2015-06-01 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-3243:
-

 Summary: Need a better error message - Use of alias in window 
function definition
 Key: DRILL-3243
 URL: https://issues.apache.org/jira/browse/DRILL-3243
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.0.0
Reporter: Khurram Faraaz
Assignee: Chris Westin


Need a better error message when an alias is used for the window definition in 
a query that uses window functions. For example, given OVER(PARTITION BY 
columns[0] ORDER BY columns[1]) tmp, if the alias "tmp" is then used in the 
predicate, we need a message saying that column "tmp" does not exist, which is 
how Postgres 9.3 behaves.

Postgres 9.3

{code}
postgres=# select count(*) OVER(partition by type order by id) `tmp` from 
airports where tmp is not null;
ERROR:  column "tmp" does not exist
LINE 1: ...ect count(*) OVER(partition by type order by id) `tmp` from ...
 ^
{code}

Drill 1.0
{code}
0: jdbc:drill:schema=dfs.tmp> select count(*) OVER(partition by columns[2] 
order by columns[0]) tmp from `airports.csv` where tmp is not null;
Error: SYSTEM ERROR: java.lang.IllegalArgumentException: Selected column(s) 
must have name 'columns' or must be plain '*'

Fragment 0:0

[Error Id: 66987b81-fe50-422d-95e4-9ce61c873584 on centos-02.qa.lab:31010] 
(state=,code=0)
{code}
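
One way to express the intended filter is to compute the window function in a 
subquery and apply the predicate in the outer query, where the alias is in 
scope (a sketch against the same hypothetical `airports.csv`):

{code}
-- The alias tmp is not visible in the WHERE clause of the query that defines
-- it (window functions are evaluated after WHERE), so filter one level up.
select tmp
from (
  select count(*) over (partition by columns[2] order by columns[0]) as tmp
  from `airports.csv`
) t
where tmp is not null;
{code}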



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3242) Enhance RPC layer to offload all request work onto a separate thread.

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3242:
-

 Summary: Enhance RPC layer to offload all request work onto a 
separate thread.
 Key: DRILL-3242
 URL: https://issues.apache.org/jira/browse/DRILL-3242
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - RPC
Reporter: Jacques Nadeau
Assignee: Jacques Nadeau
 Fix For: 1.1.0


Right now, the app is responsible for ensuring that only very small amounts of 
work are done on the RPC thread.  In some cases, the app doesn't do this 
correctly.  Additionally, in high-load situations these small amounts of work 
become non-trivial.  As such, we need to make the RPC layer protect itself from 
slow requests/responses.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3241) Query with window function runs out of direct memory and does not report back to client that it did

2015-06-01 Thread Victoria Markman (JIRA)
Victoria Markman created DRILL-3241:
---

 Summary: Query with window function runs out of direct memory and 
does not report back to client that it did
 Key: DRILL-3241
 URL: https://issues.apache.org/jira/browse/DRILL-3241
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.0.0
Reporter: Victoria Markman
Assignee: Chris Westin


Even though the query ran out of memory and was cancelled on the server, the 
client (sqlline) was never notified of the event, and it appears to the user 
that the query is hung.

Configuration:
Single drillbit configured with:
DRILL_MAX_DIRECT_MEMORY="2G"
DRILL_HEAP="1G"
TPCDS100 parquet files

Query:
{code}
select 
  sum(ss_quantity) over(partition by ss_store_sk order by ss_sold_date_sk) 
from store_sales;
{code}

drillbit.log
{code}
2015-06-01 21:42:29,514 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler - Exception in RPC communication.  Connection: /10.10.88.133:31012 <--> /10.10.88.133:38887 (data server).  Closing connection.
io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:233) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847) [netty-transport-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:618) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
    at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:329) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
    at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:250) [netty-transport-native-epoll-4.0.27.Final-linux-x86_64.jar:na]
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111) [netty-common-4.0.27.Final.jar:4.0.27.Final]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
Caused by: java.lang.OutOfMemoryError: Direct buffer memory
    at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_71]
    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) ~[na:1.7.0_71]
    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) ~[na:1.7.0_71]
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.PoolArena.reallocate(PoolArena.java:280) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.PooledByteBuf.capacity(PooledByteBuf.java:110) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.WrappedByteBuf.writeBytes(WrappedByteBuf.java:600) ~[netty-buffer-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.buffer.UnsafeDirectLittleEndian.writeBytes(UnsafeDirectLittleEndian.java:28) ~[drill-java-exec-1.0.0-mapr-r1-rebuffed.jar:4.0.27.Final]
    at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:227) ~[netty-codec-4.0.27.Final.jar:4.0.27.Final]
{code}

[jira] [Created] (DRILL-3240) Fetch hadoop maven profile specific Hive version in Hive storage plugin

2015-06-01 Thread Venki Korukanti (JIRA)
Venki Korukanti created DRILL-3240:
--

 Summary: Fetch hadoop maven profile specific Hive version in Hive 
storage plugin
 Key: DRILL-3240
 URL: https://issues.apache.org/jira/browse/DRILL-3240
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Hive, Tools, Build & Test
Affects Versions: 0.4.0
Reporter: Venki Korukanti
Assignee: Venki Korukanti
Priority: Minor
 Fix For: 1.1.0


Currently we always fetch the Apache Hive libs irrespective of the Hadoop 
vendor profile used in {{mvn clean install}}. This JIRA is to allow specifying 
a custom version of Hive in the Hadoop vendor profile.

Note: The Hive storage plugin assumes there are no major differences in the 
Hive APIs between the different vendor-specific custom Hive builds. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3239) Join between empty hive tables throws an IllegalStateException

2015-06-01 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-3239:


 Summary: Join between empty hive tables throws an 
IllegalStateException
 Key: DRILL-3239
 URL: https://issues.apache.org/jira/browse/DRILL-3239
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive
Reporter: Rahul Challapalli
Assignee: Venki Korukanti
 Attachments: error.log

git.commit.id.abbrev=6f54223

Created 2 Hive tables on top of TPC-H data in ORC format. The tables are empty. 
The query below returns 0 rows from Hive; however, it fails with an 
IllegalStateException from Drill:

{code}
select * from customer c, orders o where c.c_custkey = o.o_custkey;
Error: SYSTEM ERROR: java.lang.IllegalStateException: You tried to do a batch 
data read operation when you were in a state of NONE.  You can only do this 
type of operation when you are in a state of OK or OK_NEW_SCHEMA.

Fragment 0:0

[Error Id: 8483cab2-d771-4337-ae65-1db41eb5720d on qa-node191.qa.lab:31010] 
(state=,code=0)
{code}

Below is the Hive DDL I used:
{code}
create table if not exists tpch01_orc.customer (
c_custkey int,
c_name string,
c_address string,
c_nationkey int,
c_phone string,
c_acctbal double,
c_mktsegment string,
c_comment string
)
STORED AS orc
LOCATION '/drill/testdata/Tpch0.01/orc/customer';

create table if not exists tpch01_orc.orders (
o_orderkey int,
o_custkey int,
o_orderstatus string,
o_totalprice double,
o_orderdate date,
o_orderpriority string,
o_clerk string,
o_shippriority int,
o_comment string
)
STORED AS orc
LOCATION '/drill/testdata/Tpch0.01/orc/orders';
{code}

I have attached the log files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3239) Join between empty hive tables throws an IllegalStateException

2015-06-01 Thread Rahul Challapalli (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rahul Challapalli updated DRILL-3239:
-
Attachment: error.log

> Join between empty hive tables throws an IllegalStateException
> --
>
> Key: DRILL-3239
> URL: https://issues.apache.org/jira/browse/DRILL-3239
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Reporter: Rahul Challapalli
>Assignee: Venki Korukanti
> Attachments: error.log
>
>
> git.commit.id.abbrev=6f54223
> Created 2 Hive tables on top of TPC-H data in ORC format. The tables are 
> empty. The query below returns 0 rows from Hive; however, it fails with an 
> IllegalStateException from Drill:
> {code}
> select * from customer c, orders o where c.c_custkey = o.o_custkey;
> Error: SYSTEM ERROR: java.lang.IllegalStateException: You tried to do a batch 
> data read operation when you were in a state of NONE.  You can only do this 
> type of operation when you are in a state of OK or OK_NEW_SCHEMA.
> Fragment 0:0
> [Error Id: 8483cab2-d771-4337-ae65-1db41eb5720d on qa-node191.qa.lab:31010] 
> (state=,code=0)
> {code}
> Below is the Hive DDL I used:
> {code}
> create table if not exists tpch01_orc.customer (
> c_custkey int,
> c_name string,
> c_address string,
> c_nationkey int,
> c_phone string,
> c_acctbal double,
> c_mktsegment string,
> c_comment string
> )
> STORED AS orc
> LOCATION '/drill/testdata/Tpch0.01/orc/customer';
> create table if not exists tpch01_orc.orders (
> o_orderkey int,
> o_custkey int,
> o_orderstatus string,
> o_totalprice double,
> o_orderdate date,
> o_orderpriority string,
> o_clerk string,
> o_shippriority int,
> o_comment string
> )
> STORED AS orc
> LOCATION '/drill/testdata/Tpch0.01/orc/orders';
> {code}
> I have attached the log files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3238) Cannot Plan Exception is raised when the same window partition is defined in select & window clauses

2015-06-01 Thread Victoria Markman (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victoria Markman updated DRILL-3238:

Labels: window_functions  (was: )

> Cannot Plan Exception is raised when the same window partition is defined in 
> select & window clauses
> 
>
> Key: DRILL-3238
> URL: https://issues.apache.org/jira/browse/DRILL-3238
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
>  Labels: window_functions
>
> While this works:
> {code}
> select sum(a2) over(partition by a2 order by a2), count(*) over(partition by 
> a2 order by a2) 
> from t
> {code}
> this one fails:
> {code}
> select sum(a2) over(w), count(*) over(partition by a2 order by a2) 
> from t
> window w as (partition by a2 order by a2)
> {code}
> Notice these two queries are logically the same if we plug the window 
> definition back into the SELECT clause of the second query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3238) Cannot Plan Exception is raised when the same window partition is defined in select & window clauses

2015-06-01 Thread Victoria Markman (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568055#comment-14568055
 ] 

Victoria Markman commented on DRILL-3238:
-

Interestingly, this case "over W" works:

{code}
select sum(a2) over w, count(*) over(partition by a2 order by a2) from t2 
window w as (partition by a2 order by a2);
{code}

I did not realize that over(W) is supported grammar ...

> Cannot Plan Exception is raised when the same window partition is defined in 
> select & window clauses
> 
>
> Key: DRILL-3238
> URL: https://issues.apache.org/jira/browse/DRILL-3238
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
>
> While this works:
> {code}
> select sum(a2) over(partition by a2 order by a2), count(*) over(partition by 
> a2 order by a2) 
> from t
> {code}
> this one fails:
> {code}
> select sum(a2) over(w), count(*) over(partition by a2 order by a2) 
> from t
> window w as (partition by a2 order by a2)
> {code}
> Notice these two queries are logically the same if we plug the window 
> definition back into the SELECT clause of the second query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3238) Cannot Plan Exception is raised when the same window partition is defined in select & window clauses

2015-06-01 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu updated DRILL-3238:
-
Description: 
While this works:
{code}
select sum(a2) over(partition by a2 order by a2), count(*) over(partition by a2 
order by a2) 
from t
{code}
this one fails:

{code}
select sum(a2) over(w), count(*) over(partition by a2 order by a2) 
from t
window w as (partition by a2 order by a2)
{code}

Notice these two queries are logically the same if we plug the window 
definition back into the SELECT clause of the second query.

> Cannot Plan Exception is raised when the same window partition is defined in 
> select & window clauses
> 
>
> Key: DRILL-3238
> URL: https://issues.apache.org/jira/browse/DRILL-3238
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
>
> While this works:
> {code}
> select sum(a2) over(partition by a2 order by a2), count(*) over(partition by 
> a2 order by a2) 
> from t
> {code}
> this one fails:
> {code}
> select sum(a2) over(w), count(*) over(partition by a2 order by a2) 
> from t
> window w as (partition by a2 order by a2)
> {code}
> Notice these two queries are logically the same if we plug the window 
> definition back into the SELECT clause of the second query.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3238) Cannot Plan Exception is raised when the same window partition is defined in select & window clauses

2015-06-01 Thread Sean Hsuan-Yi Chu (JIRA)
Sean Hsuan-Yi Chu created DRILL-3238:


 Summary: Cannot Plan Exception is raised when the same window 
partition is defined in select & window clauses
 Key: DRILL-3238
 URL: https://issues.apache.org/jira/browse/DRILL-3238
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Sean Hsuan-Yi Chu
Assignee: Sean Hsuan-Yi Chu






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3155) Composite vectors leak memory

2015-06-01 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3155:
---
Assignee: Hanifi Gunes  (was: Mehant Baid)

> Composite vectors leak memory
> -
>
> Key: DRILL-3155
> URL: https://issues.apache.org/jira/browse/DRILL-3155
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Hanifi Gunes
> Fix For: 1.1.0
>
> Attachments: DRILL-3155-1.patch, DRILL-3155-2.patch
>
>
> While allocating memory for variable-width vectors, we first allocate the 
> memory needed for the actual data, followed by the memory needed for the 
> offset vector. However, if the allocation for the data buffer succeeds and 
> the one for the offset vector fails, we don't release the buffer allocated 
> for the data, causing a memory leak.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3155) Composite vectors leak memory

2015-06-01 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3155:
---
Summary: Composite vectors leak memory  (was: Variable width vectors leak 
memory)

> Composite vectors leak memory
> -
>
> Key: DRILL-3155
> URL: https://issues.apache.org/jira/browse/DRILL-3155
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
> Fix For: 1.1.0
>
> Attachments: DRILL-3155-1.patch, DRILL-3155-2.patch
>
>
> While allocating memory for variable-width vectors, we first allocate the 
> memory needed for the actual data, followed by the memory needed for the 
> offset vector. However, if the allocation for the data buffer succeeds and 
> the one for the offset vector fails, we don't release the buffer allocated 
> for the data, causing a memory leak.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3155) Composite vectors leak memory

2015-06-01 Thread Mehant Baid (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mehant Baid updated DRILL-3155:
---
Attachment: DRILL-3155-2.patch
DRILL-3155-1.patch

The first patch is a minor refactoring patch that moves the classes into the 
correct package.
The second patch fixes the issue.

> Composite vectors leak memory
> -
>
> Key: DRILL-3155
> URL: https://issues.apache.org/jira/browse/DRILL-3155
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Mehant Baid
>Assignee: Mehant Baid
> Fix For: 1.1.0
>
> Attachments: DRILL-3155-1.patch, DRILL-3155-2.patch
>
>
> While allocating memory for variable-width vectors, we first allocate the 
> memory needed for the actual data, followed by the memory needed for the 
> offset vector. However, if the allocation for the data buffer succeeds and 
> the one for the offset vector fails, we don't release the buffer allocated 
> for the data, causing a memory leak.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3216) Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3216:
--
Description: 
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").
- "CHAR" -> "CHARACTER"
- "VARCHAR" -> "CHARACTER VARYING" 
- "VARBINARY" -> "BINARY VARYING"
- "... ARRAY" -> "ARRAY"
- "(...) MAP" -> "MAP"
- "STRUCT (...)" -> "STRUCT"

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change NUMERIC_SCALE from logical null to zero for integer types.
- Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
numeric types.


  was:
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change NUMERIC_SCALE from logical null to zero for integer types.
- Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
numeric types.



> Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns
> --
>
> Key: DRILL-3216
> URL: https://issues.apache.org/jira/browse/DRILL-3216
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> [Editing in progress]
> Change logical null from {{-1}} to actual {{NULL}}:
> - Change column {{CHARACTER_MAXIMUM_LENGTH}}.
> - Change column {{NUMERIC_PRECISION}}.
> - Change column {{NUMERIC_PRECISION_RADIX}}.

[jira] [Commented] (DRILL-2658) Add ilike and regex substring functions

2015-06-01 Thread Patrick Toole (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567881#comment-14567881
 ] 

Patrick Toole commented on DRILL-2658:
--

It appears the alias version of "substr" ("substring") does not work:
0: jdbc:drill:> select substr('a','b') from sys.version;
+---------+
| EXPR$0  |
+---------+
| null    |
+---------+
1 row selected (0.288 seconds)
0: jdbc:drill:> select substring('a','b') from sys.version;
Error: PARSE ERROR: From line 1, column 8 to line 1, column 25: Cannot apply 
'SUBSTRING' to arguments of type 'SUBSTRING( FROM )'. 
Supported form(s): 'SUBSTRING( FROM )'
'SUBSTRING( FROM  FOR )'
'SUBSTRING( FROM )'
'SUBSTRING( FROM  FOR )'
'SUBSTRING( FROM )'
'SUBSTRING( FROM  FOR )'
'SUBSTRING( FROM )'
'SUBSTRING( FROM  FOR )'



> Add ilike and regex substring functions
> ---
>
> Key: DRILL-2658
> URL: https://issues.apache.org/jira/browse/DRILL-2658
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Functions - Drill
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2658.patch, DRILL-2658.patch
>
>
> This will not modify the parser, so Postgres syntax such as:
> "... where c ILIKE '%ABC%'"
> will not currently be supported. It will simply be a function:
> "... where ILIKE(c, '%ABC%')"
> Same for substring:
> "select substr(c, 'abc')..."
> will be equivalent to the Postgres
> "select substr(c from 'abc')",
> but 'abc' will be treated as a Java regex pattern.
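
For reference, a short usage sketch of the function-style forms described above 
(illustrative queries; cp.`employee.json` is the sample file shipped with 
Drill):

{code}
-- ILIKE as a function: case-insensitive LIKE.
select full_name from cp.`employee.json` where ilike(full_name, '%ADAM%');
-- substr with a string second argument treated as a Java regex pattern,
-- returning the first matching substring.
select substr(full_name, '[A-Z][a-z]+') from cp.`employee.json`;
{code}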



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2658) Add ilike and regex substring functions

2015-06-01 Thread Patrick Toole (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567866#comment-14567866
 ] 

Patrick Toole commented on DRILL-2658:
--

This form breaks other downstream items:

SELECT * FROM table_name WHERE column_name ilike '4\t' ESCAPE '\'

 

> Add ilike and regex substring functions
> ---
>
> Key: DRILL-2658
> URL: https://issues.apache.org/jira/browse/DRILL-2658
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Functions - Drill
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.0.0
>
> Attachments: DRILL-2658.patch, DRILL-2658.patch
>
>
> This will not modify the parser, so Postgres syntax such as:
> "... where c ILIKE '%ABC%'"
> will not currently be supported. It will simply be a function:
> "... where ILIKE(c, '%ABC%')"
> Same for substring:
> "select substr(c, 'abc')..."
> will be equivalent to the Postgres
> "select substr(c from 'abc')",
> but 'abc' will be treated as a Java regex pattern.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3237) Come up with enhanced AbstractRecordBatch and AbstractSingleRecordBatch to better handle type promotion and schema change

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3237:
-

 Summary: Come up with enhanced AbstractRecordBatch and 
AbstractSingleRecordBatch to better handle type promotion and schema change
 Key: DRILL-3237
 URL: https://issues.apache.org/jira/browse/DRILL-3237
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Jacques Nadeau






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3236) Enhance JSON writer to write EmbeddedType

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3236:
-

 Summary: Enhance JSON writer to write EmbeddedType
 Key: DRILL-3236
 URL: https://issues.apache.org/jira/browse/DRILL-3236
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Jacques Nadeau






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3235) Enhance JSON reader to leverage EmbeddedType

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3235:
-

 Summary: Enhance JSON reader to leverage EmbeddedType
 Key: DRILL-3235
 URL: https://issues.apache.org/jira/browse/DRILL-3235
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Jacques Nadeau






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3234) Drill fails to implicit cast hive tinyint and smallint data as int

2015-06-01 Thread Krystal (JIRA)
Krystal created DRILL-3234:
--

 Summary: Drill fails to implicit cast hive tinyint and smallint 
data as int
 Key: DRILL-3234
 URL: https://issues.apache.org/jira/browse/DRILL-3234
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive
Affects Versions: 1.0.0
Reporter: Krystal
Assignee: Venki Korukanti


I have the following hive table:
 describe `hive.default`.voter_hive;
+----------------+------------+--------------+
|  COLUMN_NAME   | DATA_TYPE  | IS_NULLABLE  |
+----------------+------------+--------------+
| voter_id       | SMALLINT   | YES          |
| name           | VARCHAR    | YES          |
| age            | TINYINT    | YES          |
| registration   | VARCHAR    | YES          |
| contributions  | DECIMAL    | YES          |
| voterzone      | INTEGER    | YES          |
| create_time    | TIMESTAMP  | YES          |
+----------------+------------+--------------+

If I just include the voter_id and age fields in the select, the query works 
fine.  However, if I include them in the where clause, the query fails.  For 
example:

select voter_id, name, age from voter_hive where age < 30;
Error: SYSTEM ERROR: org.apache.drill.exec.exception.SchemaChangeException: 
Failure while trying to materialize incoming schema.  Errors:
 
Error in expression at index -1.  Error: Missing function implementation: 
[castINT(TINYINT-OPTIONAL)].  Full expression: --UNKNOWN EXPRESSION--..
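
The predicate implicitly promotes the TINYINT column to INT so it can be 
compared with the integer literal; written out explicitly, the cast that Drill 
fails to materialize looks like this (illustrative only):

{code}
-- Equivalent explicit form of the failing implicit cast
-- (castINT over a TINYINT-OPTIONAL column):
select voter_id, name, age from voter_hive where cast(age as int) < 30;
{code}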




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3233) Update code generation & function code to support reading and writing embedded type

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3233:
-

 Summary: Update code generation & function code to support reading 
and writing embedded type
 Key: DRILL-3233
 URL: https://issues.apache.org/jira/browse/DRILL-3233
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Jacques Nadeau






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3232) Modify existing vectors to allow type promotion

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3232:
-

 Summary: Modify existing vectors to allow type promotion
 Key: DRILL-3232
 URL: https://issues.apache.org/jira/browse/DRILL-3232
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Jacques Nadeau


Support the ability for existing vectors to be promoted, similar to the 
supported implicit casting rules.

For example:

INT > DOUBLE > STRING > EMBEDDED
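
For instance, a JSON field whose type varies across records would widen along 
that chain instead of failing on a schema change (an illustrative sketch; the 
path and data are hypothetical):

{code}
-- /tmp/mixed.json (hypothetical input):
--   {"v": 1}
--   {"v": 2.5}
--   {"v": "three"}
-- With promotion, "v" could be widened INT > DOUBLE > STRING while reading:
select v from dfs.`/tmp/mixed.json`;
{code}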




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3231) Throw better error messages for schema changes

2015-06-01 Thread Hanifi Gunes (JIRA)
Hanifi Gunes created DRILL-3231:
---

 Summary: Throw better error messages for schema changes
 Key: DRILL-3231
 URL: https://issues.apache.org/jira/browse/DRILL-3231
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 1.0.0
Reporter: Hanifi Gunes
Assignee: Hanifi Gunes


This task is concerned with making error messages more intelligible, especially 
in the case of schema changes.

{code:title=current error message}
Error: DATA_READ ERROR: Error parsing JSON - You tried to write a BigInt
type when you are using a ValueWriter of type NullableFloat8WriterImpl.
{code}

The proposed message should be non-technical, possibly with more context to 
help investigate the problem, such as the line and column number and the field 
name.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3216) Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3216:
--
Summary: Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns  (was: Fix 
existing INFORMATION_SCHEMA.COLUMNS columns)

> Fix existing(+) INFORMATION_SCHEMA.COLUMNS columns
> --
>
> Key: DRILL-3216
> URL: https://issues.apache.org/jira/browse/DRILL-3216
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> [Editing in progress]
> Change logical null from {{-1}} to actual {{NULL}}:
> - Change column {{CHARACTER_MAXIMUM_LENGTH}}.
> - Change column {{NUMERIC_PRECISION}}.
> - Change column {{NUMERIC_PRECISION_RADIX}}.
> - Change column {{NUMERIC_SCALE}}.
> Change column {{ORDINAL_POSITION}} from zero-based to one-based.
> Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified 
> names (e.g., "CHARACTER").
> Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to 
> "INTERVAL":
> - Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
> - Add column {{INTERVAL_TYPE}}.
> Move {{CHAR}} length from {{NUMERIC_PRECISION}} to 
> {{CHARACTER_MAXIMUM_LENGTH}} (same as {{VARCHAR}} length):
> - Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
> - Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
> CHAR.
> Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
> {{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
> - Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
> and VARBINARY.
> - Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
> BINARY and VARBINARY.
> To correct ordinal position of some existing columns:
> - Add column {{COLUMN_DEFAULT}}.
> - Add column {{CHARACTER_OCTET_LENGTH}}.
> - Reorder column {{NUMERIC_PRECISION}}.
> Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
> {{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
> - Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
> interval types.
> - Add column {{DATETIME_PRECISION}}.
> - Add column {{INTERVAL_PRECISION}}.
> Implement {{NUMERIC_PRECISION_RADIX}}:
> - Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
> appropriate values (2, 10, NULL).
> Add missing numeric precision and scale values (for non-DECIMAL types):
> - Change NUMERIC_SCALE from logical null to zero for integer types.
> - Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
> numeric types.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3130) Project can be pushed below union all / union to improve performance

2015-06-01 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-3130.
--
  Resolution: Fixed
   Fix Version/s: 1.1.0
Target Version/s:   (was: Future)

> Project can be pushed below union all / union to improve performance
> 
>
> Key: DRILL-3130
> URL: https://issues.apache.org/jira/browse/DRILL-3130
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.1.0
>
>
> A query such as 
> {code}
> Select a from 
> (select a, b, c, ..., union all select a, b, c, ...)
> {code}
> will perform Union-All over all the specified columns on the two sides, 
> despite the fact that only one column is asked for at the end. Ideally, we 
> should apply a ProjectPushDown rule for Union & Union-All to avoid 
> generating results that will be discarded at the end.
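
For illustration, pushing the project below the union makes the plan 
equivalent to projecting only the requested column on each input (a sketch 
with hypothetical tables t1 and t2):

{code}
-- Logically equivalent pushed-down form: each side of the UNION ALL projects
-- only column a, so the unused columns b and c are never materialized.
select a from (select a from t1 union all select a from t2);
{code}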



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3130) Project can be pushed below union all / union to improve performance

2015-06-01 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567769#comment-14567769
 ] 

Sean Hsuan-Yi Chu commented on DRILL-3130:
--

Review completed at :
https://reviews.apache.org/r/34528/

Commit#: bca20655283d351d5f5c4090e9047419ff22c75e

> Project can be pushed below union all / union to improve performance
> 
>
> Key: DRILL-3130
> URL: https://issues.apache.org/jira/browse/DRILL-3130
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Sean Hsuan-Yi Chu
>
> A query such as 
> {code}
> Select a from 
> (select a, b, c, ..., union all select a, b, c, ...)
> {code}
> will perform Union-All over all the specified columns on the two sides, 
> despite the fact that only one column is asked for at the end. Ideally, we 
> should apply a ProjectPushDown rule for Union & Union-All to avoid 
> generating results that will be discarded at the end.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Issue Comment Deleted] (DRILL-3216) Fix existing INFORMATION_SCHEMA.COLUMNS columns

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3216:
--
Comment: was deleted

(was: Note:  Regarding INFORMATION_SCHEMA.COLUMNS.DATA_TYPE:

Need to analyze:
1) ISO/IEC 9075-2:2011(E) section 9.9, "Type name determination" (which defines 
some kind of type name (an  return value) which uses short forms, 
e.g., "VARCHAR"), and uses of (explicit or implement references to) it, versus 
2) DEFINITION_SCHEMA's DATA_TYPE_DESCRIPTOR base table's constraint  
DATA_TYPE_DESCRIPTOR_DATA_TYPE_CHECK_COMBINATIONS (which clearly requires long 
forms, e.g., 'CHARACTER VARYING').)

> Fix existing INFORMATION_SCHEMA.COLUMNS columns
> ---
>
> Key: DRILL-3216
> URL: https://issues.apache.org/jira/browse/DRILL-3216
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> [Editing in progress]
> Change logical null from {{-1}} to actual {{NULL}}:
> - Change column {{CHARACTER_MAXIMUM_LENGTH}}.
> - Change column {{NUMERIC_PRECISION}}.
> - Change column {{NUMERIC_PRECISION_RADIX}}.
> - Change column {{NUMERIC_SCALE}}.
> Change column {{ORDINAL_POSITION}} from zero-based to one-based.
> Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified 
> names (e.g., "CHARACTER").
> Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to 
> "INTERVAL":
> - Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
> - Add column {{INTERVAL_TYPE}}.
> Move {{CHAR}} length from {{NUMERIC_PRECISION}} to 
> {{CHARACTER_MAXIMUM_LENGTH}} (same as {{VARCHAR}} length):
> - Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
> - Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
> CHAR.
> Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
> {{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
> - Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
> and VARBINARY.
> - Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
> BINARY and VARBINARY.
> To correct ordinal position of some existing columns:
> - Add column {{COLUMN_DEFAULT}}.
> - Add column {{CHARACTER_OCTET_LENGTH}}.
> - Reorder column {{NUMERIC_PRECISION}}.
> Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
> {{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
> - Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
> interval types.
> - Add column {{DATETIME_PRECISION}}.
> - Add column {{INTERVAL_PRECISION}}.
> Implement {{NUMERIC_PRECISION_RADIX}}:
> - Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
> appropriate values (2, 10, NULL).
> Add missing numeric precision and scale values (for non-DECIMAL types):
> - Change NUMERIC_SCALE from logical null to zero for integer types.
> - Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
> numeric types.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2746) Filter is not pushed into subquery past UNION ALL

2015-06-01 Thread Sean Hsuan-Yi Chu (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567767#comment-14567767
 ] 

Sean Hsuan-Yi Chu commented on DRILL-2746:
--

Review completed at :
https://reviews.apache.org/r/34528/

Commit#: bca20655283d351d5f5c4090e9047419ff22c75e

> Filter is not pushed into subquery past UNION ALL
> -
>
> Key: DRILL-2746
> URL: https://issues.apache.org/jira/browse/DRILL-2746
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.1.0
>
>
> I expected to see the filter pushed to at least the left side of UNION ALL; 
> instead it is applied after UNION ALL:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select * from (select a1, b1, c1 
> from t1 union all select a2, b2, c2 from t2 )  where a1 = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(a1=[$0], b1=[$1], c1=[$2])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[=($0, 10)])
> 00-04UnionAll(all=[true])
> 00-06  Project(a1=[$2], b1=[$1], c1=[$0])
> 00-08Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> 00-05  Project(a2=[$1], b2=[$0], c2=[$2])
> 00-07Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t2]], 
> selectionRoot=/drill/testdata/predicates/t2, numFiles=1, columns=[`a2`, `b2`, 
> `c2`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2746) Filter is not pushed into subquery past UNION ALL

2015-06-01 Thread Sean Hsuan-Yi Chu (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Hsuan-Yi Chu resolved DRILL-2746.
--
   Resolution: Fixed
Fix Version/s: (was: 1.2.0)
   1.1.0

> Filter is not pushed into subquery past UNION ALL
> -
>
> Key: DRILL-2746
> URL: https://issues.apache.org/jira/browse/DRILL-2746
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.1.0
>
>
> I expected to see the filter pushed to at least the left side of UNION ALL; 
> instead it is applied after UNION ALL:
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select * from (select a1, b1, c1 
> from t1 union all select a2, b2, c2 from t2 )  where a1 = 10;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(a1=[$0], b1=[$1], c1=[$2])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[=($0, 10)])
> 00-04UnionAll(all=[true])
> 00-06  Project(a1=[$2], b1=[$1], c1=[$0])
> 00-08Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t1]], 
> selectionRoot=/drill/testdata/predicates/t1, numFiles=1, columns=[`a1`, `b1`, 
> `c1`]]])
> 00-05  Project(a2=[$1], b2=[$0], c2=[$2])
> 00-07Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/drill/testdata/predicates/t2]], 
> selectionRoot=/drill/testdata/predicates/t2, numFiles=1, columns=[`a2`, `b2`, 
> `c2`]]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3216) Fix existing INFORMATION_SCHEMA.COLUMNS columns

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3216:
--
Description: 
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change NUMERIC_SCALE from logical null to zero for integer types.
- Change NUMERIC_PRECISION from logical null to precision for non-DECIMAL 
numeric types.


  was:
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change {{NUMERIC_SCALE}} from logical null to zero for integer types.
- Change {{NUMERIC_PRECISION}} from logical null to precision for non-DECIMAL 
numeric types.



> Fix existing INFORMATION_SCHEMA.COLUMNS columns
> ---
>
> Key: DRILL-3216
> URL: https://issues.apache.org/jira/browse/DRILL-3216
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> [Editing in progress]
> Change logical null from {{-1}} to actual {{NULL}}:
> - Change column {{CHARACTER_MAXIMUM_LENGTH}}.
> - Change column {{NUMERIC_PRECISION}}.
> - Change column {{NUMERIC_PRECISION_RADIX}}.
> - Change column {{NUMERIC_SCALE}}.
> Change column {{ORDINAL_POSITION}} from zero-based to one-based.
> Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified 
> names (e.g., "CHARACTER").

[jira] [Updated] (DRILL-3216) Fix existing INFORMATION_SCHEMA.COLUMNS columns

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3216:
--
Description: 
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified names 
(e.g., "CHARACTER").

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change {{NUMERIC_SCALE}} from logical null to zero for integer types.
- Change {{NUMERIC_PRECISION}} from logical null to precision for non-DECIMAL 
numeric types.


  was:
[Editing in progress]

Change logical null from {{-1}} to actual {{NULL}}:
- Change column {{CHARACTER_MAXIMUM_LENGTH}}.
- Change column {{NUMERIC_PRECISION}}.
- Change column {{NUMERIC_PRECISION_RADIX}}.
- Change column {{NUMERIC_SCALE}}.

Change column {{ORDINAL_POSITION}} from zero-based to one-based.

Move {{CHAR}} length from {{NUMERIC_PRECISION}} to {{CHARACTER_MAXIMUM_LENGTH}} 
(same as {{VARCHAR}} length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for CHAR.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
CHAR.

Move {{BINARY}} and {{VARBINARY}} length from {{NUMERIC_PRECISION}} to 
{{CHARACTER_MAXIMUM_LENGTH}} (same as CHAR and VARCHAR length):
- Change column {{NUMERIC_PRECISION}} from length to logical null for BINARY 
and VARBINARY.
- Change column {{CHARACTER_MAXIMUM_LENGTH}} from logical null to length for 
BINARY and VARBINARY.

Fix data type names "INTERVAL_DAY_TIME" and "INTERVAL_YEAR_MONTH" to "INTERVAL":
- Change column {{DATA_TYPE}} to list "INTERVAL" for interval types.
- Add column {{INTERVAL_TYPE}}.

To correct ordinal position of some existing columns:
- Add column {{COLUMN_DEFAULT}}.
- Add column {{CHARACTER_OCTET_LENGTH}}.
- Reorder column {{NUMERIC_PRECISION}}.

Move date/time and interval precisions from {{NUMERIC_PRECISION}} to 
{{DATETIME_PRECISION}} and {{INTERVAL_PRECISION}}:
- Change column {{NUMERIC_PRECISION}} to logically null for date/time and 
interval types.
- Add column {{DATETIME_PRECISION}}.
- Add column {{INTERVAL_PRECISION}}.

Implement {{NUMERIC_PRECISION_RADIX}}:
- Change column {{NUMERIC_PRECISION_RADIX}} from always logically null to 
appropriate values (2, 10, NULL).

Add missing numeric precision and scale values (for non-DECIMAL types):
- Change {{NUMERIC_SCALE}} from logical null to zero for integer types.
- Change {{NUMERIC_PRECISION}} from logical null to precision for non-DECIMAL 
numeric types.



> Fix existing INFORMATION_SCHEMA.COLUMNS columns
> ---
>
> Key: DRILL-3216
> URL: https://issues.apache.org/jira/browse/DRILL-3216
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> [Editing in progress]
> Change logical null from {{-1}} to actual {{NULL}}:
> - Change column {{CHARACTER_MAXIMUM_LENGTH}}.
> - Change column {{NUMERIC_PRECISION}}.
> - Change column {{NUMERIC_PRECISION_RADIX}}.
> - Change column {{NUMERIC_SCALE}}.
> Change column {{DATA_TYPE}} from short names (e.g., "CHAR") to specified 
> names (e.g., "CHARACTER").
> Change column {{ORDINAL_POSITION}} from zero-based to one-based.
> Move {{CHAR}} length from {{NUMERIC_PRECISION}} to 
> {{CHARACTER_MAXIMUM_LENGTH}} (same as {{VARCHAR}} length):

[jira] [Commented] (DRILL-2530) getColumns() doesn't return right COLUMN_SIZE for INTERVAL types

2015-06-01 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567635#comment-14567635
 ] 

Daniel Barclay (Drill) commented on DRILL-2530:
---

Note: a change a while ago improved MetaImpl.getColumns() a little to return 
upper-bound values for COLUMN_SIZE for INTERVAL_YEAR_MONTH and 
INTERVAL_DAY_TIME (using the maximum possible precisions in lieu of the 
actual precisions).

> getColumns() doesn't return right COLUMN_SIZE for INTERVAL types
> 
>
> Key: DRILL-2530
> URL: https://issues.apache.org/jira/browse/DRILL-2530
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3230) Local file system plug-in must be disabled in distributed mode

2015-06-01 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-3230:
--

 Summary: Local file system plug-in must be disabled in distributed 
mode
 Key: DRILL-3230
 URL: https://issues.apache.org/jira/browse/DRILL-3230
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - HTTP
Reporter: Abhishek Girish
Assignee: Jacques Nadeau


The local file system plug-in (the "file:///" connection string in the dfs 
storage plug-in) does not behave as expected for either CTAS or querying 
files when Drill runs in distributed mode (multiple drillbits across nodes). 

In the case of CTAS, Parquet files are written to a specific node's local 
file system, depending on which drillbit the client connects to. And if the 
table is moderate to large in size, Drill may process the write in a 
distributed manner and partition the data across more than one node.

Queries are similarly confusing, as the behavior depends on which drillbit 
the client connects to. The results are therefore inconsistent - queries may 
return only the partial results visible from the connected drillbit.
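
A minimal sketch of the failure mode, assuming a dfs-style storage plug-in 
named {{local}} with a "file:///" connection and a writable workspace 
{{tmp}} (both names hypothetical):

{code}
-- On a multi-drillbit cluster, each writer fragment lands its part of the
-- table on the local disk of whichever node it runs on.
CREATE TABLE local.tmp.`scattered` AS SELECT * FROM cp.`employee.json`;

-- A subsequent read sees only the fragment files on the node behind the
-- drillbit the client happens to connect to, so counts differ per session.
SELECT COUNT(*) FROM local.tmp.`scattered`;
{code}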

My suggestion is that the local file system plug-in be disabled in 
distributed mode. With multiple drillbits and a centralized plug-in 
configuration for the local file system, consistent behavior cannot be 
expected. 

It should either be disabled when distributed mode is detected, or we could 
add support for multiple namespaces (using node IPs) for local file systems 
(which might still not fix all issues). Or maybe there are other ways to 
resolve this that I am overlooking or not aware of. 

Many issues have been reported on the user mailing list where users have 
observed this inconsistent behavior.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3229) Create a new EmbeddedVector

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3229:
-

 Summary: Create a new EmbeddedVector
 Key: DRILL-3229
 URL: https://issues.apache.org/jira/browse/DRILL-3229
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Jacques Nadeau


Embedded Vector will leverage a binary encoding to hold type information for 
each individual field.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3228) Implement Embedded Type

2015-06-01 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3228:
-

 Summary: Implement Embedded Type
 Key: DRILL-3228
 URL: https://issues.apache.org/jira/browse/DRILL-3228
 Project: Apache Drill
  Issue Type: Task
  Components: Execution - Codegen, Execution - Data Types, Execution - 
Relational Operators, Functions - Drill
Reporter: Jacques Nadeau
Assignee: Jacques Nadeau


An umbrella task for the implementation of Embedded types within Drill.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3209) [Umbrella] Plan reads of Hive tables as native Drill reads when a native reader for the underlying table format exists

2015-06-01 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3209:
---
Description: All reads against Hive are currently done through the Hive 
Serde interface. While this provides the most flexibility, the API is not 
optimized for maximum performance while reading the data into Drill's native 
data structures. For Parquet and Text file backed tables, we can plan these 
reads as Drill native reads. Currently reads of these file types provide 
untyped data. While parquet has metadata in the file we currently do not make 
use of the type information while planning. For text files we read all of the 
files as lists of varchars. In both of these cases, casts will need to be 
injected to provide the same datatypes provided by the reads through the SerDe 
interface.
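
To make the cast-injection point concrete, a hedged sketch (the file path, 
column positions, and target types are hypothetical; Drill's native text 
reader exposes each record as a {{columns}} array of varchars):

{code}
-- Native read of a text-backed table: every field arrives as VARCHAR, so
-- casts must be injected to match the types the Hive SerDe would produce.
SELECT CAST(columns[0] AS INT)    AS id,
       CAST(columns[1] AS DOUBLE) AS amount,
       columns[2]                 AS name
FROM dfs.`/hive/warehouse/sales/data.csv`;
{code}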

> [Umbrella] Plan reads of Hive tables as native Drill reads when a native 
> reader for the underlying table format exists
> --
>
> Key: DRILL-3209
> URL: https://issues.apache.org/jira/browse/DRILL-3209
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization, Storage - Hive
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
>
> All reads against Hive are currently done through the Hive Serde interface. 
> While this provides the most flexibility, the API is not optimized for maximum 
> performance while reading the data into Drill's native data structures. For 
> Parquet and Text file backed tables, we can plan these reads as Drill native 
> reads. Currently reads of these file types provide untyped data. While 
> parquet has metadata in the file we currently do not make use of the type 
> information while planning. For text files we read all of the files as lists 
> of varchars. In both of these cases, casts will need to be injected to 
> provide the same datatypes provided by the reads through the SerDe interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3209) [Umbrella] Plan reads of Hive tables as native Drill reads when a native reader for the underlying table format exists

2015-06-01 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse updated DRILL-3209:
---
Description: All reads against Hive are currently done through the Hive 
Serde interface. While this provides the most flexibility, the API is not 
optimized for maximum performance while reading the data into Drill's native 
data structures. For Parquet and Text file backed tables, we can plan these 
reads as Drill native reads. Currently reads of these file types provide 
untyped data. While parquet has metadata in the file we currently do not make 
use of the type information while planning. For text files we read all of the 
files as lists of varchars. In both of these cases, casts will need to be 
injected to provide the same datatypes provided by the reads through the SerDe 
interface.  (was: All reads against Hive are currently done through the Hive 
Serde interface, while this provides the most flexibility the API is not 
optimized for maximum performance while reading the data into Drill's native 
data structures. For Parquet and Text file backed tables, we can plan these 
reads as Drill native reads. Currently reads of these file types provide 
untyped data. While parquet has metadata in the file we currently do not make 
use of the type information while planning. For text files we read all of the 
files as lists of varchars. In both of these cases, casts will need to be 
injected to provide the same datatypes provided by the reads through the SerDe 
interface.)

> [Umbrella] Plan reads of Hive tables as native Drill reads when a native 
> reader for the underlying table format exists
> --
>
> Key: DRILL-3209
> URL: https://issues.apache.org/jira/browse/DRILL-3209
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning & Optimization, Storage - Hive
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
>
> All reads against Hive are currently done through the Hive Serde interface. 
> While this provides the most flexibility, the API is not optimized for 
> maximum performance while reading the data into Drill's native data 
> structures. For Parquet and Text file backed tables, we can plan these reads 
> as Drill native reads. Currently reads of these file types provide untyped 
> data. While parquet has metadata in the file we currently do not make use of 
> the type information while planning. For text files we read all of the files 
> as lists of varchars. In both of these cases, casts will need to be injected 
> to provide the same datatypes provided by the reads through the SerDe 
> interface.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)