Drill on YARN

2016-03-22 Thread Paul Rogers
Hi All,

I’m a new member of the Drill Team here at MapR. We’d like to take a look at 
running Drill on YARN for production customers. JIRA suggests some early work 
may have been done (DRILL-142, DRILL-1170, DRILL-3675).

YARN is a complex beast and the Drill community is large and growing. So, a 
good place to start is to ask whether anyone has already done work on integrating 
Drill with YARN (see DRILL-142), or has thought about what might be needed.

DRILL-1170 (YARN support for Drill) seems a good place to gather requirements, 
designs and so on. I’ve posted a “starter set” of requirements to spur 
discussion.

Thanks,

- Paul



Fwd: drill git commit: DRILL-3623: For limit 0 queries, optionally use a shorter execution path when result column types are known

2016-03-22 Thread Jacques Nadeau
Awesome job on this Sudheesh.  Thanks for all the hard work. Thanks also to
Sean for all his work on the previous patch.
-- Forwarded message --
From: 
Date: Mar 22, 2016 4:33 PM
Subject: drill git commit: DRILL-3623: For limit 0 queries, optionally use
a shorter execution path when result column types are known
To: 
Cc:

Repository: drill
Updated Branches:
  refs/heads/master 600ba9ee1 -> 5dbaafbe6


DRILL-3623: For limit 0 queries, optionally use a shorter execution path
when result column types are known

+ "planner.enable_limit0_optimization" option is disabled by default

+ Print plan in PlanTestBase if TEST_QUERY_PRINTING_SILENT is set
+ Fix DrillTestWrapper to verify expected and actual schema
+ Correct the schema of results in TestInbuiltHiveUDFs#testXpath_Double

This closes #405


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/5dbaafbe
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/5dbaafbe
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/5dbaafbe

Branch: refs/heads/master
Commit: 5dbaafbe6651b0a284fef69d5c952d82ce506e20
Parents: 600ba9e
Author: Sudheesh Katkam 
Authored: Tue Mar 22 15:21:51 2016 -0700
Committer: Sudheesh Katkam 
Committed: Tue Mar 22 16:19:01 2016 -0700

--
 .../drill/exec/fn/hive/TestInbuiltHiveUDFs.java |   2 +-
 .../org/apache/drill/exec/ExecConstants.java|   3 +
 .../drill/exec/physical/base/ScanStats.java |   6 +-
 .../apache/drill/exec/planner/PlannerPhase.java |   2 +
 .../planner/logical/DrillDirectScanRel.java |  70 ++
 .../exec/planner/physical/DirectScanPrule.java  |  49 ++
 .../planner/sql/handlers/DefaultSqlHandler.java |  12 +
 .../planner/sql/handlers/FindLimit0Visitor.java | 124 +++-
 .../server/options/SystemOptionManager.java |   1 +
 .../exec/store/direct/DirectGroupScan.java  |  27 +-
 .../java/org/apache/drill/DrillTestWrapper.java |  25 +-
 .../java/org/apache/drill/PlanTestBase.java |   9 +-
 .../impl/limit/TestEarlyLimit0Optimization.java | 663 +++
 13 files changed, 963 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/drill/blob/5dbaafbe/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
--
diff --git
a/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
b/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
index a287c89..a126aaa 100644
---
a/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
+++
b/contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
@@ -58,7 +58,7 @@ public class TestInbuiltHiveUDFs extends HiveTestBase {

 final TypeProtos.MajorType majorType =
TypeProtos.MajorType.newBuilder()
 .setMinorType(TypeProtos.MinorType.FLOAT8)
-.setMode(TypeProtos.DataMode.REQUIRED)
+.setMode(TypeProtos.DataMode.OPTIONAL)
 .build();

 final List> expectedSchema =
Lists.newArrayList();

http://git-wip-us.apache.org/repos/asf/drill/blob/5dbaafbe/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
--
diff --git
a/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
b/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
index b8f25ad..963934d 100644
--- a/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
+++ b/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java
@@ -202,6 +202,9 @@ public interface ExecConstants {
   String AFFINITY_FACTOR_KEY = "planner.affinity_factor";
   OptionValidator AFFINITY_FACTOR = new
DoubleValidator(AFFINITY_FACTOR_KEY, 1.2d);

+  String EARLY_LIMIT0_OPT_KEY = "planner.enable_limit0_optimization";
+  BooleanValidator EARLY_LIMIT0_OPT = new
BooleanValidator(EARLY_LIMIT0_OPT_KEY, false);
+
   String ENABLE_MEMORY_ESTIMATION_KEY =
"planner.memory.enable_memory_estimation";
   OptionValidator ENABLE_MEMORY_ESTIMATION = new
BooleanValidator(ENABLE_MEMORY_ESTIMATION_KEY, false);


http://git-wip-us.apache.org/repos/asf/drill/blob/5dbaafbe/exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/ScanStats.java
--
diff --git
a/exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/ScanStats.java
b/exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/ScanStats.java
index ba36931..1886c14 100644
---

[jira] [Resolved] (DRILL-3623) Limit 0 should avoid execution when querying a known schema

2016-03-22 Thread Sudheesh Katkam (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sudheesh Katkam resolved DRILL-3623.

Resolution: Fixed

Fixed in 
[5dbaafb|https://github.com/apache/drill/commit/5dbaafbe6651b0a284fef69d5c952d82ce506e20].

> Limit 0 should avoid execution when querying a known schema
> ---
>
> Key: DRILL-3623
> URL: https://issues.apache.org/jira/browse/DRILL-3623
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Storage - Hive
>Affects Versions: 1.1.0
> Environment: MapR cluster
>Reporter: Andries Engelbrecht
>Assignee: Sudheesh Katkam
>  Labels: doc-impacting
> Fix For: Future
>
>
> Running a select * from hive.table limit 0 does not return (hangs).
> Select * from hive.table limit 1 works fine
> Hive table is about 6GB with 330 files with parquet using snappy compression.
> Data types are int, bigint, string and double.
> Querying directory with parquet files through the DFS plugin works fine
> select * from dfs.root.`/user/hive/warehouse/database/table` limit 0;



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/405


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (DRILL-4529) SUM() with window function results in mismatched nullability

2016-03-22 Thread Krystal (JIRA)
Krystal created DRILL-4529:
--

 Summary: SUM() with window function results in mismatched nullability
 Key: DRILL-4529
 URL: https://issues.apache.org/jira/browse/DRILL-4529
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Krystal
Assignee: Sean Hsuan-Yi Chu


git.commit.id.abbrev=cee5317

select 
sum(1)  over w sum1, 
sum(5)  over w sum5,
sum(10) over w sum10
from 
j1_v
where 
c_date is not null
window w as (partition by c_date);

Output from test:
limit 0: [columnNoNulls, columnNoNulls, columnNoNulls]
regular: [columnNullable, columnNullable, columnNullable]






Re: Next Release

2016-03-22 Thread Parth Chandra
Wonderful!
From my experience with 1.6, I was going to suggest we start the process of
'finalizing' the items that are a must-have for the release a week before
the end of the month. Otherwise the release takes an extra week.

Parth

On Tue, Mar 22, 2016 at 8:55 AM, Jacques Nadeau  wrote:

> Hey All,
>
> I'd like to volunteer to be the 1.7 release manager. I'd also like to plan
> putting together a target feature list for the release now so we can all
> plan ahead. I'll share an initial stab at this later today if people think
> that sounds good.
>
> Thanks
> Jacques
>


Re: [DISCUSS] Remove required type

2016-03-22 Thread Parth Chandra
I'm not entirely convinced that this would have no performance impact. Do
we have any experiments?


On Tue, Mar 22, 2016 at 1:36 PM, Jacques Nadeau  wrote:

> My suggestion is we use explicit observation at the batch level. If there
> are no nulls we can optimize this batch. This would ultimately improve over
> our current situation where most parquet and all json data is nullable so
> we don't optimize. I'd estimate that the vast majority of Drill's workloads
> are marked nullable whether they are or not. So what we're really
> suggesting is deleting a bunch of code which is rarely in the execution
> path.
> On Mar 22, 2016 1:22 PM, "Aman Sinha"  wrote:
>
> > I was thinking about it more after sending the previous concerns.  Agree,
> > this is an execution side change...but some details need to be worked
> out.
> > If the planner indicates to the executor that a column is non-nullable
> (e.g
> > a primary key),  the run-time generated code is more efficient since it
> > does not have to check the null bit.  Are you thinking we would use the
> > existing nullable vector and add some additional metadata (at a record
> > batch level rather than record level) to indicate non-nullability ?
> >
> >
> > On Tue, Mar 22, 2016 at 12:27 PM, Jacques Nadeau 
> > wrote:
> >
> > > Hey Aman, I believe both Steven and I were only suggesting removal
> > > from execution, not planning. It seems like your concerns are all
> > > related to planning. It seems like the real tradeoffs in execution
> > > are nominal.
> > > On Mar 22, 2016 9:03 AM, "Aman Sinha"  wrote:
> > >
> > > > While it is true that there is code complexity due to the required
> > type,
> > > > what would we be trading off ?  some important considerations:
> > > >   - We don't currently have null count statistics which would need to
> > be
> > > > implemented for various data sources
> > > >   - Primary keys in the RDBMS sources (or rowkeys in hbase) are
> always
> > > > non-null, and although today we may not be doing optimizations to
> > > leverage
> > > > that,  one could easily add a rule that converts  WHERE primary_key
> IS
> > > NULL
> > > > to a FALSE filter.
> > > >
> > > >
> > > > On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky <
> > doshin...@commvault.com>
> > > > wrote:
> > > >
> > > > > Hi Jacques,
> > > > > Marginally related to this, I made a small change in PR-372
> > > (DRILL-4184)
> > > > > to support variable widths for decimal quantities in Parquet.  I
> > found
> > > > the
> > > > > (decimal) vectoring code to be very difficult to understand
> (probably
> > > > > because it's overly complex, but also because I'm new to Drill code
> > in
> > > > > general), so I made a small, surgical change in my pull request to
> > > > support
> > > > > keeping track of variable widths (lengths) and null booleans within
> > the
> > > > > existing fixed width decimal vectoring scheme.  Can my changes be
> > > > > reviewed/accepted, and then we discuss how to fix properly
> long-term?
> > > > >
> > > > > Thanks,
> > > > > Dave Oshinsky
> > > > >
> > > > > -Original Message-
> > > > > From: Jacques Nadeau [mailto:jacq...@dremio.com]
> > > > > Sent: Monday, March 21, 2016 11:43 PM
> > > > > To: dev
> > > > > Subject: Re: [DISCUSS] Remove required type
> > > > >
> > > > > Definitely in support of this. The required type is a huge
> > maintenance
> > > > and
> > > > > code complexity nightmare that provides little to no benefit. As
> you
> > > > point
> > > > > out, we can do better performance optimizations through null count
> > > > > observation since most sources are nullable anyway.
> > > > > On Mar 21, 2016 7:41 PM, "Steven Phillips" 
> > wrote:
> > > > >
> > > > > > I have been thinking about this for a while now, and I feel it
> > would
> > > > > > be a good idea to remove the Required vector types from Drill,
> and
> > > > > > only use the Nullable version of vectors. I think this will
> greatly
> > > > > simplify the code.
> > > > > > It will also simplify the creation of UDFs. As is, if a function
> > has
> > > > > > custom null handling (i.e. INTERNAL), the function has to be
> > > > > > separately implemented for each permutation of nullability of the
> > > > > > inputs. But if drill data types are always nullable, this
> wouldn't
> > > be a
> > > > > problem.
> > > > > >
> > > > > > I don't think there would be much impact on performance. In
> > practice,
> > > > > > I think the required type is used very rarely. And there are
> other
> > > > > > ways we can optimize for when a column is known to have no nulls.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > ***Legal
> > Disclaimer***
> > > > > "This communication may contain confidential and privileged
> material
> > > for
> > > > > the
> > > > > sole use of the intended recipient. Any unauthorized 

Re: [DISCUSS] Remove required type

2016-03-22 Thread Hanifi GUNES
My major concern here, too, would be possible performance implications. That
being said, I can see ways to speed up execution relying on vector
density (for instance, count); I'm not sure how batch density would work. Perhaps
an example would throw some more light.

Why don't we think about some "good bad cases" to evaluate the performance
impact? I wonder to what degree performance would degrade (if any) from
required to optional.

Also, a big chunk of code to handle required is already in. Any particular
reason to remove it?


-Hanifi


2016-03-22 13:36 GMT-07:00 Jacques Nadeau :

> My suggestion is we use explicit observation at the batch level. If there
> are no nulls we can optimize this batch. This would ultimately improve over
> our current situation where most parquet and all json data is nullable so
> we don't optimize. I'd estimate that the vast majority of Drill's workloads
> are marked nullable whether they are or not. So what we're really
> suggesting is deleting a bunch of code which is rarely in the execution
> path.
> On Mar 22, 2016 1:22 PM, "Aman Sinha"  wrote:
>
> > I was thinking about it more after sending the previous concerns.  Agree,
> > this is an execution side change...but some details need to be worked
> out.
> > If the planner indicates to the executor that a column is non-nullable
> (e.g
> > a primary key),  the run-time generated code is more efficient since it
> > does not have to check the null bit.  Are you thinking we would use the
> > existing nullable vector and add some additional metadata (at a record
> > batch level rather than record level) to indicate non-nullability ?
> >
> >
> > On Tue, Mar 22, 2016 at 12:27 PM, Jacques Nadeau 
> > wrote:
> >
> > > Hey Aman, I believe both Steven and I were only suggesting removal
> > > from execution, not planning. It seems like your concerns are all
> > > related to planning. It seems like the real tradeoffs in execution
> > > are nominal.
> > > On Mar 22, 2016 9:03 AM, "Aman Sinha"  wrote:
> > >
> > > > While it is true that there is code complexity due to the required
> > type,
> > > > what would we be trading off ?  some important considerations:
> > > >   - We don't currently have null count statistics which would need to
> > be
> > > > implemented for various data sources
> > > >   - Primary keys in the RDBMS sources (or rowkeys in hbase) are
> always
> > > > non-null, and although today we may not be doing optimizations to
> > > leverage
> > > > that,  one could easily add a rule that converts  WHERE primary_key
> IS
> > > NULL
> > > > to a FALSE filter.
> > > >
> > > >
> > > > On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky <
> > doshin...@commvault.com>
> > > > wrote:
> > > >
> > > > > Hi Jacques,
> > > > > Marginally related to this, I made a small change in PR-372
> > > (DRILL-4184)
> > > > > to support variable widths for decimal quantities in Parquet.  I
> > found
> > > > the
> > > > > (decimal) vectoring code to be very difficult to understand
> (probably
> > > > > because it's overly complex, but also because I'm new to Drill code
> > in
> > > > > general), so I made a small, surgical change in my pull request to
> > > > support
> > > > > keeping track of variable widths (lengths) and null booleans within
> > the
> > > > > existing fixed width decimal vectoring scheme.  Can my changes be
> > > > > reviewed/accepted, and then we discuss how to fix properly
> long-term?
> > > > >
> > > > > Thanks,
> > > > > Dave Oshinsky
> > > > >
> > > > > -Original Message-
> > > > > From: Jacques Nadeau [mailto:jacq...@dremio.com]
> > > > > Sent: Monday, March 21, 2016 11:43 PM
> > > > > To: dev
> > > > > Subject: Re: [DISCUSS] Remove required type
> > > > >
> > > > > Definitely in support of this. The required type is a huge
> > maintenance
> > > > and
> > > > > code complexity nightmare that provides little to no benefit. As
> you
> > > > point
> > > > > out, we can do better performance optimizations through null count
> > > > > observation since most sources are nullable anyway.
> > > > > On Mar 21, 2016 7:41 PM, "Steven Phillips" 
> > wrote:
> > > > >
> > > > > > I have been thinking about this for a while now, and I feel it
> > would
> > > > > > be a good idea to remove the Required vector types from Drill,
> and
> > > > > > only use the Nullable version of vectors. I think this will
> greatly
> > > > > simplify the code.
> > > > > > It will also simplify the creation of UDFs. As is, if a function
> > has
> > > > > > custom null handling (i.e. INTERNAL), the function has to be
> > > > > > separately implemented for each permutation of nullability of the
> > > > > > inputs. But if drill data types are always nullable, this
> wouldn't
> > > be a
> > > > > problem.
> > > > > >
> > > > > > I don't think there would be much impact on performance. In
> > practice,
> > > > > > I think the required type is used very rarely. And there 

[GitHub] drill pull request: remove DrillAvgVarianceConvertlet

2016-03-22 Thread minji-kim
GitHub user minji-kim opened a pull request:

https://github.com/apache/drill/pull/441

remove DrillAvgVarianceConvertlet

Removed unnecessary class, and ran regression. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/minji-kim/drill DRILL-4527

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/441.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #441








[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-200026538
  
Thank you for the reviews.

All regression tests passed; I am running unit tests right now.

Note that the `planner.enable_limit0_optimization` option is disabled by 
default. To summarize (and document) the limitations:

If, during validation, the planner is able to resolve the types of the 
columns (i.e. the types are not late binding), the shorter execution path is 
taken. Some types are excluded:
+ DECIMAL type is not fully supported in general.
+ VARBINARY is not fully tested.
+ MAP, ARRAY are currently not exposed to the planner.
+ TINYINT, SMALLINT are defined in the Drill type system but have been 
turned off for now.
+ SYMBOL, MULTISET, DISTINCT, STRUCTURED, ROW, OTHER, CURSOR, COLUMN_LIST 
are Calcite types currently not supported by Drill, nor defined in the Drill 
type list.

Three scenarios when the planner can do type resolution during validation:
+ Queries on Hive tables
+ Queries with explicit casts on table columns, example: `SELECT CAST(col1 
AS BIGINT), ABS(CAST(col2 AS INTEGER)) FROM table;`
+ Queries on views with casts on table columns

In the latter two cases, the schema of the query with the LIMIT 0 clause has 
relaxed nullability compared to the query without it. Example:
Say the schema definition of the Parquet file (`numbers.parquet`) is:
```
message Numbers {
  required int32 col1;
  optional int32 col2;
}
```

Since the view definition does not specify the nullability of columns, and 
the schema of a Parquet file is not yet leveraged by Drill's planner:
```
CREATE VIEW dfs.tmp.mynumbers AS SELECT CAST(col1 AS INTEGER) as col1, 
CAST(col2 AS INTEGER) AS col2 FROM dfs.tmp.`numbers.parquet`;
```
(1) For the query with the LIMIT 0 clause, since the file metadata is not read, 
Drill assumes the nullability of both columns is 
[`columnNullable`](https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSetMetaData.html#columnNullable).
```
SELECT col1, col2 FROM dfs.tmp.mynumbers LIMIT 0;
```

(2) For the query without the LIMIT 0 clause, since the file is read, Drill knows 
the nullability of `col1` is 
[`columnNoNulls`](https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSetMetaData.html#columnNoNulls),
 and `col2` is 
[`columnNullable`](https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSetMetaData.html#columnNullable).
```
SELECT col1, col2 FROM dfs.tmp.mynumbers LIMIT 1;
```
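As a usage note (the option name comes from the commit above; `ALTER SESSION SET` is Drill's standard session-option syntax, and the view name is the hypothetical one from the example):

```sql
-- Enable the LIMIT 0 fast path for this session (off by default).
ALTER SESSION SET `planner.enable_limit0_optimization` = true;

-- With the option on and resolvable column types (e.g. a view with casts),
-- this returns just the schema without executing the full scan:
SELECT col1, col2 FROM dfs.tmp.mynumbers LIMIT 0;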




[GitHub] drill pull request: DRILL-4514 : Add describe schema ...

2016-03-22 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/436#discussion_r57066817
  
--- Diff: exec/java-exec/src/main/codegen/includes/parserImpls.ftl ---
@@ -278,3 +278,19 @@ SqlNode SqlRefreshMetadata() :
 }
 }
 
+/**
--- End diff --

I would say, it's rather specific syntax.




Re: [DISCUSS] Remove required type

2016-03-22 Thread Jacques Nadeau
My suggestion is we use explicit observation at the batch level. If there
are no nulls we can optimize this batch. This would ultimately improve over
our current situation where most parquet and all json data is nullable so
we don't optimize. I'd estimate that the vast majority of Drill's workloads
are marked nullable whether they are or not. So what we're really
suggesting is deleting a bunch of code which is rarely in the execution
path.
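To make the idea concrete, here is a rough sketch of batch-level null observation. This is a hypothetical illustration, not Drill's actual vector API: the class, `sumBatch`, and its arguments are invented for this example.

```java
// Hypothetical sketch of batch-level null observation: even though the
// column's declared type is nullable, a batch that reports zero nulls
// can take a fast path with no per-value null checks.
public class NullableBatchSketch {
    static long sumBatch(long[] values, boolean[] isNull, int nullCount) {
        long sum = 0;
        if (nullCount == 0) {
            // Fast path: no null bitmap lookups for this batch.
            for (long v : values) {
                sum += v;
            }
        } else {
            // Slow path: consult the null bitmap for every value.
            for (int i = 0; i < values.length; i++) {
                if (!isNull[i]) {
                    sum += values[i];
                }
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] values = {1, 2, 3, 4};
        System.out.println(sumBatch(values, new boolean[4], 0));             // prints 10
        System.out.println(sumBatch(values,
                new boolean[]{false, true, false, false}, 1));               // prints 8
    }
}
```

The per-batch null count is exactly the "explicit observation" above: the declared type stays nullable, but code generation can branch once per batch instead of once per value.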
On Mar 22, 2016 1:22 PM, "Aman Sinha"  wrote:

> I was thinking about it more after sending the previous concerns.  Agree,
> this is an execution side change...but some details need to be worked out.
> If the planner indicates to the executor that a column is non-nullable (e.g
> a primary key),  the run-time generated code is more efficient since it
> does not have to check the null bit.  Are you thinking we would use the
> existing nullable vector and add some additional metadata (at a record
> batch level rather than record level) to indicate non-nullability ?
>
>
> On Tue, Mar 22, 2016 at 12:27 PM, Jacques Nadeau 
> wrote:
>
> > Hey Aman, I believe both Steven and I were only suggesting removal
> > from execution, not planning. It seems like your concerns are all related
> > to planning. It seems like the real tradeoffs in execution are nominal.
> > On Mar 22, 2016 9:03 AM, "Aman Sinha"  wrote:
> >
> > > While it is true that there is code complexity due to the required
> type,
> > > what would we be trading off ?  some important considerations:
> > >   - We don't currently have null count statistics which would need to
> be
> > > implemented for various data sources
> > >   - Primary keys in the RDBMS sources (or rowkeys in hbase) are always
> > > non-null, and although today we may not be doing optimizations to
> > leverage
> > > that,  one could easily add a rule that converts  WHERE primary_key IS
> > NULL
> > > to a FALSE filter.
> > >
> > >
> > > On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky <
> doshin...@commvault.com>
> > > wrote:
> > >
> > > > Hi Jacques,
> > > > Marginally related to this, I made a small change in PR-372
> > (DRILL-4184)
> > > > to support variable widths for decimal quantities in Parquet.  I
> found
> > > the
> > > > (decimal) vectoring code to be very difficult to understand (probably
> > > > because it's overly complex, but also because I'm new to Drill code
> in
> > > > general), so I made a small, surgical change in my pull request to
> > > support
> > > > keeping track of variable widths (lengths) and null booleans within
> the
> > > > existing fixed width decimal vectoring scheme.  Can my changes be
> > > > reviewed/accepted, and then we discuss how to fix properly long-term?
> > > >
> > > > Thanks,
> > > > Dave Oshinsky
> > > >
> > > > -Original Message-
> > > > From: Jacques Nadeau [mailto:jacq...@dremio.com]
> > > > Sent: Monday, March 21, 2016 11:43 PM
> > > > To: dev
> > > > Subject: Re: [DISCUSS] Remove required type
> > > >
> > > > Definitely in support of this. The required type is a huge
> maintenance
> > > and
> > > > code complexity nightmare that provides little to no benefit. As you
> > > point
> > > > out, we can do better performance optimizations through null count
> > > > observation since most sources are nullable anyway.
> > > > On Mar 21, 2016 7:41 PM, "Steven Phillips" 
> wrote:
> > > >
> > > > > I have been thinking about this for a while now, and I feel it
> would
> > > > > be a good idea to remove the Required vector types from Drill, and
> > > > > only use the Nullable version of vectors. I think this will greatly
> > > > simplify the code.
> > > > > It will also simplify the creation of UDFs. As is, if a function
> has
> > > > > custom null handling (i.e. INTERNAL), the function has to be
> > > > > separately implemented for each permutation of nullability of the
> > > > > inputs. But if drill data types are always nullable, this wouldn't
> > be a
> > > > problem.
> > > > >
> > > > > I don't think there would be much impact on performance. In
> practice,
> > > > > I think the required type is used very rarely. And there are other
> > > > > ways we can optimize for when a column is known to have no nulls.
> > > > >
> > > > > Thoughts?
> > > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: [DISCUSS] Remove required type

2016-03-22 Thread Aman Sinha
I was thinking about it more after sending the previous concerns.  Agree,
this is an execution side change...but some details need to be worked out.
If the planner indicates to the executor that a column is non-nullable (e.g
a primary key),  the run-time generated code is more efficient since it
does not have to check the null bit.  Are you thinking we would use the
existing nullable vector and add some additional metadata (at a record
batch level rather than record level) to indicate non-nullability ?


On Tue, Mar 22, 2016 at 12:27 PM, Jacques Nadeau  wrote:

> Hey Aman, I believe both Steven and I were only suggesting removal
> from execution, not planning. It seems like your concerns are all related
> to planning. It seems like the real tradeoffs in execution are nominal.
> On Mar 22, 2016 9:03 AM, "Aman Sinha"  wrote:
>
> > While it is true that there is code complexity due to the required type,
> > what would we be trading off ?  some important considerations:
> >   - We don't currently have null count statistics which would need to be
> > implemented for various data sources
> >   - Primary keys in the RDBMS sources (or rowkeys in hbase) are always
> > non-null, and although today we may not be doing optimizations to
> leverage
> > that,  one could easily add a rule that converts  WHERE primary_key IS
> NULL
> > to a FALSE filter.
> >
> >
> > On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky 
> > wrote:
> >
> > > Hi Jacques,
> > > Marginally related to this, I made a small change in PR-372
> (DRILL-4184)
> > > to support variable widths for decimal quantities in Parquet.  I found
> > the
> > > (decimal) vectoring code to be very difficult to understand (probably
> > > because it's overly complex, but also because I'm new to Drill code in
> > > general), so I made a small, surgical change in my pull request to
> > support
> > > keeping track of variable widths (lengths) and null booleans within the
> > > existing fixed width decimal vectoring scheme.  Can my changes be
> > > reviewed/accepted, and then we discuss how to fix properly long-term?
> > >
> > > Thanks,
> > > Dave Oshinsky
> > >
> > > -Original Message-
> > > From: Jacques Nadeau [mailto:jacq...@dremio.com]
> > > Sent: Monday, March 21, 2016 11:43 PM
> > > To: dev
> > > Subject: Re: [DISCUSS] Remove required type
> > >
> > > Definitely in support of this. The required type is a huge maintenance
> > and
> > > code complexity nightmare that provides little to no benefit. As you
> > point
> > > out, we can do better performance optimizations through null count
> > > observation since most sources are nullable anyway.
> > > On Mar 21, 2016 7:41 PM, "Steven Phillips"  wrote:
> > >
> > > > I have been thinking about this for a while now, and I feel it would
> > > > be a good idea to remove the Required vector types from Drill, and
> > > > only use the Nullable version of vectors. I think this will greatly
> > > simplify the code.
> > > > It will also simplify the creation of UDFs. As is, if a function has
> > > > custom null handling (i.e. INTERNAL), the function has to be
> > > > separately implemented for each permutation of nullability of the
> > > > inputs. But if drill data types are always nullable, this wouldn't
> be a
> > > problem.
> > > >
> > > > I don't think there would be much impact on performance. In practice,
> > > > I think the required type is used very rarely. And there are other
> > > > ways we can optimize for when a column is known to have no nulls.
> > > >
> > > > Thoughts?
> > > >
> > >
> > >
> > >
> >
>


[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread jinfengni
Github user jinfengni commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-20859
  
LGTM.

+1 






Re: [GitHub] drill pull request: Elasticsearch storage plugin

2016-03-22 Thread Stefán Baxter
Hi hamdanuk,

Can you please tell me what the status is of the Solr integration?

We have been looking into the Lucene plugin and are very interested in trying
the Solr plugin.

Thank you,
 -Stefán

On Tue, Mar 22, 2016 at 7:47 PM, hamdanuk  wrote:

> GitHub user hamdanuk opened a pull request:
>
> https://github.com/apache/drill/pull/440
>
> Elasticsearch storage plugin
>
> Would you please add the Elasticsearch storage plugin?
>
> You can merge this pull request into a Git repository by running:
>
> $ git pull https://github.com/apache/drill master
>
> Alternatively you can review and apply these changes as the patch at:
>
> https://github.com/apache/drill/pull/440.patch
>
> To close this pull request, make a commit to your master/trunk branch
> with (at least) the following in the commit message:
>
> This closes #440
>
> 
> commit 1f23b89623c72808f2ee866cec9b4b8a48929d68
> Author: Parth Chandra 
> Date:   2016-03-11T01:02:16Z
>
> Update version to 1.7.0-SNAPSHOT
>
> commit b979bebe83d7017880b0763adcbf8eb80acfcee8
> Author: Hsuan-Yi Chu 
> Date:   2016-03-04T21:50:02Z
>
> DRILL-4476: Allow UnionAllRecordBatch to manage situations where the left
> input side or both sides come from empty sources.
>
> close apache/drill#407
>
> commit 3cf0514e50a46f0e491e9cd5860ed42890c18fa1
> Author: Parth Chandra 
> Date:   2016-03-13T16:50:54Z
>
> Added Parth's GPG Key
>
> commit 46e3de790da8f9c6d2d18e7e40fd37c01b3b1681
> Author: Hsuan-Yi Chu 
> Date:   2016-03-10T01:25:11Z
>
> DRILL-4490: Ensure the count generated by ConvertCountToDirectScan is
> non-nullable
>
> commit f7197596d61bf2f3652df8318113636ef1eb5c18
> Author: Aman Sinha 
> Date:   2016-03-08T17:27:32Z
>
> DRILL-4479: For empty fields under all_text_mode enabled (a) use
> varchar for the default columns and (b) ensure we create fields
> corresponding to all columns.
>
> close apache/drill#420
>
> commit dd4f03be93c7c804954b2f027f6a9071d5291b38
> Author: Arina Ielchiieva 
> Date:   2016-02-19T17:03:52Z
>
> DRILL-3745: Hive CHAR not supported
>
> commit 050ff9679d99b5cdacc86f5501802c3d2a6dd3e3
> Author: Aditya Kishore 
> Date:   2016-03-14T22:15:38Z
>
> DRILL-4050: Add zip archives to the list of artifacts in
> verify_release.sh
>
> This enhanced version of the script allows integrated download and
> verification of a Drill release. It can be used to verify both the main
> release artifacts and maven repository artifacts.
>
> For example, to verify the 1.6 rc0 release artifacts, I ran
>
> ./verify_release.sh
> https://repository.apache.org/content/repositories/orgapachedrill-1030/
> /tmp/drill-1.6/maven/
> ./verify_release.sh
> http://home.apache.org/~parthc/drill/releases/1.6.0/rc0/
> /tmp/drill-1.6/main/
>
> If I had pre-downloaded the files in the respective folders, I'd run
>
> ./verify_release.sh /tmp/drill-1.6/maven/
> ./verify_release.sh /tmp/drill-1.6/main/
>
> Finally, run with `-nv` option to reduce the verbosity of the output.
>
> Closes #249.
>
> commit 11fe8d7cdb1df4100cd48bcce1de0b2c3c5f983a
> Author: adeneche 
> Date:   2016-03-09T12:44:02Z
>
> DRILL-4376: Wrong results when doing a count(*) on part of directories
> with metadata cache
>
> commit 71608ca9fb53ff0af4f1d09f32d61e7280377e7a
> Author: adeneche 
> Date:   2016-03-10T09:40:06Z
>
> DRILL-4484: NPE when querying an empty directory
>
> commit 245da9790813569c5da9404e0fc5e45cc88e22bb
> Author: Aditya Kishore 
> Date:   2016-03-12T19:12:34Z
>
> DRILL-4501: Complete MapOrListWriter for all supported data types
>
> Closes #427
>
> commit 9ecf4a484e2cc03f73aacd1b4f3801bb1909b71f
> Author: Hsuan-Yi Chu 
> Date:   2016-03-04T04:14:59Z
>
> DRILL-4372: (continued) Type inference for HiveUDFs
>
> commit c0293354ec79b42ff27ce4ad2113a2ff52a934bd
> Author: Hsuan-Yi Chu 
> Date:   2016-03-04T06:38:04Z
>
> DRILL-4372: Expose the functions return type to Drill
>
> - Drill-Calcite version update:
> This commit needs Calcite's patch (CALCITE-1062) to plug in a
> customized SqlOperator.
>
> - FunctionTemplate
> Add FunctionArgumentNumber annotation. This annotation element tells
> if the number of argument(s) is fixed or arbitrary (e.g., String
> concatenation function).
>
> Due to this modification, there are some minor changes in
> DrillFuncHolder, DrillFunctionRegistry and FunctionAttributes.
>
> - Checker
> Add a new Checker (which Calcite uses to validate the legitimacy of
> the number of argument(s) for a function) to allow functions with arbitrary
> arguments to pass Calcite's validation.
>
> - Type conversion between Drill and Calcite
> DrillConstExector is given a static method 

[GitHub] drill pull request: Elasticsearch storage plugin

2016-03-22 Thread hamdanuk
GitHub user hamdanuk opened a pull request:

https://github.com/apache/drill/pull/440

Elasticsearch storage plugin

Would you please add the Elasticsearch storage plugin?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/drill master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/440.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #440


commit 1f23b89623c72808f2ee866cec9b4b8a48929d68
Author: Parth Chandra 
Date:   2016-03-11T01:02:16Z

Update version to 1.7.0-SNAPSHOT

commit b979bebe83d7017880b0763adcbf8eb80acfcee8
Author: Hsuan-Yi Chu 
Date:   2016-03-04T21:50:02Z

DRILL-4476: Allow UnionAllRecordBatch to manage situations where the left 
input side or both sides come from empty sources.

close apache/drill#407

commit 3cf0514e50a46f0e491e9cd5860ed42890c18fa1
Author: Parth Chandra 
Date:   2016-03-13T16:50:54Z

Added Parth's GPG Key

commit 46e3de790da8f9c6d2d18e7e40fd37c01b3b1681
Author: Hsuan-Yi Chu 
Date:   2016-03-10T01:25:11Z

DRILL-4490: Ensure the count generated by ConvertCountToDirectScan is 
non-nullable

commit f7197596d61bf2f3652df8318113636ef1eb5c18
Author: Aman Sinha 
Date:   2016-03-08T17:27:32Z

DRILL-4479: For empty fields under all_text_mode enabled (a) use varchar 
for the default columns and (b) ensure we create fields corresponding to all 
columns.

close apache/drill#420

commit dd4f03be93c7c804954b2f027f6a9071d5291b38
Author: Arina Ielchiieva 
Date:   2016-02-19T17:03:52Z

DRILL-3745: Hive CHAR not supported

commit 050ff9679d99b5cdacc86f5501802c3d2a6dd3e3
Author: Aditya Kishore 
Date:   2016-03-14T22:15:38Z

DRILL-4050: Add zip archives to the list of artifacts in verify_release.sh

This enhanced version of the script allows integrated download and 
verification of a Drill release. It can be used to verify both the main release 
artifacts and maven repository artifacts.

For example, to verify the 1.6 rc0 release artifacts, I ran

./verify_release.sh 
https://repository.apache.org/content/repositories/orgapachedrill-1030/ 
/tmp/drill-1.6/maven/
./verify_release.sh 
http://home.apache.org/~parthc/drill/releases/1.6.0/rc0/ /tmp/drill-1.6/main/

If I had pre-downloaded the files in the respective folders, I'd run

./verify_release.sh /tmp/drill-1.6/maven/
./verify_release.sh /tmp/drill-1.6/main/

Finally, run with `-nv` option to reduce the verbosity of the output.

Closes #249.

commit 11fe8d7cdb1df4100cd48bcce1de0b2c3c5f983a
Author: adeneche 
Date:   2016-03-09T12:44:02Z

DRILL-4376: Wrong results when doing a count(*) on part of directories with 
metadata cache

commit 71608ca9fb53ff0af4f1d09f32d61e7280377e7a
Author: adeneche 
Date:   2016-03-10T09:40:06Z

DRILL-4484: NPE when querying an empty directory

commit 245da9790813569c5da9404e0fc5e45cc88e22bb
Author: Aditya Kishore 
Date:   2016-03-12T19:12:34Z

DRILL-4501: Complete MapOrListWriter for all supported data types

Closes #427

commit 9ecf4a484e2cc03f73aacd1b4f3801bb1909b71f
Author: Hsuan-Yi Chu 
Date:   2016-03-04T04:14:59Z

DRILL-4372: (continued) Type inference for HiveUDFs

commit c0293354ec79b42ff27ce4ad2113a2ff52a934bd
Author: Hsuan-Yi Chu 
Date:   2016-03-04T06:38:04Z

DRILL-4372: Expose the functions return type to Drill

- Drill-Calcite version update:
This commit needs Calcite's patch (CALCITE-1062) to plug in a
customized SqlOperator.

- FunctionTemplate
Add FunctionArgumentNumber annotation. This annotation element tells if the 
number of argument(s) is fixed or arbitrary (e.g., String concatenation 
function).

Due to this modification, there are some minor changes in DrillFuncHolder, 
DrillFunctionRegistry and FunctionAttributes.

- Checker
Add a new Checker (which Calcite uses to validate the legitimacy of the 
number of argument(s) for a function) to allow functions with arbitrary 
arguments to pass Calcite's validation.

- Type conversion between Drill and Calcite
DrillConstExector is given a static method getDrillTypeFromCalcite() to 
convert Calcite types to Drill's.

- Extract function's return type inference
Unlike other functions, the Extract function's return type can be determined 
solely from the first argument. Logic is added to allow this inference to 
happen.

- DrillCalcite wrapper:
From the aspects of return type inference and argument type checks, 
Calcite's mechanism is very different from Drill's. In addition, 

Re: [DISCUSS] Remove required type

2016-03-22 Thread Jacques Nadeau
Hey Aman, I believe both Steven and I were suggesting removal only from
execution, not planning. It seems like your concerns are all related to
planning, and the real tradeoffs in execution seem nominal.
On Mar 22, 2016 9:03 AM, "Aman Sinha"  wrote:

> While it is true that there is code complexity due to the required type,
> what would we be trading off? Some important considerations:
>   - We don't currently have null count statistics which would need to be
> implemented for various data sources
>   - Primary keys in the RDBMS sources (or rowkeys in hbase) are always
> non-null, and although today we may not be doing optimizations to leverage
> that,  one could easily add a rule that converts  WHERE primary_key IS NULL
> to a FALSE filter.
>
>
> On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky 
> wrote:
>
> > Hi Jacques,
> > Marginally related to this, I made a small change in PR-372 (DRILL-4184)
> > to support variable widths for decimal quantities in Parquet.  I found
> the
> > (decimal) vectoring code to be very difficult to understand (probably
> > because it's overly complex, but also because I'm new to Drill code in
> > general), so I made a small, surgical change in my pull request to
> support
> > keeping track of variable widths (lengths) and null booleans within the
> > existing fixed width decimal vectoring scheme.  Can my changes be
> > reviewed/accepted, and then we discuss how to fix properly long-term?
> >
> > Thanks,
> > Dave Oshinsky
> >
> > -Original Message-
> > From: Jacques Nadeau [mailto:jacq...@dremio.com]
> > Sent: Monday, March 21, 2016 11:43 PM
> > To: dev
> > Subject: Re: [DISCUSS] Remove required type
> >
> > Definitely in support of this. The required type is a huge maintenance
> and
> > code complexity nightmare that provides little to no benefit. As you
> point
> > out, we can do better performance optimizations through null count
> > observation since most sources are nullable anyway.
> > On Mar 21, 2016 7:41 PM, "Steven Phillips"  wrote:
> >
> > > I have been thinking about this for a while now, and I feel it would
> > > be a good idea to remove the Required vector types from Drill, and
> > > only use the Nullable version of vectors. I think this will greatly
> > simplify the code.
> > > It will also simplify the creation of UDFs. As is, if a function has
> > > custom null handling (i.e. INTERNAL), the function has to be
> > > separately implemented for each permutation of nullability of the
> > > inputs. But if drill data types are always nullable, this wouldn't be a
> > problem.
> > >
> > > I don't think there would be much impact on performance. In practice,
> > > I think the required type is used very rarely. And there are other
> > > ways we can optimize for when a column is known to have no nulls.
> > >
> > > Thoughts?
> > >
> >
> >
> >
>
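As a concrete illustration of the "other ways we can optimize when a column is known to have no nulls" point in this thread, here is a self-contained toy sketch (the class and field names are invented for illustration and are not Drill's actual vector API): a nullable column that records its null count can take a branch-free fast path whenever that count is zero, so a "required" column becomes just a nullable column with zero nulls.

```java
// Toy nullable column: values plus a null bitmap and a cached null count.
// Illustration only -- not Drill's ValueVector classes.
final class NullableIntColumn {
    final int[] values;
    final boolean[] isNull;
    final int nullCount;

    NullableIntColumn(int[] values, boolean[] isNull) {
        this.values = values;
        this.isNull = isNull;
        int n = 0;
        for (boolean b : isNull) if (b) n++;
        this.nullCount = n;
    }

    // SUM that pays for per-value null checks only when nulls actually exist.
    long sum() {
        long total = 0;
        if (nullCount == 0) {
            for (int v : values) total += v;        // fast path: no null checks
        } else {
            for (int i = 0; i < values.length; i++) {
                if (!isNull[i]) total += values[i]; // slow path: check each slot
            }
        }
        return total;
    }
}
```

Under this scheme, executing with only nullable vectors need not cost much: the zero-null case degenerates to the same inner loop a required vector would use.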


[jira] [Created] (DRILL-4528) AVG() windows function is not optimized for limit 0 queries

2016-03-22 Thread Krystal (JIRA)
Krystal created DRILL-4528:
--

 Summary: AVG() windows function is not optimized for limit 0 
queries
 Key: DRILL-4528
 URL: https://issues.apache.org/jira/browse/DRILL-4528
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Krystal
Assignee: Sean Hsuan-Yi Chu


git.commit.id.abbrev=cee5317

The following sample query contains the AVG() window function, which is not 
optimized when wrapped with LIMIT 0:

select * from (
SELECT AVG(cast( col1 as BIGINT )) OVER(PARTITION BY cast( col4 as TIMESTAMP) 
ORDER BY cast( col5 as DATE )) FROM `fewRowsAllData_v`) t limit 0

Physical Plan:
{code}
00-00Screen : rowType = RecordType(ANY EXPR$0): rowcount = 1.0, cumulative 
cost = {469.1 rows, 5717.190984570043 cpu, 0.0 io, 0.0 network, 1872.0 memory}, 
id = 5762023
00-01  Project(EXPR$0=[$0]) : rowType = RecordType(ANY EXPR$0): rowcount = 
1.0, cumulative cost = {469.0 rows, 5717.090984570043 cpu, 0.0 io, 0.0 network, 
1872.0 memory}, id = 5762022
00-02SelectionVectorRemover : rowType = RecordType(ANY EXPR$0): 
rowcount = 1.0, cumulative cost = {469.0 rows, 5717.090984570043 cpu, 0.0 io, 
0.0 network, 1872.0 memory}, id = 5762021
00-03  Limit(fetch=[0]) : rowType = RecordType(ANY EXPR$0): rowcount = 
1.0, cumulative cost = {468.0 rows, 5716.090984570043 cpu, 0.0 io, 0.0 network, 
1872.0 memory}, id = 5762020
00-04Project(EXPR$0=[/(CastHigh($3), $4)]) : rowType = 
RecordType(ANY EXPR$0): rowcount = 78.0, cumulative cost = {468.0 rows, 
5716.090984570043 cpu, 0.0 io, 0.0 network, 1872.0 memory}, id = 5762019
00-05  Window(window#0=[window(partition {2} order by [1] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [SUM($0), COUNT($0)])]) : 
rowType = RecordType(BIGINT $0, DATE $1, TIMESTAMP(0) $2, BIGINT w0$o0, BIGINT 
w0$o1): rowcount = 78.0, cumulative cost = {390.0 rows, 5404.090984570043 cpu, 
0.0 io, 0.0 network, 1872.0 memory}, id = 5762018
00-06SelectionVectorRemover : rowType = RecordType(BIGINT $0, 
DATE $1, TIMESTAMP(0) $2): rowcount = 78.0, cumulative cost = {312.0 rows, 
5170.090984570043 cpu, 0.0 io, 0.0 network, 1872.0 memory}, id = 5762017
00-07  Sort(sort0=[$2], sort1=[$1], dir0=[ASC], dir1=[ASC]) : 
rowType = RecordType(BIGINT $0, DATE $1, TIMESTAMP(0) $2): rowcount = 78.0, 
cumulative cost = {234.0 rows, 5092.090984570043 cpu, 0.0 io, 0.0 network, 
1872.0 memory}, id = 5762016
00-08Project($0=[CAST(CAST($0):BIGINT):BIGINT], 
$1=[CAST(CAST($1):DATE):DATE], $2=[CAST(CAST($2):TIMESTAMP(0)):TIMESTAMP(0)]) : 
rowType = RecordType(BIGINT $0, DATE $1, TIMESTAMP(0) $2): rowcount = 78.0, 
cumulative cost = {156.0 rows, 1170.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id 
= 5762015
00-09  Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath 
[path=maprfs:///drill/testdata/window_functions/fewRowsAllData.parquet]], 
selectionRoot=maprfs:/drill/testdata/window_functions/fewRowsAllData.parquet, 
numFiles=1, usedMetadataFile=false, columns=[`col1`, `col5`, `col4`]]]) : 
rowType = RecordType(ANY col1, ANY col5, ANY col4): rowcount = 78.0, cumulative 
cost = {78.0 rows, 234.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 5762014
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
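For readers decoding the plan above: AVG over a window is rewritten into a running SUM and COUNT (the Window aggs [SUM($0), COUNT($0)]) followed by a division (the Project EXPR$0=[/(CastHigh($3), $4)]). A minimal standalone sketch of that computation for a single sorted partition, with invented names:

```java
// Running AVG over one ordered window partition, computed as SUM/COUNT
// per frame (UNBOUNDED PRECEDING to CURRENT ROW). Names are invented.
final class WindowAvgSketch {
    static double[] runningAvg(long[] ordered) {
        double[] out = new double[ordered.length];
        long sum = 0;   // mirrors window agg SUM($0)
        long count = 0; // mirrors window agg COUNT($0)
        for (int i = 0; i < ordered.length; i++) {
            sum += ordered[i];
            count++;
            out[i] = (double) sum / count; // mirrors Project: CastHigh(sum) / count
        }
        return out;
    }
}
```

The LIMIT 0 optimization in DRILL-3623 only needs the output type of this expression, so ideally none of the above work would run for the query in this report.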


[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-199964093
  
+1




[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-199962117
  
I have addressed @jinfengni 's and @hsuanyi 's comments 
[here](https://github.com/sudheeshkatkam/drill/commit/e4cfdfa9b0562d52ac07f6d80860a82fa8baba40)
 [I force pushed to this branch and somehow their comments are not referenced 
in this PR any longer.]




[jira] [Created] (DRILL-4527) Remove unnecessary code: DrillAvgVarianceConvertlet.java

2016-03-22 Thread MinJi Kim (JIRA)
MinJi Kim created DRILL-4527:


 Summary: Remove unnecessary code:  DrillAvgVarianceConvertlet.java
 Key: DRILL-4527
 URL: https://issues.apache.org/jira/browse/DRILL-4527
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Reporter: MinJi Kim
Assignee: MinJi Kim


DrillConvertletTable is used as a way to support custom functions.  For example, 
for EXTRACT(), DrillConvertletTable.get() returns DrillExtractConvertlet, which 
returns a custom RexNode for the extract function.

On the other hand, DrillAvgVarianceConvertlet is never used.  
stddev/avg/variance functions are handled by DrillAggregateRule and 
DrillReduceAggregatesRule.





[GitHub] drill pull request: DRILL-4525: Allow SqlBetweenOperator to accept...

2016-03-22 Thread hsuanyi
GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/439

DRILL-4525: Allow SqlBetweenOperator to accept LOWER_OPERAND and UPPE…

…R_OPERAND with different types

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-4525

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #439


commit 0f6bd0a398c017bdf40c01d2baab4050f8f00a2a
Author: Hsuan-Yi Chu 
Date:   2016-03-21T21:43:54Z

DRILL-4525: Allow SqlBetweenOperator to accept LOWER_OPERAND and 
UPPER_OPERAND with different types






Hangout Today?

2016-03-22 Thread Zelaine Fong
Are we having one?

-- Zelaine


[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-199910572
  
Sudheesh, could you respond to the comments made by Jin Feng? If you have 
already discussed it with him in person, could you post a summary here?

I am in favor of merging this PR, but just the last commit. I think the 
second commit should be separate, and should be done as an inserted operator 
(similar to IteratorValidator), rather than modifying the constructor for 
Screen.


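For context on the "inserted operator" approach mentioned above: instead of changing an existing operator's constructor, a pass-through operator is spliced into the pipeline that forwards each batch unchanged while doing its extra work on the side, which is roughly what IteratorValidator does for validity checks. A generic toy sketch, with all names invented:

```java
import java.util.Iterator;
import java.util.function.Consumer;

// Hypothetical pass-through operator: forwards every batch unchanged,
// invoking a side action (validation, schema capture, ...) on the way through.
final class PassThroughOperator<T> implements Iterator<T> {
    private final Iterator<T> upstream;
    private final Consumer<T> sideAction;

    PassThroughOperator(Iterator<T> upstream, Consumer<T> sideAction) {
        this.upstream = upstream;
        this.sideAction = sideAction;
    }

    @Override public boolean hasNext() { return upstream.hasNext(); }

    @Override public T next() {
        T batch = upstream.next();
        sideAction.accept(batch); // extra work without altering the data flow
        return batch;
    }
}
```

The design advantage is that downstream operators (here, Screen) stay untouched; the planner simply inserts or omits the wrapper.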


[GitHub] drill pull request: DRILL-1328: Support table statistics

2016-03-22 Thread vkorukanti
Github user vkorukanti commented on a diff in the pull request:

https://github.com/apache/drill/pull/425#discussion_r57026410
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/StatisticsAggrFunctions.java
 ---
@@ -0,0 +1,295 @@

+/***
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ 
**/
+
+/*
+ * This class is automatically generated from AggrTypeFunctions2.tdd using 
FreeMarker.
+ */
+
+package org.apache.drill.exec.expr.fn.impl;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillAggFunc;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import 
org.apache.drill.exec.expr.annotations.FunctionTemplate.NullHandling;
+import 
org.apache.drill.exec.expr.annotations.FunctionTemplate.FunctionScope;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.annotations.Workspace;
+import org.apache.drill.exec.expr.holders.BigIntHolder;
+import org.apache.drill.exec.expr.holders.NullableBigIntHolder;
+import org.apache.drill.exec.expr.holders.NullableVarBinaryHolder;
+import org.apache.drill.exec.expr.holders.ObjectHolder;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+
+import javax.inject.Inject;
+
+@SuppressWarnings("unused")
+public class StatisticsAggrFunctions {
+  static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(StatisticsAggrFunctions.class);
+
+  @FunctionTemplate(name = "statcount", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class StatCount implements DrillAggFunc {
+@Param
+FieldReader in;
+@Workspace
+BigIntHolder count;
+@Output
+NullableBigIntHolder out;
+
+@Override
+public void setup() {
+  count = new BigIntHolder();
+}
+
+@Override
+public void add() {
+  count.value++;
+}
+
+@Override
+public void output() {
+  out.isSet = 1;
+  out.value = count.value;
+}
+
+@Override
+public void reset() {
+  count.value = 0;
+}
+  }
+
+  @FunctionTemplate(name = "nonnullstatcount", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class NonNullStatCount implements DrillAggFunc {
+@Param
+FieldReader in;
+@Workspace
+BigIntHolder count;
+@Output
+NullableBigIntHolder out;
+
+@Override
+public void setup() {
+  count = new BigIntHolder();
+}
+
+@Override
+public void add() {
+  if (in.isSet()) {
+count.value++;
+  }
+}
+
+@Override
+public void output() {
+  out.isSet = 1;
+  out.value = count.value;
+}
+
+@Override
+public void reset() {
+  count.value = 0;
+}
+  }
+
+  @FunctionTemplate(name = "hll", scope = 
FunctionTemplate.FunctionScope.POINT_AGGREGATE)
+  public static class HllFieldReader implements DrillAggFunc {
+@Param
+FieldReader in;
+@Workspace
+ObjectHolder work;
+@Output
+NullableVarBinaryHolder out;
+@Inject
+DrillBuf buffer;
+
+@Override
+public void setup() {
+  work = new ObjectHolder();
+  work.obj = new 
com.clearspring.analytics.stream.cardinality.HyperLogLog(10);
+}
+
+@Override
+public void add() {
+  if (work.obj != null) {
+com.clearspring.analytics.stream.cardinality.HyperLogLog hll =
--- End diff --

From [1], it is released under the Apache 2.0 license.

[1] 
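For readers unfamiliar with the approximate-distinct-count idea behind the hll aggregate in the diff: Drill's patch uses stream-lib's HyperLogLog, but the general shape of a fixed-memory cardinality sketch can be shown with a much simpler self-contained stand-in, a linear counter (invented class, illustration only, not the HLL algorithm itself):

```java
import java.util.BitSet;

// Toy linear-counting cardinality estimator: hash each value to one bit of a
// fixed-size bitmap, then estimate distincts from the fraction of zero bits.
// Stand-in for the HyperLogLog used by the actual patch.
final class LinearCounter {
    private final BitSet bits;
    private final int m;

    LinearCounter(int m) {
        this.m = m;
        this.bits = new BitSet(m);
    }

    void offer(Object o) {
        int h = Math.floorMod(o.hashCode() * 0x9E3779B9, m); // scramble and bucket
        bits.set(h);
    }

    long cardinality() {
        int zeros = m - bits.cardinality();
        if (zeros == 0) return m; // bitmap saturated
        return Math.round(m * Math.log((double) m / zeros)); // linear-counting estimate
    }
}
```

Like HLL, memory is fixed regardless of input size and duplicates are free (they set an already-set bit), which is what makes such sketches usable as streaming aggregate functions.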

Re: [DISCUSS] Remove required type

2016-03-22 Thread Aman Sinha
While it is true that there is code complexity due to the required type,
what would we be trading off? Some important considerations:
  - We don't currently have null count statistics which would need to be
implemented for various data sources
  - Primary keys in the RDBMS sources (or rowkeys in hbase) are always
non-null, and although today we may not be doing optimizations to leverage
that,  one could easily add a rule that converts  WHERE primary_key IS NULL
to a FALSE filter.


On Tue, Mar 22, 2016 at 7:31 AM, Dave Oshinsky 
wrote:

> Hi Jacques,
> Marginally related to this, I made a small change in PR-372 (DRILL-4184)
> to support variable widths for decimal quantities in Parquet.  I found the
> (decimal) vectoring code to be very difficult to understand (probably
> because it's overly complex, but also because I'm new to Drill code in
> general), so I made a small, surgical change in my pull request to support
> keeping track of variable widths (lengths) and null booleans within the
> existing fixed width decimal vectoring scheme.  Can my changes be
> reviewed/accepted, and then we discuss how to fix properly long-term?
>
> Thanks,
> Dave Oshinsky
>
> -Original Message-
> From: Jacques Nadeau [mailto:jacq...@dremio.com]
> Sent: Monday, March 21, 2016 11:43 PM
> To: dev
> Subject: Re: [DISCUSS] Remove required type
>
> Definitely in support of this. The required type is a huge maintenance and
> code complexity nightmare that provides little to no benefit. As you point
> out, we can do better performance optimizations through null count
> observation since most sources are nullable anyway.
> On Mar 21, 2016 7:41 PM, "Steven Phillips"  wrote:
>
> > I have been thinking about this for a while now, and I feel it would
> > be a good idea to remove the Required vector types from Drill, and
> > only use the Nullable version of vectors. I think this will greatly
> simplify the code.
> > It will also simplify the creation of UDFs. As is, if a function has
> > custom null handling (i.e. INTERNAL), the function has to be
> > separately implemented for each permutation of nullability of the
> > inputs. But if drill data types are always nullable, this wouldn't be a
> problem.
> >
> > I don't think there would be much impact on performance. In practice,
> > I think the required type is used very rarely. And there are other
> > ways we can optimize for when a column is known to have no nulls.
> >
> > Thoughts?
> >
>
>
>


Next Release

2016-03-22 Thread Jacques Nadeau
Hey All,

I'd like to volunteer to be the 1.7 release manager. I'd also like to plan
putting together a target feature list for the release now so we can all
plan ahead. I'll share an initial stab at this later today if people think
that sounds good.

Thanks
Jacques


[GitHub] drill pull request: DRILL-4525: Allow SqlBetweenOperator to accept...

2016-03-22 Thread hsuanyi
Github user hsuanyi closed the pull request at:

https://github.com/apache/drill/pull/438




RE: [DISCUSS] Remove required type

2016-03-22 Thread Dave Oshinsky
Hi Jacques,
Marginally related to this, I made a small change in PR-372 (DRILL-4184) to 
support variable widths for decimal quantities in Parquet.  I found the 
(decimal) vectoring code to be very difficult to understand (probably because 
it's overly complex, but also because I'm new to Drill code in general), so I 
made a small, surgical change in my pull request to support keeping track of 
variable widths (lengths) and null booleans within the existing fixed width 
decimal vectoring scheme.  Can my changes be reviewed/accepted, and then we 
discuss how to fix properly long-term?

Thanks,
Dave Oshinsky

-Original Message-
From: Jacques Nadeau [mailto:jacq...@dremio.com] 
Sent: Monday, March 21, 2016 11:43 PM
To: dev
Subject: Re: [DISCUSS] Remove required type

Definitely in support of this. The required type is a huge maintenance and code 
complexity nightmare that provides little to no benefit. As you point out, we 
can do better performance optimizations through null count observation since 
most sources are nullable anyway.
On Mar 21, 2016 7:41 PM, "Steven Phillips"  wrote:

> I have been thinking about this for a while now, and I feel it would 
> be a good idea to remove the Required vector types from Drill, and 
> only use the Nullable version of vectors. I think this will greatly simplify 
> the code.
> It will also simplify the creation of UDFs. As is, if a function has 
> custom null handling (i.e. INTERNAL), the function has to be 
> separately implemented for each permutation of nullability of the 
> inputs. But if drill data types are always nullable, this wouldn't be a 
> problem.
>
> I don't think there would be much impact on performance. In practice, 
> I think the required type is used very rarely. And there are other 
> ways we can optimize for when a column is known to have no nulls.
>
> Thoughts?
>



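A self-contained toy version of the technique Dave describes (the real change lives in Drill's Parquet/decimal vector code; every name below is invented for illustration): variable-width decimal values packed into one buffer, with per-value offsets tracking the widths and a per-value null flag.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy variable-width decimal column: one byte buffer, per-value offsets,
// and per-value null flags. Illustration only, not Drill's vector code.
final class VarDecimalColumn {
    private final ByteArrayOutputStream data = new ByteArrayOutputStream();
    private final List<Integer> offsets = new ArrayList<>(List.of(0));
    private final List<Boolean> isNull = new ArrayList<>();

    void add(BigDecimal v) {
        if (v == null) {
            isNull.add(true);
            offsets.add(data.size());          // zero-width slot for a null
        } else {
            byte[] bytes = v.unscaledValue().toByteArray(); // minimal width
            data.write(bytes, 0, bytes.length);
            isNull.add(false);
            offsets.add(data.size());
        }
    }

    BigDecimal get(int i, int scale) {
        if (isNull.get(i)) return null;
        int start = offsets.get(i), end = offsets.get(i + 1);
        byte[] slice = Arrays.copyOfRange(data.toByteArray(), start, end);
        return new BigDecimal(new BigInteger(slice), scale);
    }
}
```

The point of the scheme is that small decimals stop paying for the maximum fixed width, while the offset list preserves random access and the flag list preserves nullability.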

Re: Help in Drill

2016-03-22 Thread Pawan Pawar
Thanks for your reply :)
Yes, I have written the in-memory database code myself, without any
framework, and the tables' data are not distributed. It is not a distributed
database; I want to make it distributed, meaning that when I run a query it
will run in a distributed fashion. Can Drill help me with this, or do I need
another tool (and if so, which one)?

On Tue, Mar 22, 2016 at 7:12 PM, John Omernik  wrote:

> When you say you have your own in memory database, is it a distributed in
> memory database?  Is it something you wrote? Is it a framework running on
> something like Alluxio? If it's distributed, you could write a storage
> plugin for it, and allow Drill to query it in a distributed fashion.  If
> it's not distributed, you could run Drill in embedded mode and still write
> a storage plugin to query it (but losing some of the distributed power of
> drill).  If your in memory database has JDBC bindings, you may be able to
> use the JDBC storage plugin to connect to it; that will take some
> testing and may come up with issues but they should be fixable.  To really
> help you, there needs to be more information.  As to being confused about
> the Drill functionality, I found that reading through the docs here:
> https://drill.apache.org/docs/drill-introduction/ can be very helpful.
>
> Happy Drilling!
>
> On Tue, Mar 22, 2016 at 5:10 AM, Pawan Pawar  wrote:
>
> > Hello Drill team, I am Pawan Pawar. Can you please help me? I have a few
> > questions regarding Drill functionality. I have my own in-memory
> > database, and I want it to be a distributed system. Can I use Drill to
> > make it distributed, and how?
> >
> > Please help me; I am very confused about Drill's functionality.
> >
> > --
> > *Thanks & Regards*
> > *Pawan Pawar*
> > *Mobile: +91 9993585256*
> > *Email: pawarem...@gmail.com *
> > *Skype: pawarskype*
> >
>



-- 
*Thanks & Regards*
*Pawan Pawar*
*Mobile: +91 9993585256*
*Email: pawarem...@gmail.com *
*Skype: pawarskype*


Re: Help in Drill

2016-03-22 Thread John Omernik
When you say you have your own in-memory database, is it a distributed
in-memory database? Is it something you wrote? Is it a framework running on
something like Alluxio? If it's distributed, you could write a storage
plugin for it and allow Drill to query it in a distributed fashion. If
it's not distributed, you could run Drill in embedded mode and still write
a storage plugin to query it (but lose some of the distributed power of
Drill). If your in-memory database has JDBC bindings, you may be able to
use the JDBC storage plugin to connect to it; that will take some
testing and may turn up issues, but they should be fixable. To really
help you, there needs to be more information. As for being confused about
Drill's functionality, I found that reading through the docs here can be
very helpful: https://drill.apache.org/docs/drill-introduction/

Happy Drilling!

On Tue, Mar 22, 2016 at 5:10 AM, Pawan Pawar  wrote:

> Hello Drill team, I am Pawan Pawar. Can you please help me? I have a few
> questions regarding Drill functionality. I have my own in-memory database,
> and I want it to be a distributed system. Can I use Drill to make it
> distributed, and how?
>
> Please help me; I am very confused about Drill's functionality.
>
> --
> *Thanks & Regards*
> *Pawan Pawar*
> *Mobile: +91 9993585256*
> *Email: pawarem...@gmail.com *
> *Skype: pawarskype*
>

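
If the in-memory database does expose JDBC bindings, the JDBC storage plugin route mentioned above amounts to registering a configuration like the following through Drill's Web UI (Storage tab). The driver class, URL, and credentials below are placeholders for whatever your database actually provides, and the driver jar must be on Drill's classpath:

```json
{
  "type": "jdbc",
  "driver": "com.example.memdb.Driver",
  "url": "jdbc:memdb://localhost:5000/mydb",
  "username": "drill",
  "password": "drill",
  "enabled": true
}
```

Once registered under a name such as `memdb`, tables become queryable as `SELECT * FROM memdb.schema_name.table_name`, though how well predicates push down depends on the JDBC driver.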

Help in Drill

2016-03-22 Thread Pawan Pawar
Hello Drill team, I am Pawan Pawar. Can you please help me? I have a few
questions regarding Drill functionality. I have my own in-memory database,
and I want it to be a distributed system. Can I use Drill to make it
distributed, and how?

Please help me; I am very confused about Drill's functionality.

-- 
*Thanks & Regards*
*Pawan Pawar*
*Mobile: +91 9993585256*
*Email: pawarem...@gmail.com *
*Skype: pawarskype*


[GitHub] drill pull request: DRILL-4514 : Add describe schema ...

2016-03-22 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/436#discussion_r56957772
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/parser/SqlDescribeSchema.java
 ---
@@ -0,0 +1,82 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.parser;
+
+import org.apache.calcite.sql.SqlCall;
+import org.apache.calcite.sql.SqlIdentifier;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.calcite.sql.SqlLiteral;
+import org.apache.calcite.sql.SqlNode;
+import org.apache.calcite.sql.SqlOperator;
+import org.apache.calcite.sql.SqlSpecialOperator;
+import org.apache.calcite.sql.SqlWriter;
+import org.apache.calcite.sql.parser.SqlParserPos;
+import org.apache.drill.exec.planner.sql.handlers.AbstractSqlHandler;
+import org.apache.drill.exec.planner.sql.handlers.DescribeSchemaHandler;
+import org.apache.drill.exec.planner.sql.handlers.SqlHandlerConfig;
+
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Sql parse tree node to represent statement:
+ * DESCRIBE SCHEMA schema_name
+ */
+public class SqlDescribeSchema extends DrillSqlCall {
+
+  private final SqlIdentifier schema;
+
+  public static final SqlSpecialOperator OPERATOR =
+  new SqlSpecialOperator("DESCRIBE_SCHEMA", SqlKind.OTHER) {
+    @Override
+    public SqlCall createCall(SqlLiteral functionQualifier, SqlParserPos pos, SqlNode... operands) {
+      return new SqlDescribeSchema(pos, (SqlIdentifier) operands[0]);
+    }
+  };
+
+  public SqlDescribeSchema(SqlParserPos pos, SqlIdentifier schema) {
+    super(pos);
+    this.schema = schema;
+    assert schema != null;
--- End diff --

Agreed, Preconditions is better. But I have checked that schema can't come
in as null, so I have removed the not-null check altogether.
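
For reference, the assert-versus-precondition distinction under discussion: `assert` statements are skipped unless the JVM runs with `-ea`, while an explicit check (Guava's `Preconditions.checkNotNull`, or the JDK's `Objects.requireNonNull` used in this sketch) always fails fast. A minimal illustration, not Drill code:

```java
import java.util.Objects;

public class NullCheckDemo {
  // With a plain assert, a null argument slips through silently
  // unless the JVM was started with -ea (enable assertions).
  static String withAssert(String schema) {
    assert schema != null;
    return schema;
  }

  // An explicit check (same behavior as Guava's Preconditions.checkNotNull)
  // throws NullPointerException regardless of JVM flags.
  static String withCheck(String schema) {
    return Objects.requireNonNull(schema, "schema must not be null");
  }

  public static void main(String[] args) {
    boolean asserted;
    try {
      withAssert(null);            // only fires when assertions are enabled
      asserted = false;
    } catch (AssertionError e) {
      asserted = true;
    }
    System.out.println("assert fired: " + asserted);
    try {
      withCheck(null);
    } catch (NullPointerException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

This is why reviewers usually prefer an explicit precondition over `assert` for argument validation, unless the check is provably redundant, as the author concluded here.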


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request: DRILL-4514 : Add describe schema ...

2016-03-22 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/436#discussion_r56957490
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/DescribeSchemaCommandResult.java
 ---
@@ -0,0 +1,30 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql.handlers;
+
+public class DescribeSchemaCommandResult {
+
+  public String name;
--- End diff --

Done.




[GitHub] drill pull request: DRILL-4525: Allow SqlBetweenOperator to accept...

2016-03-22 Thread hsuanyi
GitHub user hsuanyi opened a pull request:

https://github.com/apache/drill/pull/438

DRILL-4525: Allow SqlBetweenOperator to accept LOWER_OPERAND and UPPER_OPERAND with different types

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hsuanyi/incubator-drill DRILL-4525

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #438


commit 5884e9b923057f1e9ed77ecf0ba8531e4d2f03b7
Author: Hsuan-Yi Chu 
Date:   2016-03-21T21:43:54Z

DRILL-4525: Allow SqlBetweenOperator to accept LOWER_OPERAND and 
UPPER_OPERAND with different types
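
For context, the kind of query this change targets is a BETWEEN predicate whose bounds have different types, e.g. an integer lower bound and a decimal upper bound. A hypothetical illustration against Drill's bundled classpath sample data, not a test case taken from the patch:

```sql
-- Mixed-type BETWEEN bounds: INTEGER lower operand, DECIMAL upper operand.
-- The change lets the validator accept such operands instead of
-- requiring both bounds to share a type.
SELECT employee_id
FROM cp.`employee.json`
WHERE employee_id BETWEEN 1 AND 10.5;
```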






[GitHub] drill pull request: DRILL-3623: For limit 0 queries, use a shorter...

2016-03-22 Thread hsuanyi
Github user hsuanyi commented on the pull request:

https://github.com/apache/drill/pull/405#issuecomment-199650953
  
Except for my comments (which are for future improvements and can be filed
as follow-up JIRA issues), LGTM. +1 (non-binding)


