[GitHub] drill pull request #710: DRILL-5126: Provide simplified, unified "cluster fi...

2016-12-27 Thread paul-rogers
GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/710

DRILL-5126: Provide simplified, unified "cluster fixture" for test

Drill provides a robust selection of test frameworks that have evolved to 
satisfy the needs of a variety of test cases. However, each does only some of 
what a given test needs, while others do other parts. Also, the various 
frameworks make assumptions (in the form of boot-time configuration) that 
differ from what some tests may need, forcing the test to start, then stop, 
then restart a Drillbit, which is an expensive operation.

Also, many ways exist to run queries, but each does only part of the job. 
Several ways exist to change runtime options.

This checkin shamelessly grabs the best parts from existing frameworks, 
adds a fluent builder facade and provides a complete, versatile test framework 
for new tests. Old tests are unaffected by this new code.

An adjustment was made to allow use of the existing TestBuilder mechanism. 
TestBuilder used to depend on static members of BaseTestQuery. A "shim" allows 
the same code to work in the old way for old tests, but with the new 
ClusterFixture for new tests.

Details are in the org.apache.drill.test.package-info.java file.

This commit modifies a single test case, TestSimpleExternalSort, to use the 
new framework. More cases will follow once this framework itself is committed.

Also, the framework will eventually allow use of the extended mock data 
source from SQL. However, that change must await checkin of the mock data 
source changes.
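
As a rough illustration of the "fluent builder facade" idea described above, 
here is a minimal sketch. DemoFixture, its option() method and the option names 
are hypothetical stand-ins invented for this example; they are not the actual 
ClusterFixture API, which is documented in package-info.java. The point is only 
that all boot-time options are collected before startup, so a test does not 
need a start/stop/restart cycle to change them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a test fixture; NOT Drill's actual ClusterFixture API.
final class DemoFixture implements AutoCloseable {
  private final Map<String, Object> bootOptions;
  private boolean running = true;

  private DemoFixture(Map<String, Object> bootOptions) {
    this.bootOptions = bootOptions;
  }

  static Builder builder() {
    return new Builder();
  }

  Object bootOption(String key) {
    return bootOptions.get(key);
  }

  boolean isRunning() {
    return running;
  }

  @Override
  public void close() {
    // A real fixture would shut down its server here.
    running = false;
  }

  static final class Builder {
    private final Map<String, Object> bootOptions = new LinkedHashMap<>();

    // Each setter returns the builder so calls can be chained.
    Builder option(String key, Object value) {
      bootOptions.put(key, value);
      return this;
    }

    // Options are fixed before "startup", so no restart is needed to apply them.
    DemoFixture build() {
      return new DemoFixture(new LinkedHashMap<>(bootOptions));
    }
  }
}

class DemoFixtureExample {
  public static void main(String[] args) {
    // try-with-resources guarantees the fixture is closed when the test ends.
    try (DemoFixture fixture = DemoFixture.builder()
        .option("demo.slice_target", 10)   // hypothetical option name
        .option("demo.max_width", 2)       // hypothetical option name
        .build()) {
      System.out.println(fixture.bootOption("demo.slice_target"));
    }
  }
}
```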

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5126

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/710.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #710


commit 1df75e40535209c74f4e8dd807ced560bbea2e50
Author: Paul Rogers 
Date:   2016-12-13T21:41:23Z

DRILL-5126: Provide simplified, unified "cluster fixture" for test

Drill provides a robust selection of test frameworks that have evolved to 
satisfy the needs of a variety of test cases. However, each does only some of 
what a given test needs, while others do other parts. Also, the various 
frameworks make assumptions (in the form of boot-time configuration) that 
differ from what some tests may need, forcing the test to start, then stop, 
then restart a Drillbit, which is an expensive operation.

Also, many ways exist to run queries, but each does only part of the job. 
Several ways exist to change runtime options.

This checkin shamelessly grabs the best parts from existing frameworks, 
adds a fluent builder facade and provides a complete, versatile test framework 
for new tests. Old tests are unaffected by this new code.

An adjustment was made to allow use of the existing TestBuilder mechanism. 
TestBuilder used to depend on static members of BaseTestQuery. A "shim" allows 
the same code to work in the old way for old tests, but with the new 
ClusterFixture for new tests.

Details are in the org.apache.drill.test.package-info.java file.

This commit modifies a single test case, TestSimpleExternalSort, to use the 
new framework. More cases will follow once this framework itself is committed.

Also, the framework will eventually allow use of the extended mock data 
source from SQL. However, that change must await checkin of the mock data 
source changes.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #709: DRILL-5156: BootStrapContext should close threads

2016-12-27 Thread paul-rogers
GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/709

DRILL-5156: BootStrapContext should close threads

The Bit-Client thread (that's the thread name) finds a closed allocator in 
the TestDrillbitResilience unit test. This fix (along with DRILL-5157) 
eliminates two run-time problems seen in this unit test.

BootStrapContext creates two thread pools, but does not close them. This 
allows the code running in the threads to attempt to access their allocators 
after the allocator is closed. This fix ensures that the
thread pools are closed to avoid the issue.
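
The ordering described above can be sketched with plain java.util.concurrent 
classes. This is only an illustration under assumptions: PooledContext, its 
field names and the boolean that stands in for the allocator are hypothetical, 
not the actual BootStrapContext code. The idea is to shut down the pools and 
wait for their tasks before releasing the resource those tasks use.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical context owning two pools; names are illustrative only.
final class PooledContext implements AutoCloseable {
  private final ExecutorService firstPool = Executors.newFixedThreadPool(2);
  private final ExecutorService secondPool = Executors.newCachedThreadPool();
  private boolean resourceClosed = false; // stands in for the allocator

  ExecutorService firstPool() {
    return firstPool;
  }

  ExecutorService secondPool() {
    return secondPool;
  }

  boolean isResourceClosed() {
    return resourceClosed;
  }

  @Override
  public void close() {
    // Stop the pools first so no task can touch the resource after it is released.
    shutdown(firstPool);
    shutdown(secondPool);
    resourceClosed = true;
  }

  private static void shutdown(ExecutorService pool) {
    pool.shutdown(); // reject new tasks, let queued ones finish
    try {
      if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
        pool.shutdownNow(); // interrupt tasks that did not finish in time
      }
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt status
    }
  }

  public static void main(String[] args) {
    try (PooledContext context = new PooledContext()) {
      context.firstPool().submit(() -> System.out.println("task ran"));
    }
  }
}
```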

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5156

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/709.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #709


commit 73607c33c78dd89373412c8e4c70356fa19f81fd
Author: Paul Rogers 
Date:   2016-12-28T01:45:16Z

DRILL-5156: BootStrapContext should close threads

Bit-Client thread finds closed allocator in TestDrillbitResilience unit
test. This fix (along with DRILL-5157) eliminates two run-time problems
seen in this unit test.

BootStrapContext creates two thread pools, but does not close them.
This allows the code running in the threads to attempt to access their
allocators after the allocator is closed. This fix ensures that the
thread pools are closed to avoid the issue.






[GitHub] drill issue #698: DRILL-5136: Server unable to prepare non SELECT queries

2016-12-27 Thread superbstreak
Github user superbstreak commented on the issue:

https://github.com/apache/drill/pull/698
  
Closing PR due to widened scope.




[GitHub] drill pull request #698: DRILL-5136: Server unable to prepare non SELECT que...

2016-12-27 Thread superbstreak
Github user superbstreak closed the pull request at:

https://github.com/apache/drill/pull/698




[GitHub] drill pull request #708: DRILL-5152: Enhance the mock data source: better da...

2016-12-27 Thread paul-rogers
GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/708

DRILL-5152: Enhance the mock data source: better data, SQL access

Provides an enhanced version of the mock data source. See the JIRA entry 
for motivation, package-info.java for details of operation.

Allows tests to write queries of the form:
```
select id_i, name_s50 from `mock`.`employee_1K` ...
```
Here, id_i is a field of random, uniformly distributed integers and 
name_s50 is a VARCHAR column of width 50 containing randomly generated strings. 
The _1K suffix says to generate 1000 rows. The names are just for convenience; 
the suffixes tell the mock data source what to generate.
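
To make the naming convention concrete, here is a small sketch of how such 
suffixes could be decoded. It is not the mock data source's own parser (see 
package-info.java for the real rules). MockNames and its methods are 
hypothetical, the "INT" label is an assumption, and only the three suffix forms 
mentioned above (_i, _s<width>, _<n>K) are handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative decoder for the suffix convention; not the mock data source's actual code.
final class MockNames {
  // Table names end in "_<n>K"; treating K as a multiplier of 1000
  // generalizes the single "_1K" case given in the description.
  private static final Pattern TABLE = Pattern.compile(".*_(\\d+)K");
  // Column names end in "_i" (integer) or "_s<width>" (VARCHAR of that width).
  private static final Pattern COLUMN = Pattern.compile(".*_(?:i|s(\\d+))");

  static int rowCount(String tableName) {
    Matcher m = TABLE.matcher(tableName);
    if (!m.matches()) {
      throw new IllegalArgumentException("No row-count suffix: " + tableName);
    }
    return Integer.parseInt(m.group(1)) * 1000;
  }

  static String columnType(String columnName) {
    Matcher m = COLUMN.matcher(columnName);
    if (!m.matches()) {
      throw new IllegalArgumentException("No type suffix: " + columnName);
    }
    String width = m.group(1); // null when the suffix is "_i"
    return width == null ? "INT" : "VARCHAR(" + width + ")";
  }

  public static void main(String[] args) {
    System.out.println(columnType("id_i"));
    System.out.println(columnType("name_s50"));
    System.out.println(rowCount("employee_1K"));
  }
}
```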

Examples of use will appear in a later commit that includes a revised test 
framework. Existing tests that use the physical plan version of the mock data 
source work as before.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5152

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/708.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #708


commit e9741ce59621209e18cf405c8fe4a614d955ed2a
Author: Paul Rogers 
Date:   2016-12-22T05:47:20Z

DRILL-5152: Enhance the mock data source: better data, SQL access

Provides an enhanced version of the mock data source. See the JIRA
entry for motivation, package-info.java for details of operation.






[GitHub] drill pull request #707: DRILL-5157: Multiple Snappy versions on class path

2016-12-27 Thread paul-rogers
GitHub user paul-rogers opened a pull request:

https://github.com/apache/drill/pull/707

DRILL-5157: Multiple Snappy versions on class path

Multiple Snappy versions on class path; causes unit test failures.

Drill's pom.xml files bring in multiple Snappy versions. Drill itself 
brings in a very old version that has a known problem loading the snappy native 
library. Other libraries bring in a newer version. The one that ends up first 
on the class path is non-deterministic, leading to random test failures.

This fix updates the Snappy library to the latest and adds dependency 
management to exclude older versions brought in by Avro and Parquet.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/paul-rogers/drill DRILL-5157

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/707.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #707


commit 909efcec92c4a259d139dfc4008c431969d061f7
Author: Paul Rogers 
Date:   2016-12-28T01:21:09Z

DRILL-5157: Multiple Snappy versions on class path

Multiple Snappy versions on class path; causes unit test failures.

This fix updates the Snappy library and adds dependency management to
exclude older versions brought in by Avro and Parquet.






Apache Drill Hangout Minutes - 12/27/16

2016-12-27 Thread Arina Yelchiyeva
Attendees: John, Vitalli, Roman, Serhii, Arina

1) John: email about an NPE when selecting from a text file with options
(email to the user list from Dec 8). It is hard to tell from the error what
caused the failure. A data set is probably needed to find the root cause.

2) Arina: when the storage plugin indicates that text files have headers,
the user can reference only column names in the select statement, not
columns[0], columns[1] (as in files without headers). It would be useful to
allow both. It would also solve the issue with select count(1) from a text
file with headers, which currently results in an exception (DRILL-4919).


Re: [HANGOUT] Suggestions for topics for 27/12

2016-12-27 Thread Arina Yelchiyeva
Hangout is starting now.

On Mon, Dec 26, 2016 at 1:23 PM, Arina Yelchiyeva <
arina.yelchiy...@gmail.com> wrote:

> Hi all,
>
> The bi-weekly hangout is tomorrow (27/12/16, 10 AM PST). If you have any
> suggestions for topics for tomorrow, please add them to this thread. We
> will also ask for topics at the beginning of the hangout.
>
> Hangout link:
> https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc
>
> Kind regards
> Arina
>


[GitHub] drill pull request #520: DRILL-3510: Add ANSI_QUOTES option so that Drill's ...

2016-12-27 Thread vdiravka
Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/520#discussion_r93939395
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/PlannerSettings.java
 ---
@@ -241,6 +244,10 @@ public long getInSubqueryThreshold() {
 return options.getOption(IN_SUBQUERY_THRESHOLD);
   }
 
+  public boolean isAnsiQuotesEnabled() {
--- End diff --

For this reason, it may make sense to implement a string option rather 
than the boolean ANSI_QUOTES option, so that the necessary lexical policy can 
be selected. I raised this question in JIRA 
[DRILL-3510](https://issues.apache.org/jira/browse/DRILL-3510?focusedCommentId=15780561&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15780561)
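
A rough sketch of what a string-valued option could look like, in contrast 
to the boolean isAnsiQuotesEnabled() in the diff. Everything here is 
hypothetical: the QuotingPolicy enum, its two values and the convention that 
the option value is the quote character itself are assumptions made for 
illustration, not a proposal for the actual option name or values.

```java
// Hypothetical sketch: the enum, its values and the option strings are illustrative.
enum QuotingPolicy {
  BACK_TICK("`"),
  DOUBLE_QUOTE("\"");

  private final String quote;

  QuotingPolicy(String quote) {
    this.quote = quote;
  }

  // Wraps an identifier in this policy's quote character.
  String quote(String identifier) {
    return quote + identifier + quote;
  }

  // Maps the string option value to a policy and rejects unknown values,
  // so adding a policy later means adding an enum constant, not a new boolean.
  static QuotingPolicy fromOption(String value) {
    for (QuotingPolicy policy : values()) {
      if (policy.quote.equals(value)) {
        return policy;
      }
    }
    throw new IllegalArgumentException("Unsupported quoting character: " + value);
  }
}

class QuotingPolicyExample {
  public static void main(String[] args) {
    System.out.println(QuotingPolicy.fromOption("\"").quote("col_int"));
    System.out.println(QuotingPolicy.fromOption("`").quote("col_int"));
  }
}
```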




[GitHub] drill pull request #706: DRILL-5039: NPE - CTAS PARTITION BY (

2016-12-27 Thread arina-ielchiieva
GitHub user arina-ielchiieva opened a pull request:

https://github.com/apache/drill/pull/706

DRILL-5039: NPE - CTAS PARTITION BY ()

1. Moved varchar `newPartitionValue` functions to `NewValueFunctions` 
template:
a. took advantage of code generation for varchar `newPartitionValue` 
functions;
b. all `newPartitionValue` functions will be in one file.
2. Fixed logic for reassigning nullable varchar holders during 
`newPartitionValue` function execution (previously resulted in NPE if varchar 
partition column contained nulls).
3. Added `alltypes_optional.parquet` (each type contains at least one null 
value) in test resources and updated `alltypes_required.parquet` to have the 
same types as in `alltypes_optional.parquet`.
4. Added new unit test `testPartitionByForAllTypes` to cover all types and 
cases (with and without nulls) during CTAS with partitioning.
5. Updated `minMaxEmptyNonNullableInput` test to reflect changes in 
`alltypes_required.parquet`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arina-ielchiieva/drill DRILL-5039

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/706.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #706


commit 3521339a8a39dc8cef5b769ee8abce00e6a88ba7
Author: Arina Ielchiieva 
Date:   2016-12-23T17:51:38Z

DRILL-5039: NPE - CTAS PARTITION BY ()






[jira] [Created] (DRILL-5165) wrong results - LIMIT ALL and OFFSET clause in same query

2016-12-27 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-5165:
-

 Summary: wrong results - LIMIT ALL and OFFSET clause in same query
 Key: DRILL-5165
 URL: https://issues.apache.org/jira/browse/DRILL-5165
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.10.0
Reporter: Khurram Faraaz
Priority: Critical


This issue was reported by a user on Drill's user list.
Drill 1.10.0 commit ID : bbcf4b76

I tried a similar query on Apache Drill 1.10.0, and Drill returns wrong results 
compared to Postgres for a query that uses LIMIT ALL and an OFFSET clause in 
the same query. We need to file a JIRA to track this issue.

{noformat}
0: jdbc:drill:schema=dfs.tmp> select col_int from typeall_l order by 1 limit all offset 10;
+----------+
| col_int  |
+----------+
+----------+
No rows selected (0.211 seconds)
0: jdbc:drill:schema=dfs.tmp> select col_int from typeall_l order by col_int limit all offset 10;
+----------+
| col_int  |
+----------+
+----------+
No rows selected (0.24 seconds)
{noformat}

Query => select col_int from typeall_l limit all offset 10;
Drill 1.10.0 returns 85 rows

whereas for same query,
postgres=# select col_int from typeall_l limit all offset 10;
Postgres 9.3 returns 95 rows, which is the correct expected result.
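
For reference, the expected semantics can be sketched in a few lines: LIMIT 
ALL places no upper bound on the result, so only the OFFSET applies. The 
LimitAllOffset class is hypothetical, and the 105-row table size is an 
inference from the Postgres result (95 rows returned after skipping 10), not a 
number stated in the report.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

final class LimitAllOffset {
  // LIMIT ALL imposes no row limit, so the result is every row after the first `offset`.
  static <T> List<T> apply(List<T> rows, long offset) {
    return rows.stream().skip(offset).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed table size: 95 rows returned by Postgres + 10 rows skipped.
    List<Integer> rows = IntStream.range(0, 105).boxed().collect(Collectors.toList());
    System.out.println(apply(rows, 10).size()); // prints 95
  }
}
```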

Query plan for above query that returns wrong results

{noformat}
0: jdbc:drill:schema=dfs.tmp> explain plan for select col_int from typeall_l 
limit all offset 10;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(col_int=[$0])
00-02        SelectionVectorRemover
00-03          Limit(offset=[10])
00-04            Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///tmp/typeall_l]], selectionRoot=maprfs:/tmp/typeall_l, numFiles=1, usedMetadataFile=false, columns=[`col_int`]]])
{noformat} 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Get/Set Session Options from UDFs

2016-12-27 Thread Nagarajan Chinnasamy
Hi,

I need to get/set session options from a UDF that I am developing. Please
let me know how I can do this.

Best Regards,
Nagu.