[jira] [Created] (DRILL-6400) Hash-Aggr: Avoid recreating common Hash-Table setups for every partition

2018-05-09 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-6400:
---

 Summary: Hash-Aggr: Avoid recreating common Hash-Table setups for 
every partition
 Key: DRILL-6400
 URL: https://issues.apache.org/jira/browse/DRILL-6400
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Relational Operators
Affects Versions: 1.13.0
Reporter: Boaz Ben-Zvi
Assignee: Boaz Ben-Zvi
 Fix For: 1.14.0


The current Hash-Aggr code (and soon the Hash-Join code) creates multiple 
partitions to hold the incoming data, each partition with its own HashTable.

The current code invokes the HashTable method _createAndSetupHashTable()_ 
for *each* partition, but most of the setup done by this method is identical 
for all the partitions (e.g., code generation). Calling this method has a 
performance cost (some local tests measured between 3 and 30 milliseconds, 
depending on the key columns).

Suggested performance improvement: extract the common setup so that it is 
performed *once*, and have all the partitions reuse the results. When running 
with the default 32 partitions, this can yield a measurable improvement (and 
when spilling, this method is called again).
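
The idea can be illustrated with a minimal sketch. This is not Drill's code: every name below (CommonSetup, build_common_setup, PartitionHashTable, create_partitions) is hypothetical, and the "expensive" step is only a stand-in for work such as code generation. It shows the partition-independent setup being built once and then shared by all 32 partitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonSetup:
    """Partition-independent state (hypothetical stand-in for generated code etc.)."""
    key_columns: tuple


class SetupCounter:
    """Counts how many times the expensive common setup has run."""
    calls = 0


def build_common_setup(key_columns):
    # Stand-in for the expensive, partition-independent work.
    SetupCounter.calls += 1
    return CommonSetup(key_columns=tuple(key_columns))


class PartitionHashTable:
    """Per-partition table that reuses a shared CommonSetup."""

    def __init__(self, setup):
        self.setup = setup   # shared and read-only
        self.entries = {}    # partition-local state: key -> row count

    def put(self, row):
        key = tuple(row[c] for c in self.setup.key_columns)
        self.entries[key] = self.entries.get(key, 0) + 1


def create_partitions(key_columns, num_partitions=32):
    # The common setup is built once, then handed to every partition.
    setup = build_common_setup(key_columns)
    return [PartitionHashTable(setup) for _ in range(num_partitions)]
```

With this shape, the cost of the common setup is paid once per operator rather than once per partition, and only the partition-local state is created 32 times.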

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6399) Use RowSets In MiniPlanUnitTestBase To Generate Test Data

2018-05-09 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-6399:
-

 Summary: Use RowSets In MiniPlanUnitTestBase To Generate Test Data
 Key: DRILL-6399
 URL: https://issues.apache.org/jira/browse/DRILL-6399
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Timothy Farkas
Assignee: Timothy Farkas








Re: [IMPORTANT] Gitbox enabled

2018-05-09 Thread Vlad Rozov

See [1] and [2]

Thank you,

Vlad

[1] 
https://help.github.com/articles/providing-your-2fa-authentication-code/#when-youll-be-asked-for-a-personal-access-token-as-a-password
[2] 
https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/








Re: [IMPORTANT] Gitbox enabled

2018-05-09 Thread Boaz Ben-Zvi
 Note to *committers*, in case you get the same error:

After successfully enabling the required Two-Factor Authentication (2FA), my “git 
push” started failing like this:

~/drill > git push origin
Username for 'https://github.com': ben-zvi
Password for 'https://ben-...@github.com':
remote: Invalid username or password.
fatal: Authentication failed for 'https://github.com/Ben-Zvi/drill.git/'

The solution: enter a personal access token instead of the GitHub 
password. 
To generate a personal access token, go to https://github.com/settings/tokens 
The token is a long hash code; just copy and paste it as the password.

   Thanks,

   Boaz


On 5/3/18, 11:36 AM, "Parth Chandra"  wrote:

Note to all the *committers* -

Gitbox integration has been enabled. This means you can merge in a PR
directly from GitHub (i.e. the apache/drill repository on GitHub is now
the master repository and is writable; it is no longer a mirror).

This also means that the original git-wip repository will not be available,
and pushing to this repository will not achieve anything useful.

[IMPORTANT] Please visit https://gitbox.apache.org/setup/ to set up 2FA if
you'd like to use GitHub as a remote g...@github.com:apache/drill.git

You can also use GitBox as a remote

https://gitbox.apache.org/repos/asf/drill.git

Same thing for drill-site g...@github.com:apache/drill-site.git or

https://gitbox.apache.org/repos/asf/drill-site.git


[IMPORTANT] - The GitHub UI currently enables the option to "Create a merge
commit". Please *do not* use this option. Click on the drop-down and choose
the "rebase and merge" or "squash and merge" option.

@vrozov is the expert on this, so if you run into difficulties please
include him in the communication. (Better still just post on the list).

Thanks

Parth




[jira] [Created] (DRILL-6398) Combine RowSetTestUtils with RowSetUtilities

2018-05-09 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-6398:
-

 Summary: Combine RowSetTestUtils with RowSetUtilities
 Key: DRILL-6398
 URL: https://issues.apache.org/jira/browse/DRILL-6398
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Timothy Farkas
Assignee: Timothy Farkas


There are two classes with RowSet utils; there should be just one.





Apache Drill board report (draft) for May 2018

2018-05-09 Thread Aman Sinha
Hi Drill Devs,
The Apache board report for Drill for this quarter is due soon.
Here's a draft. If you have any comments, let me know. I plan to submit it
by tomorrow morning.

Thanks,
Aman

===
## Description:
 - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud
Storage

## Issues:
 - There are no issues requiring board attention at this time

## Activity:
 - Since the last board report, Drill has released version 1.13.0. The
   following is a partial list of new features/enhancements that were added,
   in addition to many bug fixes:
   - JDK 8 support.
   - Upgrade to Apache Calcite version 1.15.
   - JDBC Statement.setQueryTimeout(int) support.
   - Batch sizing improvements.
   - Support for SPNEGO to extend Kerberos to Web applications through
     HTTP.
   - Ability to run Drill under YARN.
   - Parquet filter pushdown improvements and related performance
     improvements.
   - Hive client for Drill is updated to version 2.3.2.
   - Ability to automatically manage memory allocations during Drill
     startup.
   - Support for SQL syntax highlighting of queries, auto-complete in
     SQL editors, and snippets.
   - Improved performance of the Single Merge Exchange operator.
   - Like operator optimization.
   - User/Distribution-specific configuration checks during startup.

## Health report:
 - The project is quite healthy. Development activity, as reflected in the
   pull requests and JIRAs, is good. Activity on the dev and user mailing
   lists continues to be strong. Three new committers were added in the
   last period.

## PMC changes:

 - Currently 19 PMC members.
 - No new PMC members added in the last 3 months
 - Last PMC addition was Paul Rogers on Mon Jan 29 2018

## Committer base changes:

 - Currently 43 committers.
 - New committers:
   - Kunal Khatua was added as a committer on Tue Feb 27 2018
   - Vova Vysotskyi was added as a committer on Thu Mar 15 2018
   - Sorabh Hamirwasia was added as a committer on Fri Apr 27 2018

## Releases:

 - 1.13.0 was released on Sun Mar 18 2018

## Mailing list activity:

 - dev@drill.apache.org:
   - 437 subscribers (down 9 in the last 3 months)
   - 2582 emails sent to list (2244 in previous quarter)

 - iss...@drill.apache.org:
   - 19 subscribers (no change in the last 3 months)
   - 3652 emails sent to list (3088 in previous quarter)

 - u...@drill.apache.org:
   - 605 subscribers (down 8 in the last 3 months)
   - 356 emails sent to list (181 in previous quarter)


[jira] [Created] (DRILL-6397) OperatorTestBuilder, should leverage RowSets for comparing baseline values.

2018-05-09 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-6397:
-

 Summary: OperatorTestBuilder, should leverage RowSets for 
comparing baseline values.
 Key: DRILL-6397
 URL: https://issues.apache.org/jira/browse/DRILL-6397
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Timothy Farkas








[jira] [Created] (DRILL-6396) Remove unused getTempDir Method in BaseFixture

2018-05-09 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-6396:
-

 Summary: Remove unused getTempDir Method in BaseFixture
 Key: DRILL-6396
 URL: https://issues.apache.org/jira/browse/DRILL-6396
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Timothy Farkas
Assignee: Timothy Farkas


This tempDirectory method is no longer used. The DirTestWatcher and 
BaseDirTestWatcher classes are used instead for testing.





[jira] [Created] (DRILL-6395) Value Window Function - LEAD and LAG on VarChar result in "No applicable constructor/method found" error

2018-05-09 Thread Raymond Wong (JIRA)
Raymond Wong created DRILL-6395:
---

 Summary: Value Window Function - LEAD and LAG on VarChar result in 
 "No applicable constructor/method found" error
 Key: DRILL-6395
 URL: https://issues.apache.org/jira/browse/DRILL-6395
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.13.0
 Environment: windows 10, apache drill 1.13.0, 32GB Ram
Reporter: Raymond Wong


{code:sql}
SELECT
    col2,
    LEAD(col1, 1) OVER (ORDER BY col2) AS nxtCol1
FROM (
    SELECT 'A' AS col1, 1 AS col2
    UNION
    SELECT 'B' AS col1, 2 AS col2
    UNION
    SELECT 'C' AS col1, 3 AS col2
) AS A;
{code}
This query causes the following error:
{code:java}
SQL Error: SYSTEM ERROR: CompileException: Line 37, Column 40: 
No applicable constructor/method found for actual parameters "int, int, int, 
io.netty.buffer.DrillBuf"; 
candidates are: 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, 
org.apache.drill.exec.expr.holders.VarCharHolder)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, 
org.apache.drill.exec.expr.holders.NullableVarCharHolder)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, byte[], 
int, int)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, 
java.nio.ByteBuffer, int, int)", 
"public void 
org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, int, 
int, int, io.netty.buffer.DrillBuf)"

{code}





[jira] [Created] (DRILL-6394) Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object '/tmp/file.dat' not found within 'dfs'

2018-05-09 Thread Hari Sekhon (JIRA)
Hari Sekhon created DRILL-6394:
--

 Summary: Error: VALIDATION ERROR: From line 1, column 15 to line 
1, column 17: Object '/tmp/file.dat' not found within 'dfs'
 Key: DRILL-6394
 URL: https://issues.apache.org/jira/browse/DRILL-6394
 Project: Apache Drill
  Issue Type: Improvement
  Components:  Server, Execution - Codegen, Metadata, Query Planning & 
 Optimization, SQL Parser, Storage - Text & CSV
Affects Versions: 1.13.0
 Environment: MapR 6.0
Reporter: Hari Sekhon


Improvement request: make the following error more specific, so that it 
mentions that the cause is the file extension (.dat) not being one of the 
expected ones, even though the file was a CSV file (renaming it to .csv worked):
{code:java}
0: jdbc:drill:drillbit=> select * from dfs.`/tmp/file.dat`;
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object 
'/tmp/file.dat' not found within 'dfs'

SQL Query null

[Error Id: e7c2863e-0feb-4b80-82b7-008056f0fcef on :31010] (state=,code=0)

{code}
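
As a rough illustration of the requested behaviour (not Drill's implementation), the sketch below builds a more specific message when the file exists but its extension is not recognised. The function name describe_missing_object, the file_exists flag, and the KNOWN_EXTENSIONS set are all hypothetical; in particular, the extension list is illustrative and is not taken from Drill's format configuration.

```python
import os

# Illustrative only: not Drill's actual list of configured format extensions.
KNOWN_EXTENSIONS = {"csv", "tsv", "json", "parquet"}


def describe_missing_object(path, workspace, file_exists,
                            known_extensions=KNOWN_EXTENSIONS):
    """Return an error message, adding a hint when the extension is the cause."""
    base = f"Object '{path}' not found within '{workspace}'"
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if file_exists and ext not in known_extensions:
        shown = f".{ext}" if ext else "(none)"
        expected = ", ".join(f".{e}" for e in sorted(known_extensions))
        return (f"{base}: the file exists, but extension {shown} "
                f"is not one of the expected ones ({expected})")
    # File is genuinely missing, or the extension is recognised: keep the plain message.
    return base
```

The hint is only added when the file is present, so a genuinely missing file still produces the original message unchanged.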


