[jira] [Created] (DRILL-6400) Hash-Aggr: Avoid recreating common Hash-Table setups for every partition
Boaz Ben-Zvi created DRILL-6400:
---
Summary: Hash-Aggr: Avoid recreating common Hash-Table setups for every partition
Key: DRILL-6400
URL: https://issues.apache.org/jira/browse/DRILL-6400
Project: Apache Drill
Issue Type: Improvement
Components: Execution - Relational Operators
Affects Versions: 1.13.0
Reporter: Boaz Ben-Zvi
Assignee: Boaz Ben-Zvi
Fix For: 1.14.0

The current Hash-Aggr code (and soon the Hash-Join code) creates multiple partitions to hold the incoming data, each partition with its own HashTable. The current code invokes the HashTable method _createAndSetupHashTable()_ for *each* partition, but most of the setup done by this method is identical for all the partitions (e.g., code generation). Calling this method has a performance cost (some local tests measured between 3 and 30 milliseconds, depending on the key columns).

Suggested performance improvement: Extract the common settings so that they are computed *once*, and have all the partitions use the results later. When running with the default 32 partitions, this can yield a measurable improvement (and if spilling, this method is used again).

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
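The suggestion in DRILL-6400 can be illustrated with a minimal sketch. It assumes nothing about Drill's real classes: `CommonSetup`, `PartitionHashTable`, and `createPartitions` are hypothetical names, and the string concatenation merely stands in for the expensive, partition-independent work (such as code generation). The point is only that the shared setup is built once and handed to every partition.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration; these are NOT Drill's actual classes.
final class SharedHashTableSetupSketch {

    /** The partition-independent part of the setup (e.g., generated code). */
    static final class CommonSetup {
        static int timesBuilt = 0;      // counts how often the costly step runs
        final String generatedCode;

        CommonSetup(String keyColumns) {
            timesBuilt++;
            // Placeholder for the expensive step that is identical for all partitions.
            this.generatedCode = "compare(" + keyColumns + ")";
        }
    }

    /** The per-partition part: only cheap, partition-specific state lives here. */
    static final class PartitionHashTable {
        final int partitionId;
        final CommonSetup setup;

        PartitionHashTable(int partitionId, CommonSetup setup) {
            this.partitionId = partitionId;
            this.setup = setup;
        }
    }

    /** Builds the common setup once, then shares it across all partitions. */
    static List<PartitionHashTable> createPartitions(int numPartitions, String keyColumns) {
        CommonSetup common = new CommonSetup(keyColumns);
        List<PartitionHashTable> tables = new ArrayList<>(numPartitions);
        for (int i = 0; i < numPartitions; i++) {
            tables.add(new PartitionHashTable(i, common));
        }
        return tables;
    }

    public static void main(String[] args) {
        List<PartitionHashTable> tables = createPartitions(32, "a, b");
        System.out.println(tables.size() + " partitions, common setup built "
                + CommonSetup.timesBuilt + " time(s)");
    }
}
```

With per-partition setup, a cost of 3 to 30 milliseconds would be paid 32 times under the default partition count; in the sketch it is paid once regardless of the number of partitions.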
[jira] [Created] (DRILL-6399) Use RowSets In MiniPlanUnitTestBase To Generate Test Data
Timothy Farkas created DRILL-6399:
-
Summary: Use RowSets In MiniPlanUnitTestBase To Generate Test Data
Key: DRILL-6399
URL: https://issues.apache.org/jira/browse/DRILL-6399
Project: Apache Drill
Issue Type: Improvement
Reporter: Timothy Farkas
Assignee: Timothy Farkas
Re: [IMPORTANT] Gitbox enabled
See [1] and [2]

Thank you,
Vlad

[1] https://help.github.com/articles/providing-your-2fa-authentication-code/#when-youll-be-asked-for-a-personal-access-token-as-a-password
[2] https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/

On 5/9/18 16:12, Boaz Ben-Zvi wrote:
Note *committers*, in case you get the same error:

After successfully enabling the needed Two Factor Authentication (2FA), my “git push” started failing, like:

~/drill > git push origin
Username for 'https://github.com': ben-zvi
Password for 'https://ben-...@github.com':
remote: Invalid username or password.
fatal: Authentication failed for 'https://github.com/Ben-Zvi/drill.git/'

The solution: Enter a personal access token instead of the github password. To generate a personal access token, go to https://github.com/settings/tokens
The token is a long hash code; just copy and paste it as a password.

Thanks,
Boaz

On 5/3/18, 11:36 AM, "Parth Chandra" wrote:
Note to all the *committers* - Gitbox integration has been enabled. This means you can merge in a PR directly from Github (i.e., the apache/drill repository on github is now the master repository and is writable; it is no longer a mirror). This also means that the original git-wip repository will not be available, and pushing to this repository will not achieve anything useful.

[IMPORTANT] Please visit https://gitbox.apache.org/setup/ to set up 2FA if you'd like to use GitHub as a remote
g...@github.com:apache/drill.git
You can also use GitBox as a remote
https://gitbox.apache.org/repos/asf/drill.git
Same thing for drill-site
g...@github.com:apache/drill-site.git
or
https://gitbox.apache.org/repos/asf/drill-site.git

[IMPORTANT] - The github UI currently enables the option to "Create a merge commit". Please *do not* use this option. Click on the drop down and choose the "rebase and merge" or "squash and merge" option.

@vrozov is the expert on this, so if you run into difficulties please include him in the communication. (Better still, just post on the list.)

Thanks,
Parth
[jira] [Created] (DRILL-6398) Combine RowSetTestUtils with RowSetUtilities
Timothy Farkas created DRILL-6398:
-
Summary: Combine RowSetTestUtils with RowSetUtilities
Key: DRILL-6398
URL: https://issues.apache.org/jira/browse/DRILL-6398
Project: Apache Drill
Issue Type: Improvement
Reporter: Timothy Farkas
Assignee: Timothy Farkas

There are two classes with RowSet utils; there should be just one.
Apache Drill board report (draft) for May 2018
Hi Drill Devs,

The Apache board report for Drill for this quarter is due soon. Here's a draft. If you have any comments, let me know. I plan to submit by tomorrow morning.

Thanks,
Aman

===

## Description:
- Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud Storage

## Issues:
- There are no issues requiring board attention at this time

## Activity:
- Since the last board report, Drill has released version 1.13.0. The following is a partial list of new features/enhancements that were added in addition to many other bug fixes:
  - JDK 8 support.
  - Upgrade to Apache Calcite version 1.15.
  - JDBC Statement.setQueryTimeout(int) support.
  - Batch sizing improvements.
  - Support for SPNEGO to extend Kerberos to Web applications through HTTP.
  - Ability to run Drill under YARN.
  - Parquet filter pushdown improvements and related performance improvements.
  - Hive client for Drill is updated to version 2.3.2.
  - Ability to automatically manage memory allocations during Drill startup.
  - Support SQL syntax highlighting of queries, auto-complete support in SQL editors, and snippets.
  - Improved performance of the Single Merge Exchange operator.
  - Like operator optimization.
  - User/Distribution-specific configuration checks during startup.

## Health report:
- The project is quite healthy. Development activity as reflected in the pull requests and JIRAs is good. Activity on the dev and user mailing lists continues to be strong. Three new committers were added in the last period.

## PMC changes:
- Currently 19 PMC members.
- No new PMC members added in the last 3 months
- Last PMC addition was Paul Rogers on Mon Jan 29 2018

## Committer base changes:
- Currently 43 committers.
- New committers:
  - Kunal Khatua was added as a committer on Tue Feb 27 2018
  - Vova Vysotskyi was added as a committer on Thu Mar 15 2018
  - Sorabh Hamirwasia was added as a committer on Fri Apr 27 2018

## Releases:
- 1.13.0 was released on Sun Mar 18 2018

## Mailing list activity:
- dev@drill.apache.org:
  - 437 subscribers (down -9 in the last 3 months):
  - 2582 emails sent to list (2244 in previous quarter)
- iss...@drill.apache.org:
  - 19 subscribers (up 0 in the last 3 months):
  - 3652 emails sent to list (3088 in previous quarter)
- u...@drill.apache.org:
  - 605 subscribers (down -8 in the last 3 months):
  - 356 emails sent to list (181 in previous quarter)
[jira] [Created] (DRILL-6397) OperatorTestBuilder, should leverage RowSets for comparing baseline values.
Timothy Farkas created DRILL-6397:
-
Summary: OperatorTestBuilder, should leverage RowSets for comparing baseline values.
Key: DRILL-6397
URL: https://issues.apache.org/jira/browse/DRILL-6397
Project: Apache Drill
Issue Type: Improvement
Reporter: Timothy Farkas
[jira] [Created] (DRILL-6396) Remove unused getTempDir Method in BaseFixture
Timothy Farkas created DRILL-6396:
-
Summary: Remove unused getTempDir Method in BaseFixture
Key: DRILL-6396
URL: https://issues.apache.org/jira/browse/DRILL-6396
Project: Apache Drill
Issue Type: Improvement
Reporter: Timothy Farkas
Assignee: Timothy Farkas

This tempDirectory method is no longer used. The DirTestWatcher and BaseDirTestWatcher classes are used instead for testing.
[jira] [Created] (DRILL-6395) Value Window Function - LEAD and LAG on VarChar result in "No applicable constructor/method found" error
Raymond Wong created DRILL-6395:
---
Summary: Value Window Function - LEAD and LAG on VarChar result in "No applicable constructor/method found" error
Key: DRILL-6395
URL: https://issues.apache.org/jira/browse/DRILL-6395
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Affects Versions: 1.13.0
Environment: windows 10, apache drill 1.13.0, 32GB Ram
Reporter: Raymond Wong

{code:java}
SELECT col2, LEAD(col1, 1) OVER (ORDER BY col2) AS nxtCol1
FROM (
    SELECT 'A' AS col1, 1 AS col2
    UNION
    SELECT 'B' AS col1, 2 AS col2
    UNION
    SELECT 'C' AS col1, 3 AS col2
) AS A;
{code}

Causes error

{code:java}
SQL Error: SYSTEM ERROR: CompileException: Line 37, Column 40: No applicable constructor/method found for actual parameters "int, int, int, io.netty.buffer.DrillBuf"; candidates are:
"public void org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, org.apache.drill.exec.expr.holders.VarCharHolder)",
"public void org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, org.apache.drill.exec.expr.holders.NullableVarCharHolder)",
"public void org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, byte[], int, int)",
"public void org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, java.nio.ByteBuffer, int, int)",
"public void org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(int, int, int, int, io.netty.buffer.DrillBuf)"
{code}
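For reference, the result the query in DRILL-6395 is expected to produce under standard SQL window semantics can be sketched in plain Java. This is only an illustration of what LEAD with an offset of 1 means, not Drill code; `LeadSketch` and `lead` are hypothetical names, and the input is assumed to be already sorted by the ORDER BY key (col2) with a non-negative offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration of LEAD semantics only; not part of Drill.
final class LeadSketch {

    /**
     * Returns, for each row, the value found `offset` rows later in the
     * ordered input, or null when that row lies past the end (the SQL
     * behavior when no default value is supplied).
     */
    static List<String> lead(List<String> orderedValues, int offset) {
        List<String> out = new ArrayList<>(orderedValues.size());
        for (int i = 0; i < orderedValues.size(); i++) {
            int j = i + offset;
            out.add(j < orderedValues.size() ? orderedValues.get(j) : null);
        }
        return out;
    }

    public static void main(String[] args) {
        // col1 values ordered by col2 = 1, 2, 3
        List<String> col1 = Arrays.asList("A", "B", "C");
        System.out.println(lead(col1, 1)); // prints [B, C, null]
    }
}
```

So the rows with col2 = 1, 2, 3 should come back with nxtCol1 = 'B', 'C', and NULL respectively, rather than a CompileException.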
[jira] [Created] (DRILL-6394) Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object '/tmp/file.dat' not found within 'dfs'
Hari Sekhon created DRILL-6394:
--
Summary: Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object '/tmp/file.dat' not found within 'dfs'
Key: DRILL-6394
URL: https://issues.apache.org/jira/browse/DRILL-6394
Project: Apache Drill
Issue Type: Improvement
Components: Server, Execution - Codegen, Metadata, Query Planning Optimization, SQL Parser, Storage - Text CSV
Affects Versions: 1.13.0
Environment: MapR 6.0
Reporter: Hari Sekhon

Improvement request: make the following error more specific, so that it mentions that it is caused by the file extension (.dat) not being one of the expected ones, even though the file was a CSV file (renaming it to .csv worked):

{code:java}
0: jdbc:drill:drillbit=> select * from dfs.`/tmp/file.dat`;
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 17: Object '/tmp/file.dat' not found within 'dfs'

SQL Query null

[Error Id: e7c2863e-0feb-4b80-82b7-008056f0fcef on :31010] (state=,code=0)
{code}
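The kind of message requested in DRILL-6394 might look like the sketch below. Everything here is an assumption made for illustration: the class and method names are hypothetical, the set of known extensions is invented rather than read from any Drill configuration, and the wording of the hint is only one possibility. It does not reflect how Drill's validator is actually implemented.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; not Drill's validation code.
final class ExtensionErrorSketch {

    // Assumed list of configured extensions; a real workspace may differ.
    static final Set<String> KNOWN_EXTENSIONS =
            new LinkedHashSet<>(Arrays.asList("csv", "tsv", "json", "parquet"));

    /** Returns the lower-cased extension of the file name, or "" if there is none. */
    static String extensionOf(String path) {
        int dot = path.lastIndexOf('.');
        int slash = path.lastIndexOf('/');
        boolean inFileName = dot > slash && dot < path.length() - 1;
        return inFileName ? path.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
    }

    /** True when the path's extension matches one of the assumed formats. */
    static boolean hasConfiguredFormat(String path) {
        return KNOWN_EXTENSIONS.contains(extensionOf(path));
    }

    /** Builds the original message, plus a hint when the extension is the likely cause. */
    static String notFoundMessage(String path, String schema) {
        String base = "Object '" + path + "' not found within '" + schema + "'";
        if (hasConfiguredFormat(path)) {
            return base;
        }
        String ext = extensionOf(path);
        String shown = ext.isEmpty() ? "(none)" : "." + ext;
        return base + " (no configured format matches extension " + shown
                + "; known extensions: " + KNOWN_EXTENSIONS + ")";
    }

    public static void main(String[] args) {
        System.out.println(notFoundMessage("/tmp/file.dat", "dfs"));
    }
}
```

A hint of this sort would have pointed the reporter at the extension directly, instead of requiring the discovery that renaming the file to .csv makes the query work.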