[jira] [Created] (DRILL-3990) Create a sys.fragments table

2015-10-28 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3990: - Summary: Create a sys.fragments table Key: DRILL-3990 URL: https://issues.apache.org/jira/browse/DRILL-3990 Project: Apache Drill Issue Type: Improvement

[jira] [Created] (DRILL-3989) Create a sys.queries table

2015-10-28 Thread Jacques Nadeau (JIRA)
Jacques Nadeau created DRILL-3989: - Summary: Create a sys.queries table Key: DRILL-3989 URL: https://issues.apache.org/jira/browse/DRILL-3989 Project: Apache Drill Issue Type: Bug C

Easy tasks for new developers: Check out the "newbie" label

2015-10-28 Thread Jacques Nadeau
Hey Everybody, I've started identifying tasks that a newbie could pick up. These are tasks that should be a good place to start working with Drill and also very rewarding. You can look at some by going here: https://issues.apache.org/jira/issues/?jql=project%20%3D%20DRILL%20AND%20labels%20%3D%20n

Maven/checkstyle changes

2015-10-28 Thread Daniel Barclay
Hey, what changed in the Maven setup regarding checkstyle? I used to be able to run "mvn validate" to run checkstyle to find style violations up front (e.g., before starting a long test run). However, that doesn't seem to work any more. It looks like checkstyle is run later, but it's not clear

Re: Drill Test Framework : Planning Tests

2015-10-28 Thread rahul challapalli
Jacques, If my understanding is correct, you are alluding to declarative-style expected values. Can you give an example? Below are my thoughts on implementing this: 1. Construct Physical Operator trees based on the expected-value JSON and actual Drill output JSON strings 2. For each P

Re: Maven/checkstyle changes

2015-10-28 Thread Julien Le Dem
Hi Daniel, I moved checkstyle to the verify phase that comes after the tests and before package: https://maven.apache.org/guides/introduction/introduction-to-the-lifecycle.html This is the default binding for checkstyle: https://maven.apache.org/plugins/maven-checkstyle-plugin/check-mojo.html This

Re: Maven/checkstyle changes

2015-10-28 Thread Julien Le Dem
Daniel, If you move the config for the checkstyle plugin out of the executions tag, you can run mvn checkstyle:check See: https://github.com/apache/drill/commit/eefbaa448ab15974d5e63cf37615ce20bb436ba2 It'll be part of: On Wed, Oct 28, 2015 at 10:37 AM, Julien Le Dem wrote: > Hi Daniel, > I move

Re: Maven/checkstyle changes

2015-10-28 Thread Julien Le Dem
... of: https://github.com/apache/drill/pull/221 On Wed, Oct 28, 2015 at 10:50 AM, Julien Le Dem wrote: > Daniel, > If you move the config for the checkstyle plugin out of the executions > tag, you can run mvn checkstyle:check > See: > > https://github.com/apache/drill/commit/eefbaa448ab15974d5e

Re: [DISCUSS] Ideas to improve metadata cache read performance

2015-10-28 Thread Parth Chandra
And ideally, I suppose, the merged schema would correspond to the information that we want to keep in a .drill file. On Tue, Oct 27, 2015 at 4:55 PM, Aman Sinha wrote: > @Steven, w.r.t. your suggestion about doing the metadata operation during > execution phase, see the related discussion in

[jira] [Created] (DRILL-3991) Support schema changes in hash join operator

2015-10-28 Thread amit hadke (JIRA)
amit hadke created DRILL-3991: - Summary: Support schema changes in hash join operator Key: DRILL-3991 URL: https://issues.apache.org/jira/browse/DRILL-3991 Project: Apache Drill Issue Type: Impro

Broader feedback on DRILL-3810

2015-10-28 Thread Jacques Nadeau
Hey Guys, DRILL-3810 is a patch adding schema to a format plugin. In order to do this, Kamesh has suggested a change to the FormatPlugin that basically has a secondary call called getDrillTable(Object selection) that is called after the FormatMatcher. However, it seems weird that there is a multi-

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jinfengni
Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43314473 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean contai

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jinfengni
Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43315255 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean contai

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread julianhyde
Github user julianhyde commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43316183 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean conta

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jinfengni
Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-151987787 Please modify the title of JIRA DRILL-3623, since the new pull request is using a completely different approach to address the performance issue for "LIMIT 0". --- If

[jira] [Created] (DRILL-3992) Unable to query Oracle

2015-10-28 Thread Eric Roma (JIRA)
Eric Roma created DRILL-3992: Summary: Unable to query Oracle Key: DRILL-3992 URL: https://issues.apache.org/jira/browse/DRILL-3992 Project: Apache Drill Issue Type: Bug Components: Qu

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jacques-n
Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152001143 What happened to the original strategy of short-circuiting on schema'd files? This approach still means we have to pay for all the operation compilations for no reason.

[GitHub] drill pull request: DRILL-3983: Small test improvements

2015-10-28 Thread julienledem
Github user julienledem commented on the pull request: https://github.com/apache/drill/pull/221#issuecomment-152001036 @adeneche Please see last commit. I made the output printing configurable so that it is less verbose in tests. https://github.com/apache/drill/commit/9b40f93122eb2205

Drill Test Framework - Data Generation does not work for Advanced/Tpch

2015-10-28 Thread Jacques Nadeau
When I try to run the extended test framework tests in the tpch folder (with -d): Advanced/tpch/tpch_sf100/parquet I get a message: Schema [dfs.drillTestDirTpch100Parquet] is not valid with respect to either root schema or current default schema. Where is this data? -- Jacques Nadeau CTO and

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jinfengni
Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152013964 The original approach (skipping the execution phase for limit 0 completely) could potentially have issues in some cases, due to the difference in Calcite rule

Re: Drill Test Framework - Data Generation does not work for Advanced/Tpch

2015-10-28 Thread Ramana I N
@Jacques, I doubt these were ever checked in. And I don't think they were using the S3 method to download either. Regards Ramana On Wed, Oct 28, 2015 at 2:47 PM, Jacques Nadeau wrote: > When I try to run the extended test framework tests in the tpch folder > (with -d): > > Advanced/tpch/tpch_

[GitHub] drill pull request: requireJavaVersion and requireMavenVersion ove...

2015-10-28 Thread mpouttuclarke
GitHub user mpouttuclarke opened a pull request: https://github.com/apache/drill/pull/223 requireJavaVersion and requireMavenVersion overly restrictive The current settings don't work with JDK 1.8.0_60 You can merge this pull request into a Git repository by running: $ git pull

Re: Drill Test Framework - Data Generation does not work for Advanced/Tpch

2015-10-28 Thread Abhishek Girish
Data is in an Amazon S3 bucket. I'll create links for all Advanced suites datasets and share them shortly. Will also add a note to README. On Wed, Oct 28, 2015 at 2:47 PM, Jacques Nadeau wrote: > When I try to run the extended test framework tests in the tpch folder > (with -d): > > Advanced/tpch

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jacques-n
Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152018974 Got it. Thanks for the explanation. So this is a hack until we can solve those issues. I think we have to do this work, however. A 1-2 second response to a limi

[GitHub] drill pull request: requireJavaVersion and requireMavenVersion ove...

2015-10-28 Thread hnfgns
Github user hnfgns commented on the pull request: https://github.com/apache/drill/pull/223#issuecomment-152023451 My understanding here is to restrict developers working on unsupported platforms. At least until the support is official. I doubt in this case if we do support JDK ≥ 1.8

setting planner max width

2015-10-28 Thread Hanifi GUNES
On a 10-node cluster, I am executing a query after running *alter session set `planner.width.max_per_node`=6;* and see 153 minor fragments reported in the profiles tab, whereas I would expect a max parallelization of 60 cluster-wide. Isn't this option bounding the max # of threads per query

Re: setting planner max width

2015-10-28 Thread Jacques Nadeau
Max width per node is per major fragment per node, not per query. So you should see no more than 60 minor fragments for any particular major fragment. Remember that in most cases, a multi-major-fragment query has blocking operations in it. -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, Oc
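As a rough illustration of the bound described in the reply above, the sketch below computes the per-major-fragment ceiling from the node count and `planner.width.max_per_node`, then flags any major fragment that exceeds it. The function names and the per-fragment counts are hypothetical, chosen only so that they total 153; they are not taken from an actual Drill profile.

```python
def max_minor_fragments_per_major(nodes: int, width_per_node: int) -> int:
    """Upper bound on minor fragments for a single major fragment."""
    return nodes * width_per_node


def fragments_over_bound(minor_counts: dict, nodes: int, width_per_node: int) -> dict:
    """Return the major fragments whose minor-fragment count exceeds the bound."""
    bound = max_minor_fragments_per_major(nodes, width_per_node)
    return {major: n for major, n in minor_counts.items() if n > bound}


# Hypothetical counts per major fragment (illustrative only, totalling 153).
profile = {"00": 1, "01": 60, "02": 92}

bound = max_minor_fragments_per_major(nodes=10, width_per_node=6)
offenders = fragments_over_bound(profile, nodes=10, width_per_node=6)
print(bound, sum(profile.values()), offenders)
```

Under this reading, a query total above 60 is not suspicious by itself, since the limit applies to each major fragment separately; only a single major fragment above 60 would point to a problem.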

Re: setting planner max width

2015-10-28 Thread Hanifi GUNES
Yup. That makes sense. All ~150 minor fragments seem to be under a single major fragment though. I should dig in further to see what's going on here. -H+ 2015-10-28 16:14 GMT-07:00 Jacques Nadeau : > Max width per node is per major fragment per node, not per query. > > So you should see no more than 6

[jira] [Created] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release

2015-10-28 Thread Sudheesh Katkam (JIRA)
Sudheesh Katkam created DRILL-3993: -- Summary: Rebase Drill on Calcite 1.5.0 release Key: DRILL-3993 URL: https://issues.apache.org/jira/browse/DRILL-3993 Project: Apache Drill Issue Type: Bu

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jinfengni
Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152034091 Sudheesh and I feel this new approach is more like a big optimization step towards solving the performance issue for "limit 0" queries, rather than a hack solution: 1) It

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152034356 Also, on the execution side, I was actually hitting [DRILL-2288](https://issues.apache.org/jira/browse/DRILL-2288), where sending exactly one batch with schema and

[GitHub] drill pull request: DRILL-3963 Add Sequence file support.

2015-10-28 Thread amithadke
Github user amithadke commented on the pull request: https://github.com/apache/drill/pull/214#issuecomment-152036324 @sudheeshkatkam I've added changes and tests for sequence file and Avro. They both use Hadoop's API to create a RecordReader. Thanks for helping out with the test. ---

Re: Drill 1.3 Timing: Let's start the vote next week

2015-10-28 Thread Parth Chandra
No one else seems to have an opinion :) Let's go with the monthly cycle with the vote going out at the beginning of the month starting with the 1.3 release. Let's also make sure we stick to this cycle for all future releases. Parth On Mon, Oct 26, 2015 at 7:06 PM, Jacques Nadeau wrote: >

[jira] [Created] (DRILL-3994) Build Fails on Windows after DRILL-3742

2015-10-28 Thread Sudheesh Katkam (JIRA)
Sudheesh Katkam created DRILL-3994: -- Summary: Build Fails on Windows after DRILL-3742 Key: DRILL-3994 URL: https://issues.apache.org/jira/browse/DRILL-3994 Project: Apache Drill Issue Type:

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread sudheeshkatkam
Github user sudheeshkatkam commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43339305 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean c

Issue 3149

2015-10-28 Thread Edmon Begoli
May I please escalate this issue for 1.3 or 1.4: https://issues.apache.org/jira/browse/DRILL-3149 I understand that Jim's fix was lost. Can the fix be recovered and slipped into 1.3? It is causing us to reformat a very large volume of files to check and remove these line terminators. Thank yo

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jacques-n
Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152047328 Interesting. Can you explain where the time is coming from? It isn't clear to me why this will have a big impact over what we had before. While you're pushing the limit

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread sudheeshkatkam
Github user sudheeshkatkam commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152053724 I think I see the source of confusion (sorry); this patch does not address that query in the JIRA, which is why Jinfeng asked me to change the title in one of his

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jacques-n
Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152056732 I'm sorry to say that I'm -1 on this change. It seems to be adding a planning rewrite rule where there should be a simple fix for an execution bug. Let's just fix the e

Re: Issue 3149

2015-10-28 Thread Jacques Nadeau
Jim's fix wasn't lost. It was in the context of a very different reader. That reader was deprecated because there were a number of other issues and performance problems with it. Those items were addressed in this reader. In terms of someone looking at this soon, I agree that this would be great. Can

Re: setting planner max width

2015-10-28 Thread Jacques Nadeau
Hmm... if that is the case it sounds like a bug. -- Jacques Nadeau CTO and Co-Founder, Dremio On Wed, Oct 28, 2015 at 4:18 PM, Hanifi GUNES wrote: > Yup. That makes sense. All ~150 minor fragments seem to be under a single major > fragment though. I should dig in further to see what's going on here.

[GitHub] drill pull request: DRILL-3623: Use shorter query path for LIMIT 0...

2015-10-28 Thread jacques-n
Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152059989 Just to add to my comment above, if you want to do a quick call or hangout to discuss I'm more than happy to. As I said above, it is possible I am misunderstanding. If

Re: Drill 1.3 Timing: Let's start the vote next week

2015-10-28 Thread Jacques Nadeau
Great, thanks for your flexibility! I think this will be very helpful to the people struggling with some of the JDBC issues right now. One other nice thing is that this will get Julien's changes out to users. It wasn't mentioned in the JIRA but his changes should make embedded mode startup 2-3 sec

[jira] [Created] (DRILL-3995) Scalar replacement bug with Common Subexpression Elimination

2015-10-28 Thread Steven Phillips (JIRA)
Steven Phillips created DRILL-3995: -- Summary: Scalar replacement bug with Common Subexpression Elimination Key: DRILL-3995 URL: https://issues.apache.org/jira/browse/DRILL-3995 Project: Apache Drill

[GitHub] drill pull request: DRILL-3912: Common subexpression elimination

2015-10-28 Thread StevenMPhillips
Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-152066327 @jinfengni I updated the PR. Could you take a look? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well