Re: [jira] [Created] (DRILL-4259) Add new functional tests to ensure that failures can be detected independent of the testing environment

2016-01-11 Thread Chun Chang
Can we come up with a list of different configuration option settings we
want to run? The automation framework supports running with different
system settings now.

On Mon, Jan 11, 2016 at 9:43 AM, Jason Altekruse (JIRA) 
wrote:

> Jason Altekruse created DRILL-4259:
> --
>
>  Summary: Add new functional tests to ensure that failures can
> be detected independent of the testing environment
>  Key: DRILL-4259
>  URL: https://issues.apache.org/jira/browse/DRILL-4259
>  Project: Apache Drill
>   Issue Type: Test
> Reporter: Jason Altekruse
>
>
> In DRILL-4243 an out of memory issue was fixed after a change to the
> memory allocator made memory limits more strict. While the regression tests
> had been run by the team at Dremio prior to merging the patch, running the
> tests on a cluster with more cores changed the memory limits on the queries
> and caused several tests to fail.
>
> While changes of this magnitude are not going to be common, we should have
> a test suite that fails reliably, independent of the environment it is run in
> (assuming that there are sufficient resources for the tests to run).
>
> It would be good to at least try to reproduce this failure on a few
> different setups (cores, nodes in cluster) by adjusting available
> configuration options and adding tests with those different configurations
> so that the tests will fail in different environments.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
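
For reference, one way to exercise a query under several environment-like
configurations, as suggested above, is to sweep session options from a small JDBC
harness. The sketch below is hypothetical: it assumes drill-jdbc-all is on the
classpath, uses the bundled cp.`employee.json` sample only as a placeholder query,
and sweeps `planner.width.max_per_node` purely as an example option.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Hypothetical sketch: run the same query under several parallelization widths so a
// memory- or plan-sensitive regression is not hidden by the core count of the test
// machine. Connection string, option values, and query are placeholders.
public class WidthSweepTest {
  public static void main(String[] args) throws Exception {
    try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local")) {
      int[] widths = {1, 2, 4, 8};
      for (int width : widths) {
        try (Statement stmt = conn.createStatement()) {
          stmt.execute("ALTER SESSION SET `planner.width.max_per_node` = " + width);
          try (ResultSet rs = stmt.executeQuery(
              "SELECT count(*) FROM cp.`employee.json`")) {  // placeholder query
            rs.next();
            System.out.println("width=" + width + " rows=" + rs.getLong(1));
          }
        }
      }
    }
  }
}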


Re: Heads up on trivial project fix: plan baselines in regression suite need updating

2016-03-04 Thread Chun Chang
I think it depends on which cluster. I saw 8 failures on most of the
clusters, but 44 failures on one cluster. I guess all of them need to be
looked at and modified.

On Fri, Mar 4, 2016 at 10:35 AM, Parth Chandra  wrote:

> I'd be more comfortable if we merged this in after the release. Updating
> the test baselines will delay the release considerably - I would want the
> new baselines to be verified manually which is always time consuming.
> How many tests are affected?
>
>
>
> On Fri, Mar 4, 2016 at 10:03 AM, Jacques Nadeau 
> wrote:
>
> > Do you think we should back out? It seemed like this could likely cause
> > correctness issues although we may be safe with our name based
> resolution.
> > On Mar 4, 2016 9:56 AM, "Aman Sinha"  wrote:
> >
> > > @jacques, thanks for the heads-up, although it comes too close to the
> > > release date :).  I agree that the plan tests should be targeted to a
> > > narrow scope by specifying the sub-pattern it is supposed to test.
>  That
> > > said, it is a lot easier for the tester to capture the entire plan
> since
> > > he/she may miss an important detail if a sub-plan is captured, so this
> > > requires close interaction with the developer (which depending on
> various
> > > factors may take longer while the test needs to be checked-in).
> > > BTW, Calcite unit tests capture the entire plan.  I am not sure if a similar
> > > issue has been discussed on the Calcite dev list in the past.
> > >
> > > -Aman
> > >
> > > On Fri, Mar 4, 2016 at 4:19 AM, Jacques Nadeau 
> > wrote:
> > >
> > > > I just merged a simple fix that Laurent found for DRILL-4467.
> > > >
> > > > This fix ensures consistent column ordering when pushing a projection into a
> > > > scan; the previous behavior produced invalid plans and was causing excessive
> > > > operators and pushdown failures in some cases.
> > > >
> > > > However, this fix removes a number of trivial projects (that were
> > > > previously not detected as such) in a large set of queries. This
> means
> > > that
> > > > a number of plan baselines will need to be updated in the extended
> > > > regression suite to avoid consideration of the trivial project. This
> > > > underscores an issue I see in these tests. In virtually all cases
> I've
> > > > seen, the purpose of the test shouldn't care whether the trivial
> > project
> > > is
> > > > part of the plan. However, the baseline is over-reaching in its
> > > definition,
> > > > including a bunch of nodes irrelevant to the purpose of the test. One
> > > > example might be here:
> > > >
> > > >
> > > >
> > >
> >
> https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/filter/pushdown/plan/q23.res
> > > >
> > > > In this baseline, we're testing that the filter is pushed past the
> > > > aggregation. That means what we really need to be testing is a
> > multiline
> > > > plan pattern of
> > > >
> > > > HashAgg.*Filter.*Scan.*
> > > >
> > > > or better
> > > >
> > > > HashAgg.*Filter\(condition=\[=\(\$0, 10\)\]\).*Scan.*
> > > >
> > > > However, you can see that the actual expected result includes the
> > > > entire structure of the plan (but not the pushed down filter
> > > > condition). This causes the plan to fail now that DRILL-4467 is
> > > > merged. As part of the fixes to these plans, we should really make
> > > > sure that the scope of the baseline is only focused on the relevant
> > > > issue to avoid nominal changes from causing testing false positives.
> > > >
> > > >
> > > >
> > > > --
> > > > Jacques Nadeau
> > > > CTO and Co-Founder, Dremio
> > > >
> > >
> >
>
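
For reference, a narrowly scoped plan check like the one described above can be
written as a single DOTALL regular expression instead of a full-plan baseline. The
sketch below is a hypothetical, self-contained example; the plan text is a made-up
stand-in for EXPLAIN output, and the pattern is the one quoted above.

import java.util.regex.Pattern;

// Hypothetical sketch: assert only the property under test (the filter pushed below
// the aggregation) instead of baselining the entire plan text.
public class PlanPatternCheck {
  public static void main(String[] args) {
    String planText = String.join("\n",            // stand-in for EXPLAIN PLAN output
        "00-00 Screen",
        "00-01   Project(cnt=[$0])",
        "00-02     HashAgg(group=[{}], cnt=[COUNT()])",
        "00-03       Filter(condition=[=($0, 10)])",
        "00-04         Scan(groupscan=[ParquetGroupScan ...])");

    // DOTALL lets ".*" span line breaks, so the pattern pins only the node ordering
    // and the filter condition, not the rest of the plan.
    Pattern pushedPastAgg = Pattern.compile(
        "HashAgg.*Filter\\(condition=\\[=\\(\\$0, 10\\)\\]\\).*Scan.*", Pattern.DOTALL);

    if (!pushedPastAgg.matcher(planText).find()) {
      throw new AssertionError("filter was not pushed below the aggregation");
    }
    System.out.println("plan check passed");
  }
}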


Re: Time for the 1.6 Release

2016-03-04 Thread Chun Chang
Jacques submitted a PR fixing the failed baselines. I've merged it into
automation master and confirmed the previously failing tests are all passing now.
Thanks.

-Chun


On Thu, Mar 3, 2016 at 10:48 PM, Jacques Nadeau  wrote:

> I think we need to include DRILL-4467. It is a one-line patch, and the
> underlying issue produces unpredictable plans at a minimum but may also
> produce invalid results. Still need to think through the second half. I've
> seen this plan instability in some of my recent test runs (even without
> Java 8) when running extended HBase tests.
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Thu, Mar 3, 2016 at 10:02 PM, Parth Chandra  wrote:
>
> > Updated list  (I'll follow up with the folks named here separately) -
> >
> > Committed for 1.6 -
> >
> > DRILL-4384 - Query profile is missing important information on WebUi -
> > Merged
> > DRILL-3488/pr 388 (Java 1.8 support) - Merged.
> > DRILL-4410/pr 380 (listvector should initialize bits...) - Merged
> > DRILL-4383/pr 375 (Allow custom configs for S3, Kerberos, etc) - Merged
> > DRILL-4465/pr 401 (Simplify Calcite parsing & planning integration) -
> > Waiting to be merged
> > DRILL-4437 (and others)/pr 394 (Operator unit test framework). Waiting to
> > be merged.
> >
> > DRILL-4281/pr 400 (Drill should support inbound impersonation) (Jacques
> to
> > review)
> > DRILL-4372/pr 377(?) (Drill Operators and Functions should correctly
> expose
> > their types within Calcite.) - Waiting for Aman to review. (Owners:
> Hsuan,
> > Jinfeng, Aman, Sudheesh)
> > DRILL-4313/pr 396  (Improved client randomization. Update JIRA with
> > warnings about using the feature ) (Sudheesh to review.)
> > DRILL-4449/pr 389 (Wrong results when metadata cache is used..) (Aman to
> > review)
> > DRILL-4069/pr 352 Enable RPC thread offload by default (Owner: Sudheesh)
> >
> > Need review -
> > DRILL-4375/pr 402 (Fix the maven release profile)
> > DRILL-4452/pr 395 (Update Avatica Driver to latest Calcite)
> > DRILL-4332/pr 389 (Make vector comparison order stable in test framework)
> > DRILL-4411/pr 381 (hash join over-memory condition)
> > DRILL-4387/pr 379 (GroupScan should not use star column)
> > DRILL-4184/pr 372 (support variable length decimal fields in parquet)
> > DRILL-4120 - dir0 does not work when the directory structure contains
> Avro
> > files - Partial patch available.
> > DRILL-4203/pr 341 (fix dates written into parquet files to conform to
> > parquet format spec)
> >
> > Not included (yet) -
> > DRILL-3149 - No patch available
> > DRILL-4441 - IN operator does not work with Avro reader - No patch
> > available
> > DRILL-3745/pr 399 - Hive char support - New feature - Needs QA - Not
> > included in 1.6
> > DRILL-3623 - Limit 0 should avoid execution when querying a known schema.
> > (Need to add limitations of current impl). Intrusive change; should be
> > included at beginning of release cycle.
> > DRILL-4416/pr 385 (quote path separator) (Owner: Hanifi) - Causes leak.
> >
> > Others -
> > DRILL-2517   - Already resolved.
> > DRILL-3688/pr 382 (skip.header.line.count in hive). - Already merged. PR
> > needs to be closed.
> >
> >
> > On Thu, Mar 3, 2016 at 9:44 PM, Parth Chandra  wrote:
> >
> > > Right. My mistake. Thanks, Jacques, for reviewing.
> > >
> > > On Thu, Mar 3, 2016 at 9:08 PM, Zelaine Fong 
> wrote:
> > >
> > >> DRILL-4281/pr 400 (Drill should support inbound impersonation)
> (Sudheesh
> > >> to
> > >> review)
> > >>
> > >> Sudheesh is the fixer of DRILL-4281, so I don't think he can be the
> > >> reviewer :).
> > >>
> > >> -- Zelaine
> > >>
> > >> On Thu, Mar 3, 2016 at 6:30 PM, Parth Chandra 
> > wrote:
> > >>
> > >> > Here's an updated list with names of reviewers added. If anyone else
> > is
> > >> > reviewing the open PRs please let me know. Some PRs have owners
> names
> > >> that
> > >> > I will follow up with.
> > >> > Jason, I've included your JIRA in the list.
> > >> >
> > >> >
> > >> > Committed for 1.6 -
> > >> >
> > >> > DRILL-4384 - Query profile is missing important information on
> WebUi -
> > >> > Merged
> > >> > DRILL-3488/pr 388 (Java 1.8 support) - Merged.
> > >> > DRILL-4410/pr 380 (listvector should initialize bits...) - Merged
> > >> > DRILL-4383/pr 375 (Allow custom configs for S3, Kerberos, etc) -
> > Merged
> > >> > DRILL-4465/pr 401 (Simplify Calcite parsing & planning integration)
> -
> > >> > Waiting to be merged
> > >> >
> > >> > DRILL-4281/pr 400 (Drill should support inbound impersonation)
> > >> (Sudheesh to
> > >> > review)
> > >> > DRILL-4372/pr 377(?) (Drill Operators and Functions should correctly
> > >> expose
> > >> > their types within Calcite.) - Waiting for Aman to review. (Owners:
> > >> Hsuan,
> > >> > Jinfeng, Aman, Sudheesh)
> > >> > DRILL-4313/pr 396  (Improved client randomization. Update JIRA with
> > >> > warnings about using the feature ) (Sudheesh to review.)
> > >> > DRILL-4437 (and others)/pr 394 (Operator unit test framework).
> (P

Re: [VOTE] Release Apache Drill 1.6.0 - rc0

2016-03-14 Thread Chun Chang
+1 (non-binding)

-ran functional and advanced automation

On Mon, Mar 14, 2016 at 1:09 PM, Sudheesh Katkam 
wrote:

> +1 (non-binding)
>
> * downloaded and built from source tar-ball; ran unit tests successfully
> on Ubuntu
> * ran simple queries (including cancellations) in embedded mode on Mac;
> verified states in web UI
> * ran simple queries (including cancellations) on a 3 node cluster;
> verified states in web UI
>
> * tested maven artifacts (drill-jdbc) using a sample application <
> https://github.com/sudheeshkatkam/drill-example>.
> This application is based on DrillClient, and not JDBC API. I had to make
> two changes for this application to work (i.e. not backward compatible).
> However, these changes are not related to this release (commits
> responsible: 1fde9bb <
> https://github.com/apache/drill/commit/1fde9bb1505f04e0b0a1afb542a1aa5dfd20ed1b>
> and de00881 <
> https://github.com/apache/drill/commit/de008810c815e46e6f6e5d13ad0b9a23e705b13a>).
> We should have a conversation about what constitutes public API and changes
> to this API on a separate thread.
>
> Thank you,
> Sudheesh
>
> > On Mar 14, 2016, at 12:04 PM, Abhishek Girish 
> wrote:
> >
> > +1 (non-binding)
> >
> > - Tested Drill in distributed mode (built with MapR profile).
> > - Ran functional tests from Drill-Test-Framework [1]
> > - Tested Web UI (basic sanity)
> > - Tested Sqlline
> >
> > Looks good.
> >
> >
> > [1] https://github.com/mapr/drill-test-framework
> >
> > On Mon, Mar 14, 2016 at 11:23 AM, Venki Korukanti <
> venki.koruka...@gmail.com
> >> wrote:
> >
> >> +1
> >>
> >> Installed tar.gz on a 3 node cluster.
> >> Ran queries on data located in HDFS
> >> Enabled auth in WebUI, ran few queries and, verified auth and querying
> >> works fine
> >> Logged bugs for 2 minor issues/improvements (DRILL-4508
> >>  & DRILL-4509
> >> )
> >>
> >> Thanks
> >> Venki
> >>
> >> On Mon, Mar 14, 2016 at 10:56 AM, Norris Lee  wrote:
> >>
> >>> +1 (Non-binding)
> >>>
> >>> Build from source on CentOS. Tested the ODBC driver with queries
> against
> >>> hive and DFS (json, parquet, tsv, csv, directories).
> >>>
> >>> Norris
> >>>
> >>> -Original Message-
> >>> From: Hsuan Yi Chu [mailto:hyi...@maprtech.com]
> >>> Sent: Monday, March 14, 2016 10:42 AM
> >>> To: dev@drill.apache.org; adityakish...@gmail.com
> >>> Subject: Re: [VOTE] Release Apache Drill 1.6.0 - rc0
> >>>
> >>> +1
> >>> mvn clean install on linux vm; Tried some queries; Looks good.
> >>>
> >>> On Mon, Mar 14, 2016 at 9:58 AM, Aditya 
> wrote:
> >>>
>  While I did verify the signature and structure of the maven artifacts,
>  I think Jacques was referring to verify the functionality, which I
> have
> >>> not.
> 
>  On Mon, Mar 14, 2016 at 8:12 AM, Parth Chandra 
> >>> wrote:
> 
> > Aditya has verified the maven artifacts. Would it make sense to
> > extend
>  the
> > vote by another day to let more people verify the release?
> >
> >
> >
> > On Mon, Mar 14, 2016 at 7:08 AM, Jacques Nadeau 
> > wrote:
> >
> >> I haven't had a chance to validate yet.  Has anyone checked the
> >> maven artifacts yet?
> >> On Mar 14, 2016 6:37 AM, "Aditya"  wrote:
> >>
> >>> +1 (binding).
> >>>
> >>> * Verified checksum and signature of all release artifacts in[1]
> >>> and
> >> maven
> >>> artifacts in [2] and the artifacts are signed using Parth's
> >>> public
>  key
> >> (ID
> >>> 9BAA73B0).
> >>> * Verified that build and tests pass using the source artifact.
> >>> * Verified that Drill can be launched in embedded mode using the
> >>> convenience binary release.
> >>> * Ran sample queries using classpath storage plugin.
> >>>
> >>> p.s. Have enhanced the release verification script [3] to allow
> > automatic
> >>> download and verification of release artifacts through the pull
>  request
> >>> 249[4]. Will merge if someone can review it.
> >>>
> >>> [1] http://home.apache.org/~parthc/drill/releases/1.6.0/rc0/
> >>> [2]
> >> https://repository.apache.org/content/repositories/orgapachedrill-
> >> 1030
> >>> [3]
> > https://github.com/apache/drill/blob/master/tools/verify_release.sh
> >>> [4] https://github.com/apache/drill/pull/249
> >>>
> >>> On Mon, Mar 14, 2016 at 12:51 AM, Abdel Hakim Deneche <
> >>> adene...@maprtech.com
>  wrote:
> >>>
>  +1
> 
>  built from source with mapr profile and deployed on 2 nodes,
>  then
>  run
>  window functions from Drill's test framework. Also took a
>  quick
>  look
> > at
> >>> the
>  WebUI. Everything looks fine
> 
>  On Sun, Mar 13, 2016 at 5:53 PM, Parth Chandra
>  
> >>> wrote:
> 
> > Added GPG key
> >
> > On S

Drill automation framework update

2016-03-23 Thread Chun Chang
Hi drillers,

MapR recently made changes to the automation framework* to make it easier
to run against an HDFS cluster. Please refer to the updated README file for
details. Let us know if you encounter any issues.

Thanks,
-Chun

*https://github.com/mapr/drill-test-framework


Re: Drill automation framework update

2016-03-23 Thread Chun Chang
You are correct Jacques. No change made to plan baselines yet.

On Wed, Mar 23, 2016 at 5:10 PM, Jacques Nadeau  wrote:

> Can you confirm that you've successfully executed the tests on Apache HDFS
> 2.7.1? I note that you have modified the plans to remove the maprfs prefix
> however you have kept the individual file names. I believe the ordering of
> these files is not the same in HDFS versus MapRFS and thus tests will fail.
> Can you confirm or dispute that issue?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Wed, Mar 23, 2016 at 4:19 PM, Chun Chang  wrote:
>
> > Hi drillers,
> >
> > MapR recently made changes to the automation framework* to make it easier
> > running against HDFS cluster. Please refer to the updated README file for
> > detail. Let us know if you encounter any issues.
> >
> > Thanks,
> > -Chun
> >
> > *https://github.com/mapr/drill-test-framework
> >
>


Re: Drill automation framework update

2016-03-23 Thread Chun Chang
Jacques,

Would it be possible for you guys (or anyone in the community) to execute
the tests on Apache HDFS and report back the list of failed tests? I may
spend time to fix them or look for community contributions.

Thanks,
-Chun

On Wed, Mar 23, 2016 at 5:20 PM, Chun Chang  wrote:

> You are correct Jacques. No change made to plan baselines yet.
>
> On Wed, Mar 23, 2016 at 5:10 PM, Jacques Nadeau 
> wrote:
>
>> Can you confirm that you've successfully executed the tests on Apache HDFS
>> 2.7.1? I note that you have modified the plans to remove the maprfs prefix
>> however you have kept the individual file names. I believe the ordering of
>> these files is not the same in HDFS versus MapRFS and thus tests will
>> fail.
>> Can you confirm or dispute that issue?
>>
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>>
>> On Wed, Mar 23, 2016 at 4:19 PM, Chun Chang  wrote:
>>
>> > Hi drillers,
>> >
>> > MapR recently made changes to the automation framework* to make it
>> easier
>> > running against HDFS cluster. Please refer to the updated README file
>> for
>> > detail. Let us know if you encounter any issues.
>> >
>> > Thanks,
>> > -Chun
>> >
>> > *https://github.com/mapr/drill-test-framework
>> >
>>
>
>


Re: [VOTE] Release Apache Drill 1.8.0 - rc1

2016-08-29 Thread Chun Chang
There are a few known random failures in our regression run for which we don't
yet know the real cause. Other than that, looks good.

+0

On Mon, Aug 29, 2016 at 1:23 AM, Gautam Parai  wrote:

> +1
> Built from src, ran all unit tests on the linux VM.
> Checked signature and checksums.
> Ran tests for a few bugs which were fixed in the release.
> LGTM.
>
> On Sat, Aug 27, 2016 at 10:00 AM, Dechang Gu  wrote:
>
> > +1
> > Build from src, ran tpch queries and a bunch of other
> > queries.
> > LGTM.
> >
> > On Fri, Aug 26, 2016 at 9:58 PM, Parth Chandra 
> wrote:
> >
> > > +1 (binding)
> > > Built from src, ran all unit tests on the mac (except jdbc which didn't
> > run
> > > because of some weird stuff on my machine)
> > > Checked signature and checksums
> > > Built C++ client
> > > Ran a few hundred small queries and queries with cancel.
> > > Checked state of drillbit.
> > > Everything looks good.
> > >
> > >
> > >
> > > On Fri, Aug 26, 2016 at 11:35 AM, Padma Penumarthy <
> > > ppenumar...@maprtech.com
> > > > wrote:
> > >
> > > > +1
> > > >
> > > > Installed on my cluster. Ran few queries.
> > > > Tried on mac in embedded mode. Ran few queries.
> > > > Tried json query which had regression earlier.
> > > >
> > > > Thanks,
> > > > Padma
> > > >
> > > >
> > > > > On Aug 26, 2016, at 8:58 AM, Aman Sinha 
> > wrote:
> > > > >
> > > > > +1  (binding)
> > > > > Downloaded source and binary tarballs.
> > > > > Verified git.properties and README.
> > > > > On Linux VM: Built from source and ran unit tests.
> > > > > On Mac: Installed binary release and ran in embedded mode.
> > > > > Ran a few queries that exercise new features and verified that the
> > > > runtime
> > > > > profiles and Explain look good in the Web UI.
> > > > >
> > > > > LGTM.
> > > > >
> > > > > On Thu, Aug 25, 2016 at 3:16 PM, Jinfeng Ni 
> wrote:
> > > > >
> > > > >> Hello all,
> > > > >>
> > > > >> I'd like to propose the second release candidate (rc1) of Apache
> > > > >> Drill, version 1.8.0. It covers a total of 47 resolved JIRAs [1].
> > > > >> Fixes for issues found in rc0 is also included in rc1. Thanks to
> > > > >> everyone who contributed to this release.
> > > > >>
> > > > >> The tarball artifacts are hosted at [2] and the maven artifacts
> are
> > > > hosted
> > > > >> at [3].
> > > > >>
> > > > >> This release candidate is based on commit
> > > > >> 80c4d0290b3f6aafbbd70777d6f29be9a0e767e3 located at [4].
> > > > >>
> > > > >> The vote will be open for the next ~96 hours (including an extra
> day
> > > as
> > > > the
> > > > >> vote is happening over a weekend) ending at 5:00PM Pacific, August
> > > 29th,
> > > > >> 2016.
> > > > >>
> > > > >> [ ] +1
> > > > >> [ ] +0
> > > > >> [ ] -1
> > > > >>
> > > > >> Here's my vote: +1
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> Jinfeng
> > > > >>
> > > > >> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?
> > > > >> version=12334768&styleName=&projectId=12313820
> > > > >> [2] http://home.apache.org/~jni/drill/1.8.0/rc1/
> > > > >> [3] https://repository.apache.org/content/repositories/
> > > > >> orgapachedrill-1034/
> > > > >> [4] https://github.com/jinfengni/incubator-drill/tree/1.8.0
> > > > >>
> > > >
> > > >
> > >
> >
>


Re: ZK lost connectivity issue on large cluster

2016-09-14 Thread Chun Chang
Looks like you are running 1.5. I believe some work has been done in that
area, and newer releases should behave better.

On Wed, Sep 14, 2016 at 11:43 AM, François Méthot 
wrote:

> Hi,
>
>   We are trying to find a solution/workaround to issue:
>
> 2016-01-28 16:36:14,367 [Curator-ServiceCache-0] ERROR
> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: ForemanException:
> One more more nodes lost connectivity during query.  Identified nodes
> were [atsqa4-133.qa.lab:31010].
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
> ForemanException: One more more nodes lost connectivity during query.
> Identified nodes were [atsqa4-133.qa.lab:31010].
> at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.
> close(Foreman.java:746)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.
> processEvent(Foreman.java:858)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.
> processEvent(Foreman.java:790)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman$StateSwitch.
> moveToState(Foreman.java:792)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.moveToState(
> Foreman.java:909)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.access$2700(
> Foreman.java:110)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman$StateListener.
> moveToState(Foreman.java:1183)
> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>
>
> DRILL-4325  
> ForemanException:
> One or more nodes lost connectivity during query
>
>
>
> Any one experienced this issue ?
>
> It happens when running query involving many parquet files on a cluster of
> 200 nodes. Same query on a smaller cluster of 12 nodes runs fine.
>
> It is not caused by garbage collection, (checked on both ZK node and the
> involved drill bit).
>
> Negotiated max session timeout is 40 seconds.
>
> The sequence seems:
> - Drill Query begins, using an existing ZK session.
> - Drill Zk session timeouts
>   - perhaps it was writing something that took too long
> - Drill attempts to renew session
>- drill believes that the write operation failed, so it attempts to
> re-create the zk node, which trigger another exception.
>
>  We are open to any suggestion. We will report any finding.
>
> Thanks
> Francois
>
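
For context on the timeout discussion above: Drill reaches ZooKeeper through
Curator, and the effective session timeout is the client-requested value capped by
the server's negotiated maximum (the 40 seconds mentioned above). The standalone
sketch below is hypothetical and only illustrates where those knobs live in a
Curator client; the connect string and values are placeholders, not Drill's actual
configuration, which is set through Drill's own config files.

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

// Hypothetical, standalone illustration of the ZooKeeper client settings involved in
// this failure mode; Drill sets these through its own configuration, not this code.
public class ZkSessionTimeoutExample {
  public static void main(String[] args) {
    CuratorFramework client = CuratorFrameworkFactory.builder()
        .connectString("zk1:2181,zk2:2181,zk3:2181")  // placeholder quorum
        .sessionTimeoutMs(40000)        // requested timeout; the ZK server's
                                        // maxSessionTimeout caps the negotiated value
        .connectionTimeoutMs(10000)
        .retryPolicy(new ExponentialBackoffRetry(1000, 3))
        .build();
    client.start();
    // If a drillbit cannot heartbeat within the negotiated session timeout (for
    // example because it is briefly overloaded), ZooKeeper expires the session, the
    // service cache drops the node, and the Foreman reports "nodes lost connectivity
    // during query" even though the process is still healthy.
    client.close();
  }
}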


Re: ZK lost connectivity issue on large cluster

2016-10-14 Thread Chun Chang
s WARNING in our log for the past month for pretty much every
> node
> >>>>> I
> >>>>>> checked.
> >>>>>> 2016-09-19 13:31:56,866 [BitServer-7] WARN
> >>>>>> o.a.d.exec.rpc.control.ControlServer - Message of mode REQUEST of
> >>>>> rpc type
> >>>>>> 6 took longer than 500 ms. Actual Duration was 16053ms.
> >>>>>> 2016-09-19 14:15:33,357 [BitServer-4] WARN
> >>>>>> o.a.d.exec.rpc.control.ControlClient - Message of mode RESPONSE of
> >>>>> rpc type
> >>>>>> 1 took longer than 500 ms. Actual Duration was 981ms.
> >>>>>>
> >>>>>> We really appreciate your help. I will dig in the source code for
> >>>>> when and
> >>>>>> why this error happen.
> >>>>>>
> >>>>>>
> >>>>>> Francois
> >>>>>>
> >>>>>> P.S.:
> >>>>>> We do see this also:
> >>>>>> 2016-09-19 14:48:23,444 [drill-executor-9] WARN
> >>>>>> o.a.d.exec.rpc.control.WorkEventBus - Fragment ..:1:2 not found
> >>>>> in the
> >>>>>> work bus.
> >>>>>> 2016-09-19 14:48:23,444 [drill-executor-11] WARN
> >>>>>> o.a.d.exec.rpc.control.WorkEventBus - Fragment :1:222 not found
> >>>>> in the
> >>>>>> work bus.
> >>>>>> 2016-09-19 14:48:23,444 [drill-executor-12] WARN
> >>>>>> o.a.d.exec.rpc.control.WorkEventBus - Fragment :1:442 not found
> >>>>> in the
> >>>>>> work bus.
> >>>>>> 2016-09-19 14:48:23,444 [drill-executor-10] WARN
> >>>>>> o.a.d.exec.rpc.control.WorkEventBus - Fragment :1:662 not found
> >>>>> in the
> >>>>>> work bus.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Sun, Sep 18, 2016 at 2:57 PM, Sudheesh Katkam <
> >>>>> skat...@maprtech.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Hi Francois,
> >>>>>>>
> >>>>>>> More questions..
> >>>>>>>
> >>>>>>>> + Can you share the query profile?
> >>>>>>>> I will sum it up:
> >>>>>>>> It is a select on 18 columns: 9 string, 9 integers.
> >>>>>>>> Scan is done on 13862 parquet files spread  on 1000 fragments.
> >>>>>>>> Fragments are spread accross 215 nodes.
> >>>>>>>
> >>>>>>> So ~5 leaf fragments (or scanners) per Drillbit seems fine.
> >>>>>>>
> >>>>>>> + Does the query involve any aggregations or filters? Or is this a
> >>>>> select
> >>>>>>> query with only projections?
> >>>>>>> + Any suspicious timings in the query profile?
> >>>>>>> + Any suspicious warning messages in the logs around the time of
> >>>>> failure
> >>>>>>> on any of the drillbits? Specially on atsqa4-133.qa.lab? Specially
> >>>>> this one
> >>>>>>> (“..” are place holders):
> >>>>>>> Message of mode .. of rpc type .. took longer than ..ms.  Actual
> >>>>>>> duration was ..ms.
> >>>>>>>
> >>>>>>> Thank you,
> >>>>>>> Sudheesh
> >>>>>>>
> >>>>>>>> On Sep 15, 2016, at 11:27 AM, François Méthot <
> fmetho...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> Hi Sudheesh,
> >>>>>>>>
> >>>>>>>> + How many zookeeper servers in the quorum?
> >>>>>>>> The quorum has 3 servers, everything looks healthy
> >>>>>>>>
> >>>>>>>> + What is the load on atsqa4-133.qa.lab when this happens? Any
> other
> >>>>>>>> applications running on that node? How many threads is the Drill
> >>>>> process
> >>>>>>>> using?
> >>>>>>>> The load on the failing node(8 cores) is 14, when Drill is
> running.
> >>>>> Which
> >>>>>>>> is nothing out of the ordinary according to our admin.
> >>>>>>>> HBase is also running.
> >>>>>>>> planner.width.max_per_node is set to 8
> >>>>>>>>
> >>>>>>>> + When running the same query on 12 nodes, is the data size same?
> >>>>>>>> Yes
> >>>>>>>>
> >>>>>>>> + Can you share the query profile?
> >>>>>>>> I will sum it up:
> >>>>>>>> It is a select on 18 columns: 9 string, 9 integers.
> >>>>>>>> Scan is done on 13862 parquet files spread  on 1000 fragments.
> >>>>>>>> Fragments are spread accross 215 nodes.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> We are in process of increasing our Zookeeper session timeout
> >>>>> config to
> >>>>>>> see
> >>>>>>>> if it helps.
> >>>>>>>>
> >>>>>>>> thanks
> >>>>>>>>
> >>>>>>>> Francois
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Sep 14, 2016 at 4:40 PM, Sudheesh Katkam <
> >>>>> skat...@maprtech.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Hi Francois,
> >>>>>>>>>
> >>>>>>>>> Few questions:
> >>>>>>>>> + How many zookeeper servers in the quorum?
> >>>>>>>>> + What is the load on atsqa4-133.qa.lab when this happens? Any
> >>>>> other
> >>>>>>>>> applications running on that node? How many threads is the Drill
> >>>>> process
> >>>>>>>>> using?
> >>>>>>>>> + When running the same query on 12 nodes, is the data size same?
> >>>>>>>>> + Can you share the query profile?
> >>>>>>>>>
> >>>>>>>>> This may not be the right thing to do, but for now, If the
> cluster
> >>>>> is
> >>>>>>>>> heavily loaded, increase the zk timeout.
> >>>>>>>>>
> >>>>>>>>> Thank you,
> >>>>>>>>> Sudheesh
> >>>>>>>>>
> >>>>>>>>>> On Sep 14, 2016, at 11:53 AM, François Méthot <
> >>>>> fmetho...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> We are running 1.7.
> >>>>>>>>>> The log were taken from the jira tickets.
> >>>>>>>>>>
> >>>>>>>>>> We will try out 1.8 soon.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Sep 14, 2016 at 2:52 PM, Chun Chang <
> cch...@maprtech.com>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Looks like you are running 1.5. I believe there are some work
> >>>>> done in
> >>>>>>>>> that
> >>>>>>>>>>> area and the newer release should behave better.
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Sep 14, 2016 at 11:43 AM, François Méthot <
> >>>>>>> fmetho...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi,
> >>>>>>>>>>>>
> >>>>>>>>>>>> We are trying to find a solution/workaround to issue:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2016-01-28 16:36:14,367 [Curator-ServiceCache-0] ERROR
> >>>>>>>>>>>> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR:
> >>>>>>> ForemanException:
> >>>>>>>>>>>> One more more nodes lost connectivity during query.
> Identified
> >>>>> nodes
> >>>>>>>>>>>> were [atsqa4-133.qa.lab:31010].
> >>>>>>>>>>>> org.apache.drill.common.exceptions.UserException: SYSTEM
> ERROR:
> >>>>>>>>>>>> ForemanException: One more more nodes lost connectivity during
> >>>>> query.
> >>>>>>>>>>>> Identified nodes were [atsqa4-133.qa.lab:31010].
> >>>>>>>>>>>> at org.apache.drill.exec.work.for
> >>>>> eman.Foreman$ForemanResult.
> >>>>>>>>>>>> close(Foreman.java:746)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>> at org.apache.drill.exec.work.
> foreman.Foreman$StateSwitch.
> >>>>>>>>>>>> processEvent(Foreman.java:858)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>> at org.apache.drill.exec.work.
> foreman.Foreman$StateSwitch.
> >>>>>>>>>>>> processEvent(Foreman.java:790)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>> at org.apache.drill.exec.work.
> foreman.Foreman$StateSwitch.
> >>>>>>>>>>>> moveToState(Foreman.java:792)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>> at org.apache.drill.exec.work.
> foreman.Foreman.moveToState(
> >>>>>>>>>>>> Foreman.java:909)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>> at org.apache.drill.exec.work.
> foreman.Foreman.access$2700(
> >>>>>>>>>>>> Foreman.java:110)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>> at org.apache.drill.exec.work.for
> >>>>> eman.Foreman$StateListener.
> >>>>>>>>>>>> moveToState(Foreman.java:1183)
> >>>>>>>>>>>> [drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> DRILL-4325  <https://issues.apache.org/jira/browse/DRILL-4325
> >
> >>>>>>>>>>>> ForemanException:
> >>>>>>>>>>>> One or more nodes lost connectivity during query
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Any one experienced this issue ?
> >>>>>>>>>>>>
> >>>>>>>>>>>> It happens when running query involving many parquet files on
> a
> >>>>>>> cluster
> >>>>>>>>>>> of
> >>>>>>>>>>>> 200 nodes. Same query on a smaller cluster of 12 nodes runs
> >>>>> fine.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It is not caused by garbage collection, (checked on both ZK
> >>>>> node and
> >>>>>>>>> the
> >>>>>>>>>>>> involved drill bit).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Negotiated max session timeout is 40 seconds.
> >>>>>>>>>>>>
> >>>>>>>>>>>> The sequence seems:
> >>>>>>>>>>>> - Drill Query begins, using an existing ZK session.
> >>>>>>>>>>>> - Drill Zk session timeouts
> >>>>>>>>>>>>   - perhaps it was writing something that took too long
> >>>>>>>>>>>> - Drill attempts to renew session
> >>>>>>>>>>>>- drill believes that the write operation failed, so it
> >>>>> attempts
> >>>>>>>>>>> to
> >>>>>>>>>>>> re-create the zk node, which trigger another exception.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We are open to any suggestion. We will report any finding.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks
> >>>>>>>>>>>> Francois
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>


Re: [VOTE] Release Apache Drill 1.0.0 (rc1)

2015-05-18 Thread Chun Chang
+1 (non-binding)

This release candidate is a lot more stable in terms of resource handling. I
bombarded a four-node cluster with queries from multiple concurrent threads
for 72 hours; Drill remained stable and performed well. No performance
degradation was noticed.

On Mon, May 18, 2015 at 2:28 PM, Steven Phillips 
wrote:

> +1 (binding)
>
> On Sun, May 17, 2015 at 11:31 PM, Hsuan Yi Chu 
> wrote:
>
> > +1 (non-binding)
> > 1. Downloaded the source to run mvn clean install
> > 2. Deployed 2-node Drill and tried a some queries, simple and nested ones
> >
> > Things all worked fine. :)
> >
> >
> > On Sun, May 17, 2015 at 8:50 PM, Abhishek Girish 
> wrote:
> >
> > > +1 (non-binding)
> > >
> > > - Tried out various queries on sqlline. Looked good.
> > > - Ran TPC-DS SF 100 queries successfully using the test framework. Saw
> no
> > > regressions.
> > > - Drill UI is responsive. Profile page shows updated information. Saw
> an
> > > issue, but I'm guessing not a blocker.
> > >
> > > Regards,
> > > Abhishek
> > >
> > > On Sun, May 17, 2015 at 6:40 PM, Jason Altekruse <
> > altekruseja...@gmail.com
> > > >
> > > wrote:
> > >
> > > > +1 binding
> > > >
> > > > Downloaded the source and binary tarballs, built the source and all
> > unit
> > > > tests passed. Ran a few queries in embedded mode and checked a few
> > pages
> > > on
> > > > the web UI.
> > > >
> > > > Great work everyone! Even Cyanide and Happiness got into the drill
> > spirit
> > > > to celebrate the release!
> > > >
> > > > http://explosm.net/comics/3929
> > > >
> > > >
> > > > On Sun, May 17, 2015 at 3:17 PM, Ramana Inukonda <
> > rinuko...@maprtech.com
> > > >
> > > > wrote:
> > > >
> > > > > +1(non binding)
> > > > > Did the following things:
> > > > >
> > > > >1. Built from downloaded source tar, skipped tests as I see
> plenty
> > > of
> > > > >unit tests successes.
> > > > >2. Deployed built tar on 2 centos clusters- 8 node and 3 node.
> > > > >3. Ran some queries against various sources- hive, hbase,
> parquet,
> > > csv
> > > > >and json. Verified results.
> > > > >4. Ran TPCH queries against a SF100 dataset and verified results
> > > > there.
> > > > >5. Looked to cancel long running  queries from sqlline before
> and
> > > > after
> > > > >returning results. sqlline still seems fine and able to execute
> > > > queries.
> > > > >6. Changed system and session options from sqlline and verified
> if
> > > > they
> > > > >take effect.
> > > > >7. Checked profiles page and storage plugin page from the web UI
> > and
> > > > >updated storage plugins and verified as updated successfully.
> > > > >
> > > > > Congrats guys, looks like a good release!
> > > > >
> > > > > On Sun, May 17, 2015 at 9:45 AM, Tomer Shiran 
> > > wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > Downloaded binary
> > > > > > Ran Drill in embedded mode with the new alias
> (bin/drill-embedded)
> > > > > > Ran some queries on the Yelp dataset on local files (Mac) and
> > MongoDB
> > > > > > Found that David is the most popular name on Yelp and that
> reviews
> > > are
> > > > > more
> > > > > > often useful than funny or cool (based on Yelp votes)
> > > > > >
> > > > > > Congrats!
> > > > > >
> > > > > > On Fri, May 15, 2015 at 8:03 PM, Jacques Nadeau <
> > jacq...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Everybody,
> > > > > > >
> > > > > > > I'm happy to propose a new release of Apache Drill, version
> > 1.0.0.
> > > > > This
> > > > > > is
> > > > > > > the second release candidate (rc1).  It includes a few issues
> > found
> > > > > > earlier
> > > > > > > today and covers a total of 228 JIRAs*.
> > > > > > >
> > > > > > > The vote will be open for 72 hours ending at 8pm Pacific, May
> 18,
> > > > 2015.
> > > > > > >
> > > > > > > [ ] +1
> > > > > > > [ ] +0
> > > > > > > [ ] -1
> > > > > > >
> > > > > > > thanks,
> > > > > > > Jacques
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12325568
> > > > > > > [2] http://people.apache.org/~jacques/apache-drill-1.0.0.rc1/
> > > > > > >
> > > > > > >
> > > > > > > *Note, the previous rc0 vote email undercounting the number of
> > > closed
> > > > > > > JIRAs.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Abhishek Girish
> > >
> > > Senior Software Engineer
> > >
> > > (408) 476-9209
> > >
> > > 
> > >
> >
>
>
>
> --
>  Steven Phillips
>  Software Engineer
>
>  mapr.com
>


Re: [VOTE] Release Apache Drill 1.1.0 (rc0)

2015-07-03 Thread Chun Chang
72 hour longevity test looks good.

+1 (non-binding)

On Thu, Jul 2, 2015 at 8:39 PM, Aman Sinha  wrote:

> (Followup to my previous email). I ran several queries against  TPCH  SF1
> on my Mac and did not find any issues, apart from the version # shown in
> sqlline (which I think is a non-blocker).
>
> +1  (binding)
>
> Aman
>
> On Thu, Jul 2, 2015 at 8:36 PM, Hanifi GUNES 
> wrote:
>
> > *>> Jinfeng*
> >
> > *-  Verified checksum for both the source and binary tar files.*
> >
> > *>> Hanifi, Sudheesh*
> >
> > *- manually inspected maven repo*
> > *- built a query submitter importing jdbc-all artifact from the repo at [jacques:3]*
> >
> > Is there a guideline on verifying maven artifacts besides inspecting
> > published POMs or trying to use them? I could do that if someone points
> me.
> >
> >
> > Thanks.
> > -Hanifi
> >
> >
> > 2015-07-02 20:09 GMT-07:00 Ted Dunning :
> >
> > > I haven't seen that anybody is checking signatures and the maven
> > artifacts.
> > >
> > > Is anybody doing that?  If not, the release should be held back until
> > that
> > > is done.
> > >
> > > (I can't do it due to time pressure)
> > >
> > >
> > >
> > > On Thu, Jul 2, 2015 at 6:58 PM, Aman Sinha 
> wrote:
> > >
> > > > Downloaded the binary tar-ball.  Installed on my macbook.  Started
> > > sqlline
> > > > in embedded mode. Saw that sqlline is showing version 1.0.0 instead
> of
> > > > 1.1.0, although 'select * from sys.version'  is showing the right
> > commit.
> > > > Anyone else sees this ?
> > > >
> > > > /sqlline -u jdbc:drill:zk=local -n admin -p admin --maxWidth=10
> > > > ...
> > > > apache drill 1.0.0
> > > > "just drill it"
> > > >
> > > >
> > > >
> > > > On Thu, Jul 2, 2015 at 6:01 PM, Jason Altekruse <
> > > altekruseja...@gmail.com>
> > > > wrote:
> > > >
> > > > > +1 binding
> > > > >
> > > > > - downloaded and built the source tarball, all tests passed (on MAC
> > > osx)
> > > > > - started sqlline, issued a few queries
> > > > > - tried a basic update of storage plugin from the web UI and looked
> > > over
> > > > a
> > > > > few query profiles
> > > > >
> > > > >
> > > > > On Thu, Jul 2, 2015 at 5:42 PM, Mehant Baid  >
> > > > wrote:
> > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > * Downloaded src tar-ball, was able to build and run unit tests
> > > > > > successfully.
> > > > > > * Brought up DrillBit in embedded and distributed mode.
> > > > > > * Ran some TPC-H queries via Sqlline and the web UI.
> > > > > > * Checked the UI for profiles
> > > > > >
> > > > > > Looks good.
> > > > > >
> > > > > > Thanks
> > > > > > Mehant
> > > > > >
> > > > > >
> > > > > >
> > > > > > On 7/2/15 5:36 PM, Sudheesh Katkam wrote:
> > > > > >
> > > > > >> +1 (non-binding)
> > > > > >>
> > > > > >> * downloaded binary tar-ball
> > > > > >> * ran queries (including cancellations) in embedded mode on Mac;
> > > > > verified
> > > > > >> states in web UI
> > > > > >>
> > > > > >> * downloaded and built from source tar-ball; ran unit tests on
> Mac
> > > > > >> * ran queries (including cancellations) on a 3 node cluster;
> > > verified
> > > > > >> states in web UI
> > > > > >>
> > > > > >> * built a Java query submitter that uses the maven artifacts
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Sudheesh
> > > > > >>
> > > > > >>  On Jul 2, 2015, at 4:06 PM, Hanifi Gunes 
> > > > wrote:
> > > > > >>>
> > > > > >>> - fully built and tested Drill from source on CentOS
> > > > > >>> - deployed on 3 nodes
> > > > > >>> - ran concurrent queries
> > > > > >>> - manually inspected maven repo
> > > > > >>> - built a Scala query submitter importing jdbc-all artifact
> from
> > > the
> > > > > repo
> > > > > >>> at [jacques:3]
> > > > > >>>
> > > > > >>> overall, great job!
> > > > > >>>
> > > > > >>> +1 (binding)
> > > > > >>>
> > > > > >>> On Thu, Jul 2, 2015 at 3:16 PM, rahul challapalli <
> > > > > >>> challapallira...@gmail.com> wrote:
> > > > > >>>
> > > > > >>>  +1 (non-binding)
> > > > > 
> > > > >  Tested the new CTAS auto partition feature
> > > > >  Published jdbc-all artifact looks good as well
> > > > > 
> > > > >  I am able to add the staged jdbc-all package as a dependency
> to
> > my
> > > > >  sample
> > > > >  JDBC app's pom file and I was able to connect to my drill
> > > cluster. I
> > > > >  think
> > > > >  this is a sufficient test for the published artifact.
> > > > > 
> > > > >  Part of the pom file below
> > > > > 
> > > > >  <repositories>
> > > > >    <repository>
> > > > >      <id>staged-releases</id>
> > > > >      <url>http://repository.apache.org/content/repositories/orgapachedrill-1001</url>
> > > > >    </repository>
> > > > >  </repositories>
> > > > >
> > > > >  <dependencies>
> > > > >    <dependency>
> > > > >      <groupId>org.apache.drill.exec</groupId>
> > > > >      <artifactId>drill-jdbc-all</artifactId>
> > > > >      <version>1.1.0</version>
> > > > >    </dependency>
> > > > >  </dependencies>
> > > > > 
> > > > >  - Rahul
> > > > > 
> > > > >  On Thu, Jul 2, 2015 at 2:02 PM, Parth Chandra <
> > > > pchan...@

Re: [DISCUSS] Publishing advanced/functional tests

2015-08-17 Thread Chun Chang
Hi Ramana,

Glad to see your post here. I agree with your point that we should have a
way for the public to run all the pre-commit tests. I feel that's a higher
priority than anything else, since with that in place people can get their
patches committed.

Thanks,
Chun

On Fri, Aug 14, 2015 at 11:33 AM, Ramana I N  wrote:

> So what is the status on this? It would be nice to have this out with 1.2
> coming out.
>
> Regards
> Ramana
>
>
>
> On Wed, Aug 5, 2015 at 11:08 AM, Abhishek Girish <
> abhishek.gir...@gmail.com>
> wrote:
>
> > Ramana,
> >
> > I think the issue with licenses is mostly resolved. It was discussed that
> > for TPC-*, since we shall not be redistributing the data-gen software,
> but
> > distributing a randomized variant of the data generated by it, we should
> be
> > okay to include it part of our framework. For other datasets, we shall
> > either provide their copy of license with our framework, or simply
> provide
> > a link for users to download data before they execute.
> >
> > For now we should focus on having the framework out with minimal cleanup.
> > In near future we can work on setting up infrastructure and enhancing the
> > framework itself.
> >
> > -Abhishek
> >
> > On Wed, Aug 5, 2015 at 10:46 AM, Ramana I N  > > wrote:
> >
> > > @Jacques, Ted
> > >
> > > in the mean time, we risk patches being merged that have less than
> > complete
> > > > testing.
> > >
> > >
> > > While I agree with the premise of getting the tests out as soon as
> > possible
> > > it does not help us achieve anything except transparency. Your
> statement
> > > that getting the tests out will increase quality is dependent on
> someone
> > > actually being able to run the tests once they have access to it.
> > >
> > > Maybe we should focus on making a jenkins job to run the tests
> publicly.
> > > With that in place we can exclude the TPC* datasets as well as the yelp
> > > data sets from the framework and avoid licensing issues.
> > >
> > > Regards
> > > Ramana
> > >
> > >
> > > On Tue, Aug 4, 2015 at 11:39 AM, Abhishek Girish <
> > > abhishek.gir...@gmail.com
> > > >
> > > wrote:
> > >
> > > > We not only re-distribute external data-sets as-is, but also include
> > > > variants for those (text -> parquet, json, ...). So the challenge
> here
> > is
> > > > not simply disabling automatic downloads via the framework, and point
> > > users
> > > > to manually download the files before running the framework, but also
> > > about
> > > > how we will handle tests which require variants of the data sets. It
> > > simply
> > > > isn't practical to users of the framework to (1) download data-gen
> > > manually
> > > > (2) use specific seed / options before generating data, (3) convert
> > them
> > > to
> > > > parquet, etc.. (4) move them to specific locations inside their copy
> of
> > > the
> > > > framework.
> > > >
> > > > Something we'll need to know is how other projects are handling
> > > bench-mark
> > > > & other external datasets.
> > > >
> > > > -Abhishek
> > > >
> > > > On Tue, Aug 4, 2015 at 11:23 AM, rahul challapalli <
> > > > challapallira...@gmail.com
> > > > wrote:
> > > >
> > > > > Thanks for your inputs.
> > > > >
> > > > > One issue with just publishing the tests in their current state is
> > > that,
> > > > > the framework re-distributes tpch, tpcds, yelp data sets without
> > > > requiring
> > > > > the users to accept their relevant licenses. A good number of tests
> > > uses
> > > > > these data sets. Any thoughts on how to handle this?
> > > > >
> > > > > - Rahul
> > > > >
> > > > > On Wed, Jul 29, 2015 at 12:07 AM, Ted Dunning <
> ted.dunn...@gmail.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > +1.  Get it out there.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Jul 28, 2015 at 10:12 PM, Jacques Nadeau <
> > jacq...@dremio.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Rahul,
> > > > > > >
> > > > > > > My suggestion would be to lower the bar--do the absolute bare
> > > minimum
> > > > > to
> > > > > > > get the tests out there.  For example, simply remove
> proprietary
> > > > > > > information and then get it on a public github (whether your
> > > personal
> > > > > > > github or a corporate one).  From there, people can help by
> > > > submitting
> > > > > > pull
> > > > > > > requests to improve the infrastructure and harness.  Making
> > things
> > > > > easier
> > > > > > > is something that can be done over time.  For example, we've
> had
> > > > offers
> > > > > > > from a couple different Linux Admins to help on something.  I'm
> > > sure
> > > > > that
> > > > > > > they could help with a number of the items you've identified.
> In
> > > the
> > > > > > mean
> > > > > > > time, we risk patches being merged that have less than complete
> > > > > testing.
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Jacques Nadeau
> > > > > > > CTO and Co-Founder, Dremio
> > > > > > >
> > > > > > > On Mon, Jul 27, 2015 at 2:16 PM, rahul challapalli <
> > > > > > > challapall

How deep can Drill drill?

2015-09-24 Thread Chun Chang
For the JSON data format, what is the recommended/supported number of nesting
levels Drill can handle? I think we should establish a limit and add it to the
documentation.

I tried querying a json file with 1800 nesting levels. The query never came
back, even after three hours.

Thanks,
Chun


Re: How deep can Drill drill?

2015-09-24 Thread Chun Chang
Hi Stefan,

Yeah, that many levels doesn't make much practical sense. It's just an
example of how people might test it.

Chun

{
  "level0_lab0_bool" : false,
  "level0_lab1_float" : 0.03461426496505737,
  "level0_lab2_int" : -1001861650,
  "level0_lab3_double" : 0.015721726429861027,
  "level0_lab4_string" : "61fee88e-195b-4fd3-8cfa-50952e4a66a7",
  "level0_lab5_long" : 7258421602631813798,
  "level0_lab6_date" : "2015-09-24",
  "level0_lab7_null" : null,
  "level0_lab8_byte" : 85,
  "level0_lab9_short" : 9369,
  "level0_lab10_time" : "01:15:54",
  "level0_lab11_timestamp" : null,
  "stairway_to_level1" : {
"level1_lab12_bool" : false,
"level1_lab13_float" : 0.5304490327835083,
"level1_lab14_int" : 1520835057,
"level1_lab15_double" : 0.9255150816426463,
"level1_lab16_string" : "229f74c6-c20d-40d7-bf1c-c87e84f6200f",
"level1_lab17_long" : -5283026701033140543,
"level1_lab18_date" : "2015-09-24",
"level1_lab19_null" : null,
"level1_lab20_byte" : 85,
"level1_lab21_short" : 12957,
"level1_lab22_time" : "01:15:54",
"level1_lab23_timestamp" : null,
"stairway_to_level2" : {
  "level2_lab24_bool" : true,
  "level2_lab25_float" : 0.7355604767799377,
  "level2_lab26_int" : 1518663456,
  "level2_lab27_double" : 0.2988853136068288,
  "level2_lab28_string" : "034e3a3e-224c-489c-975b-68fe4833c45c",
  "level2_lab29_long" : -7191075049054166440,
  "level2_lab30_date" : "2015-09-24",
  "level2_lab31_null" : null,
  "level2_lab32_byte" : 62,
  "level2_lab33_short" : 26995,
  "level2_lab34_time" : "01:15:54",
  "level2_lab35_timestamp" : null,
  "stairway_to_level3" : {
...
...
   "stairway_to_level1800" : {
...
...
}}...}

On Thu, Sep 24, 2015 at 3:26 PM, Stefán Baxter 
wrote:

> Hi Chun,
>
> Can you please explain to me what you mean by nesting levels?
>
> Because the way I understand it, having 1800 nesting levels makes no sense
> :).
>
> Regards,
>  -Stefan
>
> On Thu, Sep 24, 2015 at 9:18 PM, Chun Chang  wrote:
>
> > For JSON data format, what is the recommended/supported nesting levels
> > Drill can handle? I think we should establish a limit and make it into
> the
> > documentation.
> >
> > I tried querying a json file with 1800 nesting levels. The query never
> came
> > back, even after three hours.
> >
> > Thanks,
> > Chun
> >
>
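
For anyone reproducing this kind of test: a document nested to an arbitrary depth,
similar in spirit to the stairway_to_levelN example above, can be generated with a
few lines of code. The sketch below is hypothetical, uses plain string building with
placeholder field names, and is not the generator used for the original file.

// Hypothetical sketch: build a JSON document nested to the requested depth.
public class NestedJsonGenerator {
  public static String generate(int depth) {
    StringBuilder sb = new StringBuilder();
    for (int level = 0; level < depth; level++) {
      sb.append("{ \"level").append(level).append("_value\" : ").append(level);
      if (level < depth - 1) {
        sb.append(", \"stairway_to_level").append(level + 1).append("\" : ");
      }
    }
    for (int level = 0; level < depth; level++) {
      sb.append(" }");  // close every object that was opened
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // 1800 levels, as in the experiment above; redirect the output to a .json file.
    System.out.println(generate(1800));
  }
}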


Re: Heads up on MapR test framework and file ordering (specifically MD-185 tests)

2015-10-26 Thread Chun Chang
Jacques,

Thanks for the heads up. And what is your proposed fix? I think we can
easily enhance the test framework to handle file ordering cases. This way,
we don't need to modify any existing tests.

Thanks,
Chun

On Sat, Oct 24, 2015 at 9:11 PM, Jacques Nadeau  wrote:

> A large number of tests associated with MD-185 are unintentionally brittle.
> Many of these tests reference more than one file. The tests are planning
> tests and thus are prone to failure if the file ordering isn't the same as
> what was used for test generation.
>
> In the case of these types of tests, people should be cautious of adding
> expected results that include multiple files in a certain order. I'm
> working on a patch fix for these specific tests but wanted to let people
> know to be cautious of these issues in the future.
>
> thanks!
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>


Re: Heads up on MapR test framework and file ordering (specifically MD-185 tests)

2015-10-26 Thread Chun Chang
Just verifying numFiles may not be sufficient. I will see if I can enhance the
verification code to handle both. Thanks. -Chun

On Mon, Oct 26, 2015 at 10:01 AM, Jacques Nadeau  wrote:

> My current fix is to remove the specific lists. (A test which is flaky
> isn't very valuable.) You can see my current branch at [1]. You can look at
> my most recent commit to see the changes I made to the expected results
> files to get the tests to report the correct result.
>
> If you want to do file name verification (as opposed to simply numFiles
> verification), it seems like you should be doing a structured verification
> of the json plan rather than trying to do something with the text plan.
>
> [1] https://github.com/dremio/drill-test-framework
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Oct 26, 2015 at 9:36 AM, Chun Chang  wrote:
>
> > Jacques,
> >
> > Thanks for the heads up. And what is your proposed fix? I think we can
> > easily enhance the test framework to handle file ordering cases. This
> way,
> > we don't need to modify any existing tests.
> >
> > Thanks,
> > Chun
> >
> > On Sat, Oct 24, 2015 at 9:11 PM, Jacques Nadeau 
> > wrote:
> >
> > > A large number of tests associated with MD-185 are unintentionally
> > brittle.
> > > Many of these tests reference more than one file. The tests are
> planning
> > > tests and thus are prone to failure if the file ordering isn't the same
> > as
> > > what was used for test generation.
> > >
> > > In the case of these types of tests, people should be cautious of
> adding
> > > expected results that include multiple files in a certain order. I'm
> > > working on a patch fix for these specific tests but wanted to let
> people
> > > know to be cautious of these issues in the future.
> > >
> > > thanks!
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> >
>
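
For reference, the structured verification suggested above could compare the scan's
file list order-insensitively from the JSON plan instead of regex-matching the text
plan. The sketch below is hypothetical: it assumes Jackson is available and that the
scan entry exposes its files under a "files" field, and the embedded plan snippet is
a made-up stand-in, so it would need adjusting to the actual plan shape.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: order-insensitive comparison of the file list found in a JSON
// physical plan, so a different file-listing order does not break the baseline.
public class JsonPlanFileCheck {
  public static Set<String> collectFiles(JsonNode node) {
    Set<String> files = new HashSet<>();
    if (node.isObject() && node.has("files")) {
      for (JsonNode f : node.get("files")) {
        files.add(f.asText());
      }
    }
    for (JsonNode child : node) {   // recurse into arrays and object values
      files.addAll(collectFiles(child));
    }
    return files;
  }

  public static void main(String[] args) throws Exception {
    String jsonPlan = "{ \"graph\" : [ { \"pop\" : \"parquet-scan\", "
        + "\"files\" : [ \"/data/b.parquet\", \"/data/a.parquet\" ] } ] }"; // stand-in
    JsonNode root = new ObjectMapper().readTree(jsonPlan);
    Set<String> expected = new HashSet<>();
    expected.add("/data/a.parquet");
    expected.add("/data/b.parquet");
    if (!collectFiles(root).equals(expected)) {
      throw new AssertionError("scan file set does not match baseline");
    }
    System.out.println("file set matches, independent of ordering");
  }
}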


Re: Drill Test Framework - Data Generation does not work for Advanced/Tpch

2015-11-01 Thread Chun Chang
We should make it gzipped.  I will follow up on this.

On Sun, Nov 1, 2015 at 2:28 PM, Jacques Nadeau  wrote:

> For everyone else's reference, the files are named .tgz but they are not
> gzipped.
>
> They are only tarred...
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Sun, Nov 1, 2015 at 11:47 AM, Jacques Nadeau 
> wrote:
>
> > nevermind.
> >
> > Confusing that it is hidden in a subdirectory. For others, this is the
> > location of this information:
> >
> >
> >
> https://github.com/mapr/drill-test-framework/tree/master/framework/resources/Advanced
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Sun, Nov 1, 2015 at 11:46 AM, Jacques Nadeau 
> > wrote:
> >
> >> It looks like there is an addition for tpcds but not tpch. Can you
> upload
> >> the tpch data as well?
> >>
> >> --
> >> Jacques Nadeau
> >> CTO and Co-Founder, Dremio
> >>
> >> On Sat, Oct 31, 2015 at 6:57 PM,  wrote:
> >>
> >>> Jacques,
> >>>
> >>> I believe the data is on S3 now and Abhishek provided the link in the
> >>> README. He might have forgotten to send out an email. Pull the latest framework. I
> >>> think it's there.
> >>>
> >>> Regards,
> >>>
> >>> Chun
> >>>
> >>> > On Oct 31, 2015, at 5:40 PM, Jacques Nadeau 
> >>> wrote:
> >>> >
> >>> > Ping, just checking in on this. :)
> >>> >
> >>> >
> >>> >
> >>> > --
> >>> > Jacques Nadeau
> >>> > CTO and Co-Founder, Dremio
> >>> >
> >>> > On Wed, Oct 28, 2015 at 3:23 PM, Abhishek Girish <
> >>> abhishek.gir...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> Data is in an Amazon S3 bucket. I'll create links for all Advanced
> >>> suites
> >>> >> datasets and share them shortly. Will also add a note to README.
> >>> >>
> >>> >> On Wed, Oct 28, 2015 at 2:47 PM, Jacques Nadeau  >
> >>> >> wrote:
> >>> >>
> >>> >>> When I try to run the extended test framework tests in the tpch
> >>> folder
> >>> >>> (with -d):
> >>> >>>
> >>> >>> Advanced/tpch/tpch_sf100/parquet
> >>> >>>
> >>> >>> I get a message:
> >>> >>>
> >>> >>> Schema [dfs.drillTestDirTpch100Parquet] is not valid with respect
> to
> >>> >> either
> >>> >>> root schema or current default schema.
> >>> >>>
> >>> >>> Where is this data?
> >>> >>>
> >>> >>>
> >>> >>> --
> >>> >>> Jacques Nadeau
> >>> >>> CTO and Co-Founder, Dremio
> >>> >>>
> >>> >>
> >>>
> >>
> >>
> >
>


Re: [VOTE] Release Apache Drill 1.3.0 (rc2)

2015-11-12 Thread Chun Chang
I am working on putting together a comprehensive test plan specifically
covering the parquet reader. One part of the plan is to create a tool that can
generate parquet files of all flavors. This will considerably increase our
coverage, and hopefully prevent this type of issue. I am looking at
parquet-mr and parquet-compatibility to get ideas. Hopefully I am on the
right track; pointers are welcome.

Thanks,
Chun
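
For reference, a rough sketch of what a parquet-mr based generator could look like is
below. It is hypothetical: class and builder names such as ExampleParquetWriter may
differ between parquet-mr versions, and the schema, row contents, and output path are
placeholders; a real tool would also vary codecs, types, nesting, and writer settings
to cover the different flavors.

import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

// Rough, hypothetical sketch of a parquet-mr based generator; API details may differ
// slightly between parquet-mr versions, and schema/path/values are placeholders.
public class ParquetFlavorGenerator {
  public static void main(String[] args) throws Exception {
    MessageType schema = MessageTypeParser.parseMessageType(
        "message flavor_test { required int32 id; required binary name (UTF8); }");
    SimpleGroupFactory groups = new SimpleGroupFactory(schema);

    try (ParquetWriter<Group> writer = ExampleParquetWriter
        .builder(new Path("/tmp/flavor_test.parquet"))       // placeholder output path
        .withType(schema)
        .withCompressionCodec(CompressionCodecName.SNAPPY)   // vary per "flavor"
        .build()) {
      for (int i = 0; i < 1000; i++) {
        writer.write(groups.newGroup().append("id", i).append("name", "row-" + i));
      }
    }
  }
}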

On Thu, Nov 12, 2015 at 10:57 AM, Jacques Nadeau  wrote:

> Hey Guys,
>
> It sounds like the Parquet upgrade in 1.3 have fixed an incorrect result
> problem with externally generated files. This has unfortunately resulted in
> a performance regression in the context of partition pruning. I'm neutral
> on whether this is a release stopper but it sounds like we have some strong
> opinions from Aman, Jinfeng and Rahul. As such, I think this kills the
> release.
>
> It seems like there are at least two options for resolution:
>
> - give people a migration tool for their previous Drill-created Parquet
> files
> - provide people a switch to enable the old behavior. (This will possibly
> give users incorrect results if they use this in the wrong context--ick...)
>
> Let's move the discussion of the potential fix approaches to the DRILL-4070
> that Rahul filed.
>
> Two other questions that we should probably figure out answers to:
> - How can we make sure this gets caught by testing in the future?
> - Who wants to work on the fix?
>
> How does that sound?
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Thu, Nov 12, 2015 at 10:48 AM, rahul challapalli <
> challapallira...@gmail.com> wrote:
>
> > While breaking backward compatibility could be justified in cases like
> > this, doing this without providing a tested upgrade process is
> > unacceptable.
> >
> > - Rahul
> >
> > On Thu, Nov 12, 2015 at 10:43 AM, Steven Phillips 
> > wrote:
> >
> > > Does DRILL-4070 cause incorrect results? Or just prevent partition
> > pruning?
> > >
> > > On Thu, Nov 12, 2015 at 10:32 AM, Jason Altekruse <
> > > altekruseja...@gmail.com>
> > > wrote:
> > >
> > > > I just commented on the JIRA, we are behaving correctly for newly
> > created
> > > > parquet files. I did confirm the failure to prune on auto-partitioned
> > > files
> > > > created by 1.2. I do not think this is a release blocker, because I
> do
> > > not
> > > > think we can solve this in Drill code without risking wrong results
> > over
> > > > parquet files written by other tools. I do support the creation of a
> > > > migration utility for existing files written by Drill 1.2, but this
> can
> > > be
> > > > released independent of 1.3.
> > > >
> > > >
> > > > On Thu, Nov 12, 2015 at 10:26 AM, Jinfeng Ni 
> > > > wrote:
> > > >
> > > > > Agree with Aman that DRILL-4070 is a show stopper. Parquet is the
> > > > > major data source Drill uses. If this release candidate breaks the
> > > > > backward compatibility of partitioning pruning for the parquet
> files
> > > > > created with prior release of Drill, it could cause serious problem
> > > > > for the current Drill user.
> > > > >
> > > > > -1
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Nov 12, 2015 at 10:10 AM, rahul challapalli
> > > > >  wrote:
> > > > > > -1 (non-binding)
> > > > > > The nature of the issue (DRILL-4070) demands adequate testing
> even
> > > > with a
> > > > > > workaround in place.
> > > > > >
> > > > > > On Thu, Nov 12, 2015 at 9:32 AM, Aman Sinha <
> amansi...@apache.org>
> > > > > wrote:
> > > > > >
> > > > > >> Given this issue, I would be a -1  unfortunately.
> > > > > >>
> > > > > >> On Thu, Nov 12, 2015 at 8:42 AM, Aman Sinha <
> amansi...@apache.org
> > >
> > > > > wrote:
> > > > > >>
> > > > > >> > Can someone familiar with the parquet changes take a look at
> > > > > DRILL-4070 ?
> > > > > >> > It seems to break backward compatibility.
> > > > > >> >
> > > > > >> > On Tue, Nov 10, 2015 at 9:51 PM, Jacques Nadeau <
> > > jacq...@dremio.com
> > > > >
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> >> Hey Everybody,
> > > > > >> >>
> > > > > >> >> I'd like to propose a new release candidate of Apache Drill,
> > > > version
> > > > > >> >> 1.3.0.  This is the third release candidate (rc2).  This
> > > addresses
> > > > > some
> > > > > >> >> issues identified in the second release candidate
> including
> > > > some
> > > > > >> test
> > > > > >> >> issues & rpc concurrency issues.
> > > > > >> >>
> > > > > >> >> The tarball artifacts are hosted at [2] and the maven
> artifacts
> > > are
> > > > > >> hosted
> > > > > >> >> at [3]. This release candidate is based on commit
> > > > > >> >> 13ab6b1f9897ebcf9179407ffaf84b79b0ee95a1 located at [4].
> > > > > >> >> The vote will be open for 72 hours ending at 10PM Pacific,
> > > November
> > > > > 13,
> > > > > >> >> 2015.
> > > > > >> >>
> > > > > >> >> [ ] +1
> > > > > >> >> [ ] +0
> > > > > >> >> [ ] -1
> > > > > >> >>
> > > > > >> >> thanks,
> > > > > >> >> Jacques
> > > > > >> >>
> > > > > >> >> [1]
> > > > > >> >>
> > > > >

Re: [VOTE] Release Apache Drill 1.3.0 (rc2)

2015-11-18 Thread Chun Chang
@Ramana
Thanks for the input. I noticed that too. Parquet-mr has some examples.
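
For example, something along these lines is what I have in mind for one
"flavor" of file (untested sketch; the schema, output path and codec are
placeholders, and I'm assuming the ExampleParquetWriter/SimpleGroupFactory
example API available in recent parquet-mr):

import org.apache.hadoop.fs.Path;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class GenerateSampleParquet {
  public static void main(String[] args) throws Exception {
    // One file flavor: int32 columns, GZIP compression.
    MessageType schema = MessageTypeParser.parseMessageType(
        "message test { required int32 int32_field_required; "
            + "optional int32 int32_field_optional; }");
    ParquetWriter<Group> writer = ExampleParquetWriter
        .builder(new Path("/tmp/int32_gzip.parquet"))
        .withType(schema)
        .withCompressionCodec(CompressionCodecName.GZIP)
        .build();
    SimpleGroupFactory groups = new SimpleGroupFactory(schema);
    for (int i = 0; i < 10; i++) {
      Group g = groups.newGroup().append("int32_field_required", i);
      if (i % 2 == 0) {
        g.append("int32_field_optional", i * 10);  // leave some values null
      }
      writer.write(g);
    }
    writer.close();
  }
}

The tool would then vary the schema, compression codec, page/block sizes, etc.
to cover the different flavors.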

Regards

On Wed, Nov 18, 2015 at 11:21 AM, Ramana I N  wrote:

> @Chun
> When I was working on generating parquet files for testing drill and
> parquet I did use parquet-compat, but that project has since been
> abandoned. So it may not contain the latest changes in parquet.
> Parquet-mr looks more promising.
>
> Regards
> Ramana
>
>
> On Thu, Nov 12, 2015 at 11:29 AM, Chun Chang  wrote:
>
> > I am working on putting together a comprehensive test plan specifically
> > covering parquet reader. One part of the plan is to create a tool that
> can
> > generate parquet files of all flavors. This will considerably increase
> our
> > coverage, and hopefully prevent this type of issue. I am looking at
> > parquet-mr and parquet-compatibility to get ideas. Hopefully I am on the
> > right track. Pointers are welcome.
> >
> > Thanks,
> > Chun
> >
> > On Thu, Nov 12, 2015 at 10:57 AM, Jacques Nadeau 
> > wrote:
> >
> > > Hey Guys,
> > >
> > > It sounds like the Parquet upgrade in 1.3 has fixed an incorrect
> result
> > > problem with externally generated files. This has unfortunately
> resulted
> > in
> > > a performance regression in the context of partition pruning. I'm
> neutral
> > > on whether this is a release stopper but it sounds like we have some
> > strong
> > > opinions from Aman, Jinfeng and Rahul. As such, I think this kills the
> > > release.
> > >
> > > It seems like there are at least two options for resolution:
> > >
> > > - give people a migration tool for their previous Drill-created Parquet
> > > files
> > > - provide people a switch to enable the old behavior. (This will
> possibly
> > > give users incorrect results if they use this in the wrong
> > context--ick...)
> > >
> > > Let's move the discussion of the potential fix approaches to the
> > DRILL-4070
> > > that Rahul filed.
> > >
> > > Two other questions that we should probably figure out answers to:
> > > - How can we make sure this gets caught by testing in the future?
> > > - Who wants to work on the fix?
> > >
> > > How does that sound?
> > >
> > > --
> > > Jacques Nadeau
> > > CTO and Co-Founder, Dremio
> > >
> > > On Thu, Nov 12, 2015 at 10:48 AM, rahul challapalli <
> > > challapallira...@gmail.com> wrote:
> > >
> > > > While breaking backward compatibility could be justified in cases
> like
> > > > this, doing this without providing a tested upgrade process is
> > > > unacceptable.
> > > >
> > > > - Rahul
> > > >
> > > > On Thu, Nov 12, 2015 at 10:43 AM, Steven Phillips  >
> > > > wrote:
> > > >
> > > > > Does DRILL-4070 cause incorrect results? Or just prevent partition
> > > > pruning?
> > > > >
> > > > > On Thu, Nov 12, 2015 at 10:32 AM, Jason Altekruse <
> > > > > altekruseja...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > I just commented on the JIRA, we are behaving correctly for newly
> > > > created
> > > > > > parquet files. I did confirm the failure to prune on
> > auto-partitioned
> > > > > files
> > > > > > created by 1.2. I do not think this is a release blocker,
> because I
> > > do
> > > > > not
> > > > > > think we can solve this in Drill code without risking wrong
> results
> > > > over
> > > > > > parquet files written by other tools. I do support the creation
> of
> > a
> > > > > > migration utility for existing files written by Drill 1.2, but
> this
> > > can
> > > > > be
> > > > > > released independent of 1.3.
> > > > > >
> > > > > >
> > > > > > On Thu, Nov 12, 2015 at 10:26 AM, Jinfeng Ni <
> > jinfengn...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Agree with Aman that DRILL-4070 is a show stopper. Parquet is
> the
> > > > > > > major data source Drill uses. If this release candidate breaks
> > the
> > > > > > > backward compatibility of partitioning pruning for the parquet
> > > files
> > > &

Re: Drill Test Framework configuration file update

2015-11-23 Thread Chun Chang
I think it's a good change to parameterize the whole connection string.
This also allows us to easily switch to other data sources.
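
For example (hostnames below are placeholders), the same property can point
either at a single drillbit or at zookeeper:

# direct connection to one drillbit
CONNECTION_STRING=jdbc:drill:drillbit\=drillbit-host:31010
# connection through zookeeper
CONNECTION_STRING=jdbc:drill:zk\=zk-host:5181/zk_root/cluster_id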

+1

On Mon, Nov 23, 2015 at 3:08 PM, rahul challapalli <
challapallira...@gmail.com> wrote:

> Drillers,
>
> In the extended set of tests [1], we are directly connecting to a single
> drillbit through jdbc instead of talking to zookeeper. This was done as a
> workaround for DRILL-3171. Since the fix for DRILL-3171 has been committed
> now, I am updating the framework to use zookeeper. As part of this change,
> I intend to introduce the below config parameter. This parameter also
> allows us to switch between connecting to a single drillbit vs connecting
> through zookeeper. Let me know your thoughts
>
> CONNECTION_STRING=jdbc:drill:zk\=10.10.100.190:5181/zk_root/cluster_id
>
> [1] https://github.com/mapr/drill-test-framework
>
> - Rahul
>


Re: Drill Test Framework configuration file update

2015-11-23 Thread Chun Chang
We definitely could make it backward compatible. But I feel forcing the
change may not be a bad thing.

On Mon, Nov 23, 2015 at 3:32 PM, Sudheesh Katkam 
wrote:

> +1
>
> Note that if this breaking change goes in, all current users must update
> the config file to have this new property.
>
> > On Nov 23, 2015, at 3:21 PM, Chun Chang  wrote:
> >
> > I think it's a good change to parameterize the whole connection string.
> > This also allows us to easily switch to other data sources.
> >
> > +1
> >
> > On Mon, Nov 23, 2015 at 3:08 PM, rahul challapalli <
> > challapallira...@gmail.com> wrote:
> >
> >> Drillers,
> >>
> >> In the extended set of tests [1], we are directly connecting to a single
> >> drillbit through jdbc instead of talking to zookeeper. This was done as
> a
> >> workaround for DRILL-3171. Since the fix for DRILL-3171 has been
> committed
> >> now, I am updating the framework to use zookeeper. As part of this
> change,
> >> I intend to introduce the below config parameter. This parameter also
> >> allows us to switch between connecting to a single drillbit vs
> connecting
> >> through zookeeper. Let me know your thoughts
> >>
> >> CONNECTION_STRING=jdbc:drill:zk\=10.10.100.190:5181/zk_root/cluster_id
> >>
> >> [1] https://github.com/mapr/drill-test-framework
> >>
> >> - Rahul
> >>
>
>


Re: Compile failed with tpch.tgz: invalid block type

2017-01-05 Thread Chun Chang
The GFW might be the cause, but you can download the file manually, so I would
also look at your maven version. My version, Apache Maven 3.1.1, seems to work,
but I am outside of China.

On Thu, Jan 5, 2017 at 1:35 AM, WeiWan  wrote:

> branch: 1.8.0
>
> I compile the drill source code with `mvn clean install -DskipTests`, and
> failed with errors below.
>
> After rerunning with `-X` debug mode, I found this file (tpch.tgz) was
> downloaded from `http://apache-drill.s3.amazonaws.com/files/sf-0.01_tpc-h_parquet_typed.tgz`.
> I downloaded it manually and can unzip it successfully. So I suspect a bug in the maven plugin
> `download-maven-plugin`, used in `contrib/data/tpch-sample-data/pom.xml`.
>
> By the way, I'm in China. Maybe the GFW causes this issue...
>
> [ERROR] Failed to execute goal com.googlecode.maven-download-
> plugin:download-maven-plugin:1.2.0:wget (install-tgz) on project
> tpch-sample-data: IO Error: Error while expanding
> /Users/flow/workspace/github/drill/contrib/data/tpch-
> sample-data/target/classes/tpch/tpch.tgz: invalid block type -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute
> goal com.googlecode.maven-download-plugin:download-maven-plugin:1.2.0:wget
> (install-tgz) on project tpch-sample-data: IO Error
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:212)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:153)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:145)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.
> buildProject(LifecycleModuleBuilder.java:116)
> at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.
> buildProject(LifecycleModuleBuilder.java:80)
> at org.apache.maven.lifecycle.internal.builder.singlethreaded.
> SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.
> execute(LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
> at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
> at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
> at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
> at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
> at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(
> NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(
> DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> launchEnhanced(Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> launch(Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> mainWithExitCode(Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.
> main(Launcher.java:356)
> Caused by: org.apache.maven.plugin.MojoExecutionException: IO Error
> at com.googlecode.WGet.execute(WGet.java:260)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(
> DefaultBuildPluginManager.java:134)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute(
> MojoExecutor.java:207)
> ... 20 more
> Caused by: org.codehaus.plexus.archiver.ArchiverException: Error while
> expanding /Users/flow/workspace/github/drill/contrib/data/tpch-
> sample-data/target/classes/tpch/tpch.tgz
> at org.codehaus.plexus.archiver.tar.TarUnArchiver.execute(
> TarUnArchiver.java:101)
> at org.codehaus.plexus.archiver.AbstractUnArchiver.extract(
> AbstractUnArchiver.java:120)
> at com.googlecode.WGet.unpack(WGet.java:269)
> at com.googlecode.WGet.execute(WGet.java:255)
> ... 22 more
> Caused by: java.util.zip.ZipException: invalid block type
> at java.util.zip.InflaterInputStream.read(
> InflaterInputStream.java:164)
> at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
> at org.codehaus.plexus.archiver.tar.TarBuffer.readBlock(
> TarBuffer.java:260)
> at org.codehaus.plexus.archiver.tar.TarBuffer.readRecord(
> TarBuffer.java:222)
> at org.codehaus.plexus.archiver.tar.TarInputStream.read(
> TarInputStream.java:371)
> at org.codehaus.plexus.archiver.tar.TarInputStream.read(
> TarInputStream.java:314)
> at org.codehaus.plexus.util.IOUtil.copy(IOUtil.java:188)
> at org.codehaus.plexus.util.IOUtil.copy(IOUtil.java:174)
> at org.codehaus.plexus.archiver.zip.AbstractZipUnArchiver.extractFile(
> AbstractZipUnArchiver.java:226)
> at org.codehaus.plexus.archiver.tar.TarUnArchiver.execute(
> TarUnArchiver.java:93)
> ... 25 more
>
>
>
>
> Regards
> Flow Wei
>
>
>
>


Re: Performance issue with 2 phase hash-agg design

2017-06-20 Thread Chun Chang
I also noticed that if the keys are mostly unique, the first-phase aggregation
effort is mostly wasted. This can and should be improved.


One idea is to detect unique keys while processing: when the percentage of
unique keys exceeds a certain threshold after processing a certain fraction of
the data, skip aggregating the rest and send the rows directly downstream to
the second-phase aggregation.
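
A rough, standalone sketch of that heuristic (class name, sample size and
threshold are all made up for illustration; this is not existing HashAgg code):

import java.util.HashSet;
import java.util.Set;

class UniqueKeyDetector {
  // Arbitrary illustration values, not tuned numbers.
  private static final long SAMPLE_SIZE = 100_000;
  private static final double UNIQUENESS_THRESHOLD = 0.95;

  private final Set<Object> sampledKeys = new HashSet<>();
  private long rowsSeen = 0;
  private boolean passThrough = false;

  // Returns true once first-phase aggregation should be skipped for the
  // remaining rows because the sampled keys were almost all distinct.
  boolean shouldPassThrough(Object groupKey) {
    if (passThrough) {
      return true;
    }
    rowsSeen++;
    if (rowsSeen <= SAMPLE_SIZE) {
      sampledKeys.add(groupKey);
      if (rowsSeen == SAMPLE_SIZE
          && (double) sampledKeys.size() / rowsSeen > UNIQUENESS_THRESHOLD) {
        passThrough = true;
      }
    }
    return passThrough;
  }
}

The operator would consult shouldPassThrough() per incoming row (or per batch,
with an approximate distinct count), and once it flips it would flush the
groups built so far and stream the remaining batches straight to the hash
partitioner.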


From: rahul challapalli 
Sent: Tuesday, June 20, 2017 1:36:31 PM
To: dev
Subject: Performance issue with 2 phase hash-agg design

During the first phase, the hash agg operator is not protected from skew in
data (Eg : data contains 2 files where the number of records in one file is
very large compared to the other). Assuming there are only 2 fragments, the
hash-agg operator in one fragment handles more records and it aggregates
until the memory available to it gets exhausted, at which point it sends
the record batches downstream to the hash-partitioner.

Because the hash-partitioner normalizes the skew in the data, the work is
evenly divided and the 2 minor fragments running the second phase
hash-aggregate take similar amount of processing time.

So what is the problem here? During the first phase one minor fragment
takes a long time which affects the runtime of the query. Instead, if the
first phase did not do any aggregation or only used low memory (thereby
limiting the aggregations performed) then the query would have completed
faster. However the advantage of doing 2-phase aggregation is reduced
traffic on the network. But if the keys used in group by are mostly unique
then we lose this advantage as well.

I was playing with the new spillable hash-agg code and observed that
increasing memory did not improve the runtime.  This behavior can be
explained by the above reasoning.

Aggregating on mostly unique keys may not be a common use case, but any
thoughts in general about this?


Re: i need apache-drill-1.11.0.tar.gz

2018-04-19 Thread Chun Chang
I recommend you use 1.13.0. If for some reason you have to have 1.11.0, one way
to get it is to download the source and build it yourself. But I don't see why
you would want to stick with 1.11.0; a lot of improvements have been made since
then.
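
Roughly (assuming the release tag follows the usual drill-x.y.z naming):

git clone https://github.com/apache/drill.git
cd drill
git checkout drill-1.11.0
mvn clean install -DskipTests

The tarball should then show up under distribution/target.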


From: 苏斌 
Sent: Thursday, April 19, 2018 3:30:47 AM
To: dev@drill.apache.org
Subject: i need apache-drill-1.11.0.tar.gz

hi guys

drill is amazing

I want to know where I can get apache-drill-1.11.0.tar.gz ?

The download url on this page
'https://drill.apache.org/docs/apache-drill-1-11-0-release-notes/' is not
correct, it links to the 1.13.0 download page

*thanks*

*Best Regards*

*fans for drill*


Re: Building drill with "-P mapr" option fails

2014-11-26 Thread Chun Chang
A maven option:

-U,--update-snapshots  Forces a check for updated
releases and snapshots on remote

On Wed, Nov 26, 2014 at 3:25 PM, Alexander Zarei  wrote:

> Thanks Chun!
>
> Is -Pmapr connected? we used to do it like "-P mapr" and what is -U for?
>
> Thanks,
> Alex
>
> On Wed, Nov 26, 2014 at 3:00 PM, Chun Chang  wrote:
>
> > I just built it successfully with:
> >
> > mvn clean install -DskipTests -Pmapr -U
> >
> > On Wed, Nov 26, 2014 at 2:52 PM, Alexander Zarei <
> > alexanderz.si...@gmail.com
> > > wrote:
> >
> > > Hi,
> > >
> > > I just pulled the latest version of drill from Github but was not able
> > > to build it with
> > >
> > > *mvn clean install -DskipTests -P mapr*
> > >
> > > I could build it without "-P mapr" option but now I don't know if the
> > build
> > > is reliable or not as I cannot reproduce a reported bug. Any suggestion
> > is
> > > greatly appreciated.
> > >
> > > Thanks,
> > > Alex
> > >
> > > *Alexander Zarei*
> > >
> > > Computer Scientist *|* Simba Technologies Inc.
> > > +1.604.633.0008 *|* alexand...@simba.com
> > > 938 West 8th Avenue *|* Vancouver, BC *|* Canada * | *V5Z 1E5
> > > *The Big Data Connectivity Experts | *www.simba.com
> > >
> > > This email message is for the sole use of the intended recipient(s) and
> > may
> > > contain confidential and privileged information.  Any unauthorized
> > review,
> > > use, disclosure, or distribution is prohibited.  If you are not the
> > > intended recipient, please contact the sender by reply email and
> destroy
> > > all copies of the original message.  Thank you.
> > >
> >
>


Re: [VOTE] Release Apache Drill 0.7.0 (rc1)

2014-12-22 Thread Chun Chang
+1: nonbinding

+Deployed RPM to a 4-node cluster and ran various automation tests. This
release is far superior to the last release in terms of features and
quality.

On Mon, Dec 22, 2014 at 12:34 AM, Aditya  wrote:
>
> +1 (binding).
>
> + Verified that the source tarball is a snapshot of git commit
> e3ab2c1760ad34bda80141e2c3108f7eda7c9104.
> + Checksum/signature verified.
> + Built Drill from the source tarball. All unit tests pass on Windows and
> Cent-OS.
> + Launched Drill in embedded mode on Windows; added, removed storage
> plugins; ran simple queries.
> + Launched Drill in distributed mode on Windows; added, removed storage
> plugins; ran simple queries.
> + Deployed the binary tarball to a 4 node Cent-OS cluster. Configured HBase
> and MapR-DB plugins and ran queries against both.
>
> This one looks solid.
>
> On Mon, Dec 22, 2014 at 12:06 AM, Yash Sharma  wrote:
>
> > +1 : Nonbinding
> > ---
> > - Source and binaries present
> > - Source and Binaries contain README, NOTICE, LICENSE Files.
> > - Source contains INSTALL.md
> > - Verified Checksums for src and binaries (md5 & sha)
> > - Able to launch drill from binary distribution (Embedded mode).
> > - Able to build from source (mvn clean install -DskipTests)
> > - Able to fire sample queries on sqlline
> > ---
> > Env Details:
> > Apache Maven 3.2.1
> > Java version: 1.7.0_45, Oracle
> > Ubuntu 12.04.5 LTS
> > ---
> >
> > On Mon, Dec 22, 2014 at 10:54 AM, Jacques Nadeau 
> > wrote:
> >
> > > I know everyone is trying to go off to holiday vacation but I'd love to
> > > have one or two more votes on this before releasing.  Can I get some
> more
> > > feedback?
> > >
> > > Thanks,
> > > Jacques
> > >
> > > On Thu, Dec 18, 2014 at 12:06 PM, Jacques Nadeau 
> > > wrote:
> > >
> > > > Good morning,
> > > >
> > > > I would like to propose the release of Apache Drill, version 0.7.0.
> > This
> > > > is the second release candidate (zero-index rc1) and includes fixes
> > for a
> > > > few issues identified as part of the first candidate.
> > > >
> > > > This release includes 228 resolved JIRAs [1].
> > > >
> > > > The artifacts are hosted at [2].
> > > >
> > > > The vote will be open for 72 hours, ending Noon Pacific, December 21,
> > > 2014.
> > > >
> > > > [ ] +1
> > > > [ ] +0
> > > > [ ] -1
> > > >
> > > >
> > > > Thank you,
> > > > Jacques
> > > >
> > > > [1]
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12327473
> > > > [2] http://people.apache.org/~jacques/apache-drill-0.7.0.rc1/
> > > >
> > >
> >
>


Re: WHERE clause with nested JSON data

2015-04-01 Thread Chun Chang
Hi Subhajit,

Would it be possible for you to attach a sample of your dataset? I want to try
the query with your dataset.

Thanks,
Chun

On Tue, Mar 31, 2015 at 11:53 AM, Jason Altekruse 
wrote:

> The error message indicates that this is a planning bug. Please try to look
> to see if you can find an open JIRA for the issue and add any information
> about your case there. If there is not one already filed, please open a new
> one and try to provide as much explanation as you can about the data
> involved. If you can, create a minimal reproduction with an Hbase table
> definition and a few rows of data that produce the issue.
>
> Thanks for trying out Drill, welcome to the community!
>
> AssertionError: RexInputRef index 2 out of range 0..1
>
> On Tue, Mar 31, 2015 at 11:29 AM, Kristine Hahn 
> wrote:
>
> > These examples of nested data queries that use a where clause might help:
> >
> >
> >
> http://drill.apache.org/docs/json-data-model/#example:-access-a-map-field-in-an-array
> >
> >
> >
> http://drill.apache.org/docs/json-data-model/#example:-flatten-an-array-of-maps-using-a-subquery
> >
> > Kristine Hahn
> > Sr. Technical Writer
> > 415-497-8107 @krishahn
> >
> >
> > On Tue, Mar 31, 2015 at 12:38 AM, Subhajit Ghosh 
> > wrote:
> >
> > > I am facing some issues when running a SELECT query with a WHERE clause
> > on
> > > a nested value/column. Note that the query is run on a view of the
> HBase
> > > table.
> > >
> > > 0: jdbc:drill:schema:hbase:zk=localhost> select
> > > t.json.runtimeConfiguration.properties.jvmHeapUsageInit as val from
> > > IndividualTestRun_ t;
> > > ++
> > > |val |
> > > ++
> > > | 2686   |
> > > | 2539   |
> > > | 3814   |
> > > | 3525   |
> > > | 3227   |
> > > | 3486   |
> > > | 2055   |
> > > | 3191   |
> > > | 2931   |
> > > ++
> > > 9 rows selected (0.692 seconds)
> > >
> > > The SELECT (without the WHERE clause) works as expected. "properties"
> is
> > a
> > > dictionary.
> > >
> > > 0: jdbc:drill:schema:hbase:zk=localhost> select
> > > t.json.runtimeConfiguration.properties.jvmHeapUsageInit as val from
> > > IndividualTestRun_ t where
> > > t.json.runtimeConfiguration.properties.jvmHeapUsageInit>3000;
> > > Query failed: AssertionError: RexInputRef index 2 out of range 0..1
> > >
> > > Am I missing something here? Is this supported? I am on Drill 0.8
> > >
> > > Following is the stack trace:
> > >
> > > 2015-03-31 08:36:24,962 [2ae5b186-84f3-7220-c78a-d67379dd1df8:foreman]
> > INFO
> > >  o.a.d.e.s.hbase.TableStatsCalculator - Region size calculation
> disabled.
> > > 2015-03-31 08:36:25,088 [2ae5b186-84f3-7220-c78a-d67379dd1df8:foreman]
> > INFO
> > >  o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING
> > -->
> > > FAILED
> > > org.apache.drill.exec.work.foreman.ForemanException: Unexpected
> exception
> > > during fragment initialization: null
> > > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:211)
> > > [drill-java-exec-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> > > [drill-common-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> > > [na:1.7.0_71]
> > > at
> > >
> > >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> > > [na:1.7.0_71]
> > > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> > > Caused by: java.lang.reflect.UndeclaredThrowableException: null
> > > at com.sun.proxy.$Proxy61.getRowCount(Unknown Source) ~[na:na]
> > > at
> > >
> > >
> >
> org.eigenbase.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:96)
> > > ~[optiq-core-0.9-drill-r20.jar:na]
> > > at org.eigenbase.rel.SingleRel.getRows(SingleRel.java:65)
> > > ~[optiq-core-0.9-drill-r20.jar:na]
> > > at
> > >
> > >
> >
> org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier$MajorFragmentStat.add(ExcessiveExchangeIdentifier.java:99)
> > > ~[drill-java-exec-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitPrel(ExcessiveExchangeIdentifier.java:74)
> > > ~[drill-java-exec-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitPrel(ExcessiveExchangeIdentifier.java:31)
> > > ~[drill-java-exec-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> org.apache.drill.exec.planner.physical.visitor.BasePrelVisitor.visitProject(BasePrelVisitor.java:48)
> > > ~[drill-java-exec-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> org.apache.drill.exec.planner.physical.ProjectPrel.accept(ProjectPrel.java:69)
> > > ~[drill-java-exec-0.8.0-rebuffed.jar:0.8.0]
> > > at
> > >
> > >
> >
> org.apache.drill.exec.planner.physical.visitor.ExcessiveExchangeIdentifier.visitScreen(ExcessiveExchangeIdentifier.java:61)
> >

Re: Stopping Drillbit and preventing it from restarting automatically on MapR sandbox

2015-04-28 Thread Chun Chang
Run the following cmd to stop the old drillbit:

maprcli node services -name Drill -action stop -nodes `hostname -f`

On Tue, Apr 28, 2015 at 5:27 PM, Alexander Zarei  wrote:

> Hi everyone,
>
> I am wondering if you could help me stop the drillbit in MapR sandbox as it
> keeps starting again by default.
>
> I build Drill from source from github and want to start it but the default
> (older) drill keeps running avoiding the start of the new drillbit.
>
> Thanks for your time and help!
>
> Thanks,
> Alex
>


Re: Stopping Drillbit and preventing it from restarting automatically on MapR sandbox

2015-04-29 Thread Chun Chang
Actually it should be:

maprcli node services -name drill-bits -action stop -nodes `hostname -f`

Sorry about that.

On Tue, Apr 28, 2015 at 6:01 PM, Chun Chang  wrote:

> Run the following cmd to stop the old drillbit:
>
> maprcli node services -name Drill -action stop -nodes `hostname -f`
>
> On Tue, Apr 28, 2015 at 5:27 PM, Alexander Zarei <
> alexanderz.si...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> I am wondering if you could help me stop the drillbit in MapR sandbox as
>> it
>> keeps starting again by default.
>>
>> I build Drill from source from github and want to start it but the default
>> (older) drill keeps running avoiding the start of the new drillbit.
>>
>> Thanks for your time and help!
>>
>> Thanks,
>> Alex
>>
>
>


[jira] [Created] (DRILL-4298) SYSTEM ERROR: ChannelClosedException

2016-01-21 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4298:
-

 Summary: SYSTEM ERROR: ChannelClosedException
 Key: DRILL-4298
 URL: https://issues.apache.org/jira/browse/DRILL-4298
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - RPC
Affects Versions: 1.5.0
Reporter: Chun Chang


1.5.0-SNAPSHOT  2f0e3f27e630d5ac15cdaef808564e01708c3c55

Running functional regression, I hit this error; it seems random and not
associated with any particular query.

From the client side:

{noformat}
1/5  create table `existing_partition_pruning/lineitempart` partition 
by (dir0) as select * from 
dfs.`/drill/testdata/partition_pruning/dfs/lineitempart`;
Error: SYSTEM ERROR: ChannelClosedException: Channel closed 
/10.10.100.171:31010 <--> /10.10.100.171:33713.

Fragment 0:0

[Error Id: 772d90b8-c5e6-4ecc-8776-68ccc6b57d49 on drillats1.qa.lab:31010] 
(state=,code=0)
java.sql.SQLException: SYSTEM ERROR: ChannelClosedException: Channel closed 
/10.10.100.171:31010 <--> /10.10.100.171:33713.

Fragment 0:0

[Error Id: 772d90b8-c5e6-4ecc-8776-68ccc6b57d49 on drillats1.qa.lab:31010]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321)
at 
net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172)
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:62)
at 
sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1593)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:746)
at sqlline.SqlLine.runCommands(SqlLine.java:1651)
at sqlline.Commands.run(Commands.java:1304)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
at sqlline.SqlLine.dispatch(SqlLine.java:742)
at sqlline.SqlLine.initArgs(SqlLine.java:553)
at sqlline.SqlLine.begin(SqlLine.java:596)
at sqlline.SqlLine.start(SqlLine.java:375)
at sqlline.SqlLine.main(SqlLine.java:268)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: ChannelClosedException: Channel closed /10.10.100.171:31010 <--> 
/10.10.100.171:33713.

Fragment 0:0

[Error Id: 772d90b8-c5e6-4ecc-8776-68ccc6b57d49 on drillats1.qa.lab:31010]
at 
org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
at 
org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
at 
org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
at 
org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
at 
org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerCo

[jira] [Created] (DRILL-4311) Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillTableRule, args [rel#6431439:EnumerableTableScan.ENUMERABLE.ANY([]).[]

2016-01-26 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4311:
-

 Summary: Unexpected exception during fragment initialization: 
Internal error: Error while applying rule DrillTableRule, args 
[rel#6431439:EnumerableTableScan.ENUMERABLE.ANY([]).[](table=[hive, 
lineitem_text_partitioned_hive_hier_intstring])]
 Key: DRILL-4311
 URL: https://issues.apache.org/jira/browse/DRILL-4311
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.5.0
Reporter: Chun Chang


1.5.0-SNAPSHOT  3d0b4b02521f12e3871d6060c8f9bfce73b218a0

Hit the following exception while running Functional automation. It's not
specific to a query; the same query passed in other runs, so it looks random.
It also feels like the current master is less stable than it was a few days ago.

{noformat}
2016-01-26 05:22:05,991 [29588d02-6fc1-3e49-4e4b-de4cc6205538:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
29588d02-6fc1-3e49-4e4b-de4cc6205538: select l_orderkey, l_partkey, l_quantity, 
l_shipdate, l_shipinstruct from 
hive.lineitem_text_partitioned_hive_hier_intstring where `year`=1993 and 
l_orderkey > 29600 and `month`='nov'
2016-01-26 05:22:05,990 [29588d02-7206-dac7-a1dd-bb4a99fed1b9:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 85 out of 
85 using 16 threads. Time: 13ms total, 2.287035ms avg, 3ms max.
2016-01-26 05:22:05,982 [29588d01-bfc1-49db-caa3-baabb0b9ff30:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
29588d01-bfc1-49db-caa3-baabb0b9ff30: select distinct count(distinct c_row) 
from data group by c_int order by 1
2016-01-26 05:22:05,995 [29588d02-7206-dac7-a1dd-bb4a99fed1b9:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 85 out of 
85 using 16 threads. Earliest start: 400.204000 μs, Latest start: 12264.46 
μs, Average start: 5804.976765 μs .
2016-01-26 05:22:05,995 [29588d02-0b3c-0b0f-fbac-c219dd631d92:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29588d02-0b3c-0b0f-fbac-c219dd631d92:0:0: 
State to report: RUNNING
2016-01-26 05:22:05,997 [29588d02-0b3c-0b0f-fbac-c219dd631d92:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29588d02-0b3c-0b0f-fbac-c219dd631d92:0:0: 
State change requested RUNNING --> FINISHED
2016-01-26 05:22:05,997 [29588d02-0b3c-0b0f-fbac-c219dd631d92:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29588d02-0b3c-0b0f-fbac-c219dd631d92:0:0: 
State to report: FINISHED
2016-01-26 05:22:05,997 [29588d02-7206-dac7-a1dd-bb4a99fed1b9:foreman] INFO  
o.a.d.e.p.l.partition.PruneScanRule - Total pruning elapsed time: 128 ms
2016-01-26 05:22:06,016 [29588d01-51bd-c95b-a4ef-692ababd0a05:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
29588d01-51bd-c95b-a4ef-692ababd0a05: use `dfs`
2016-01-26 05:22:06,137 [29588d01-c725-8642-b99d-e902fd4e7f93:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Time: 0ms total, 0.945990ms avg, 0ms max.
2016-01-26 05:22:06,137 [29588d01-c725-8642-b99d-e902fd4e7f93:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Earliest start: 0.219000 μs, Latest start: 0.219000 μs, Average 
start: 0.219000 μs .
2016-01-26 05:22:06,138 [29588d01-bfc1-49db-caa3-baabb0b9ff30:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses
2016-01-26 05:22:06,139 [29588d01-bfc1-49db-caa3-baabb0b9ff30:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 
using 1 threads. Time: 1ms total, 1.486007ms avg, 1ms max.
2016-01-26 05:22:06,140 [29588d01-bfc1-49db-caa3-baabb0b9ff30:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 
using 1 threads. Earliest start: 0.39 μs, Latest start: 0.39 μs, 
Average start: 0.39 μs .
2016-01-26 05:22:06,140 [29588d01-bfc1-49db-caa3-baabb0b9ff30:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Took 1 ms to read file metadata
2016-01-26 05:22:06,169 [29588d01-c725-8642-b99d-e902fd4e7f93:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29588d01-c725-8642-b99d-e902fd4e7f93:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2016-01-26 05:22:06,169 [29588d01-c725-8642-b99d-e902fd4e7f93:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29588d01-c725-8642-b99d-e902fd4e7f93:0:0: 
State to report: RUNNING
2016-01-26 05:22:06,175 [29588d02-7206-dac7-a1dd-bb4a99fed1b9:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29588d02-7206-dac7-a1dd-bb4a99fed1b9:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2016-01-26 05:22:06,175 [29588d02-7206-dac7-a1dd-bb4a99fed1b9:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29588d02-7206-dac7-a1dd-bb4a99fed1b9:0:0: 
State to report: RUNNING
2016-01-26 05:22:06,2

[jira] [Created] (DRILL-4343) UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not supported See Apache Drill JIRA: DRILL-3188

2016-02-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4343:
-

 Summary: UNSUPPORTED_OPERATION ERROR: This type of window frame is 
currently not supported  See Apache Drill JIRA: DRILL-3188
 Key: DRILL-4343
 URL: https://issues.apache.org/jira/browse/DRILL-4343
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.5.0
Reporter: Chun Chang


1.5.0-SNAPSHOT  1b96174b1e5bafb13a873dd79f03467802d7c929

Running negative test cases automation:
./run.sh -s Functional -g smoke,regression,negative -n 10 -d 

Got this error. It's a random failure since it did not appear in repeated runs.
It's interesting that the error message refers to DRILL-3188, which is already
fixed.

{noformat}
[#184] Query failed: 
oadd.org.apache.drill.common.exceptions.UserRemoteException: 
UNSUPPORTED_OPERATION ERROR: This type of window frame is currently not 
supported 
See Apache Drill JIRA: DRILL-3188


[Error Id: 53ff1611-736f-4b85-9c11-421125b69711 on atsqa4-195.qa.lab:31010]
at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
at 
oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
at 
oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
at 
oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
at 
oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
at 
oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
oadd.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:744)
{noformat}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4344) oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: NullPointerException

2016-02-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4344:
-

 Summary: 
oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
NullPointerException
 Key: DRILL-4344
 URL: https://issues.apache.org/jira/browse/DRILL-4344
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - RPC
Reporter: Chun Chang


1.5.0-SNAPSHOT  1b96174b1e5bafb13a873dd79f03467802d7c929
Running negative test cases automation:
./run.sh -s Functional -g smoke,regression,negative -n 10 -d
Got this error. It's a random failure since it did not appear in repeated runs.

{noformat}
Execution Failures:
/root/drillAutomation/framework-master/framework/resources/Functional/tpcds/variants/json/q4_1.sql
Query: 
--/* q4 tpcds */

SELECT A.SS_CUSTOMER_SK,
   B.D_DATE_SK,
   B.D_YEAR,
   B.D_MOY,
   max(A.price) as price,
   max(A.cost) as cost
FROM
( SELECT
  S.SS_CUSTOMER_SK,
  S.SS_SOLD_DATE_SK,
  max(S.SS_LIST_PRICE) as price,
  max(S.SS_WHOLESALE_COST) as cost
FROM store_sales S
WHERE S.SS_QUANTITY  > 2
GROUP BY S.SS_CUSTOMER_SK,
 S.SS_SOLD_DATE_SK

) a
JOIN
  date_dim b
ON a.SS_SOLD_DATE_SK = b.D_DATE_SK
WHERE b.d_qoy = 2
  AND b.d_dow = 1
  and b.d_year IN (1990, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 
1908, 1909,
 1910, 1911, 1912, 1913, 1914, 1915, 1916, 1917, 1918, 
1919,
 1920, 1921, 1922, 1923, 1924, 1925, 1926, 1927, 1928, 
1929,
 1930, 1931, 1932, 1933, 1934, 1935, 1936, 1937, 1938, 
1939,
 1940, 1941, 1950, 1951, 1952, 1953, 1954, 1955, 1956, 
1960,
 1970, 1980, 2001, 2002, 2011, 2012, 2013, 2014)
GROUP BY A.SS_CUSTOMER_SK,
B.D_DATE_SK,
B.D_YEAR,
B.D_MOY
ORDER BY B.D_DATE_SK, A.SS_CUSTOMER_SK, B.D_YEAR, B.D_MOY
Failed with exception
java.sql.SQLException: SYSTEM ERROR: NullPointerException


[Error Id: b96a07a1-4307-41fd-be84-544b0c4176ae on atsqa4-193.qa.lab:31010]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
at 
org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1923)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:73)
at 
oadd.net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
at 
oadd.net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
at 
oadd.net.hydromatic.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:78)
at 
org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112)
at 
org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:165)
at 
org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:93)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: NullPointerException


[Error Id: b96a07a1-4307-41fd-be84-544b0c4176ae on atsqa4-193.qa.lab:31010]
at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
at 
oadd.org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
at 
oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
at 
oadd.org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
at oadd.org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:67)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:374)
at 
oadd.org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:89)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:252)
at 
oadd.org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:123)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:285)
at 
oadd.org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:257)
at 
oadd.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at 
oadd.io.netty.channel.AbstractChannelHandlerContext.invokeCha

[jira] [Created] (DRILL-4510) IllegalStateException: Failure while reading vector. Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.dril

2016-03-14 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4510:
-

 Summary: IllegalStateException: Failure while reading vector.  
Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was 
holding vector class org.apache.drill.exec.vector.NullableVarCharVector
 Key: DRILL-4510
 URL: https://issues.apache.org/jira/browse/DRILL-4510
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Reporter: Chun Chang
Priority: Critical


Hit the following regression running advanced automation. The regression
happened between commits b979bebe83d7017880b0763adcbf8eb80acfcee8 and
1f23b89623c72808f2ee866cec9b4b8a48929d68

{noformat}
Execution Failures:
/root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf100/original/query66.sql
Query: 
-- start query 66 in stream 0 using template query66.tpl 
SELECT w_warehouse_name, 
   w_warehouse_sq_ft, 
   w_city, 
   w_county, 
   w_state, 
   w_country, 
   ship_carriers, 
   year1,
   Sum(jan_sales) AS jan_sales, 
   Sum(feb_sales) AS feb_sales, 
   Sum(mar_sales) AS mar_sales, 
   Sum(apr_sales) AS apr_sales, 
   Sum(may_sales) AS may_sales, 
   Sum(jun_sales) AS jun_sales, 
   Sum(jul_sales) AS jul_sales, 
   Sum(aug_sales) AS aug_sales, 
   Sum(sep_sales) AS sep_sales, 
   Sum(oct_sales) AS oct_sales, 
   Sum(nov_sales) AS nov_sales, 
   Sum(dec_sales) AS dec_sales, 
   Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, 
   Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, 
   Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, 
   Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, 
   Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, 
   Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, 
   Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, 
   Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, 
   Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, 
   Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, 
   Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, 
   Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, 
   Sum(jan_net)   AS jan_net, 
   Sum(feb_net)   AS feb_net, 
   Sum(mar_net)   AS mar_net, 
   Sum(apr_net)   AS apr_net, 
   Sum(may_net)   AS may_net, 
   Sum(jun_net)   AS jun_net, 
   Sum(jul_net)   AS jul_net, 
   Sum(aug_net)   AS aug_net, 
   Sum(sep_net)   AS sep_net, 
   Sum(oct_net)   AS oct_net, 
   Sum(nov_net)   AS nov_net, 
   Sum(dec_net)   AS dec_net 
FROM   (SELECT w_warehouse_name, 
   w_warehouse_sq_ft, 
   w_city, 
   w_county, 
   w_state, 
   w_country, 
   'ZOUROS' 
   || ',' 
   || 'ZHOU' AS ship_carriers, 
   d_yearAS year1, 
   Sum(CASE 
 WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS jan_sales, 
   Sum(CASE 
 WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS feb_sales, 
   Sum(CASE 
 WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS mar_sales, 
   Sum(CASE 
 WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS apr_sales, 
   Sum(CASE 
 WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS may_sales, 
   Sum(CASE 
 WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS jun_sales, 
   Sum(C

[jira] [Created] (DRILL-4532) remove incorrect log message

2016-03-23 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4532:
-

 Summary: remove incorrect log message
 Key: DRILL-4532
 URL: https://issues.apache.org/jira/browse/DRILL-4532
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.6.0
Reporter: Chun Chang
Priority: Trivial


We should not log the following incorrect warning message. The message is 
logged for every connection.

"2016-03-23 17:46:42,105 [UserServer-1] WARN  
o.a.drill.exec.rpc.user.UserSession - Ignoring unknown property: zk"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4665) Partition pruning not working for hive partitioned table with 'LIKE' and '=' filter

2016-05-10 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4665:
-

 Summary: Partition pruning not working for hive partitioned table 
with 'LIKE' and '=' filter
 Key: DRILL-4665
 URL: https://issues.apache.org/jira/browse/DRILL-4665
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization, Storage - Hive
Affects Versions: 1.6.0
    Reporter: Chun Chang


This problem was initially reported by Shankar Mane 

> Problem:
>
> 1. In Drill, we are using a Hive partitioned table. But the explain plan
> (for the same query) differs between the LIKE and = operators, and all
> partitions are used in the LIKE case.
> 2. If you look at the Drill explain plans below: the LIKE operator uses *all*
> partitions, whereas the = operator uses *only* the partition filtered by the
> log_date condition.

I reproduced the reported issue. I have a partitioned hive external table with 
the following schema:

{noformat}
create external table if not exists lineitem_partitioned (
l_orderkey bigint,
l_partkey bigint,
l_suppkey bigint,
l_linenumber bigint,
l_quantity double,
l_extendedprice double,
l_discount double,
l_tax double,
l_returnflag string,
l_linestatus string,
l_shipdate string,
l_commitdate string,
l_receiptdate string,
l_shipinstruct string,
l_shipmode string,
l_comment string
)
partitioned by (year int, month int, day int)
STORED AS PARQUET
LOCATION '/drill/testdata/tpch100_dir_partitioned_5files/lineitem';

alter table lineitem_partitioned add partition(year=2015, month=1, day=1) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/1';
alter table lineitem_partitioned add partition(year=2015, month=1, day=2) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/2';
alter table lineitem_partitioned add partition(year=2015, month=1, day=3) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/3';
alter table lineitem_partitioned add partition(year=2015, month=1, day=4) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/4';
alter table lineitem_partitioned add partition(year=2015, month=1, day=5) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/5';
alter table lineitem_partitioned add partition(year=2015, month=1, day=6) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/6';
alter table lineitem_partitioned add partition(year=2015, month=1, day=7) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/7';
alter table lineitem_partitioned add partition(year=2015, month=1, day=8) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/8';
alter table lineitem_partitioned add partition(year=2015, month=1, day=9) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/9';
alter table lineitem_partitioned add partition(year=2015, month=1, day=10) 
location  
'/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/10';
{noformat}

Without 'LIKE', Drill plans correctly:

{noformat}
0: jdbc:drill:schema=dfs.drillTestDir> explain plan for select l_shipdate, 
l_linestatus, l_shipinstruct, `day` from lineitem_partitioned_db2k_2_999 where 
`day` = 2 limit 10;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(l_shipdate=[$0], l_linestatus=[$1], l_shipinstruct=[$2], 
day=[$3])
00-02SelectionVectorRemover
00-03  Limit(fetch=[10])
00-04Limit(fetch=[10])
00-05  Project(l_shipdate=[$1], l_linestatus=[$0], 
l_shipinstruct=[$2], day=[$3])
00-06Scan(groupscan=[HiveScan [table=Table(dbName:md815_db2k, 
tableName:lineitem_partitioned_db2k_2_999), columns=[`l_linestatus`, 
`l_shipdate`, `l_shipinstruct`, `day`], numPartitions=1, partitions= 
[Partition(values:[2015, 1, 2])], 
inputDirectories=[maprfs:/drill/testdata/tpch100_dir_partitioned_5files/lineitem/2015/1/2]]])
{noformat}

With 'LIKE', pruning is not happening:

{noformat}
0: jdbc:drill:schema=dfs.drillTestDir> explain plan for select l_shipdate, 
l_linestatus, l_shipinstruct, `day` from lineitem_partitioned_db2k_2_999 where 
`day` = 2 and l_shipinstruct like '%BACK%' limit 10;
+--+--+
| text | json |
+--+--+
| 00-00Screen
00-01  Project(l_shipdate=[$0], l_linestatus=[$1], l_shipinstruct=[$2], 
day=[$3])
00-02SelectionVectorRemover
00-03  Limit(fetch=[10])
00-04Limit(fetch=[10])
00-05  Project(l_shipdate=[$1], l_linestatus=[$0], 
l_shipinstruct=[$2], day=[$3])
00-06SelectionVectorRemover
00-07  Filter(condition=[AND(=($3, 2), LIKE($2, &

[jira] [Created] (DRILL-4708) connection closed unexpectedly

2016-06-03 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4708:
-

 Summary: connection closed unexpectedly
 Key: DRILL-4708
 URL: https://issues.apache.org/jira/browse/DRILL-4708
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - RPC
Affects Versions: 1.7.0
Reporter: Chun Chang


Running Drill functional automation, we often see queries fail randomly due to
the following unexpected connection close error.

{noformat}
Execution Failures:
/root/drillAutomation/framework/framework/resources/Functional/ctas/ctas_flatten/10rows/filter5.q
Query: 
select * from dfs.ctas_flatten.`filter5_10rows_ctas`
Failed with exception
java.sql.SQLException: CONNECTION ERROR: Connection /10.10.100.171:36185 <--> 
drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. 
Drillbit down?


[Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ]
at 
org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321)
at 
oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
at 
org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172)
at 
org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:210)
at 
org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:99)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: oadd.org.apache.drill.common.exceptions.UserException: CONNECTION 
ERROR: Connection /10.10.100.171:36185 <--> 
drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. 
Drillbit down?


[Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ]
at 
oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
at 
oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
at 
oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at 
oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
at 
oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
at 
oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
at 
oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
at 
oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
at 
oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
at 
oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
at 
oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at 
oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at 
oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
... 1 more
{noformat}
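
For reference, the failing step is an ordinary JDBC read-back of the CTAS result. A 
minimal stand-alone loop of the same shape (a sketch only; the connection URL is 
illustrative and this is not the framework's actual code) surfaces the error from 
ResultSet.next(), i.e. DrillCursor.nextRowInternally():

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CtasReadBack {
  public static void main(String[] args) {
    // Illustrative URL; the framework builds its own connection string.
    final String url = "jdbc:drill:zk=drillats4.qa.lab:5181";
    final String sql = "select * from dfs.ctas_flatten.`filter5_10rows_ctas`";
    try (Connection conn = DriverManager.getConnection(url);
         Statement stmt = conn.createStatement();
         ResultSet rs = stmt.executeQuery(sql)) {
      int rows = 0;
      while (rs.next()) {   // the CONNECTION ERROR above surfaces here
        rows++;
      }
      System.out.println("rows read: " + rows);
    } catch (SQLException e) {
      // Intermittently: "CONNECTION ERROR: ... closed unexpectedly. Drillbit down?"
      e.printStackTrace();
    }
  }
}
{code}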




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4767) Parquet reader throws IllegalArgumentException for int32 type with GZIP compression

2016-07-06 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4767:
-

 Summary: Parquet reader throws IllegalArgumentException for int32 type with GZIP compression
 Key: DRILL-4767
 URL: https://issues.apache.org/jira/browse/DRILL-4767
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.7.0
Reporter: Chun Chang


Created a small parquet file with the following schema:

{noformat}
[root@perfnode166 parquet-mr]# java -jar 
parquet-tools/target/parquet-tools-1.8.2-SNAPSHOT.jar schema 
/mapr/drill50.perf.lab/drill/testdata/parquet_storage/int32_10_bs10k_ps1k_gzip.parquet
message test {
  required int32 int32_field_required;
  optional int32 int32_field_optional;
  repeated int32 int32_field_repeated;
}
{noformat}

and meta

{noformat}
[root@perfnode166 parquet-mr]# java -jar 
parquet-tools/target/parquet-tools-1.8.2-SNAPSHOT.jar meta 
/mapr/drill50.perf.lab/drill/testdata/parquet_storage/int32_10_bs10k_ps1k_gzip.parquet
file: 
file:/mapr/drill50.perf.lab/drill/testdata/parquet_storage/int32_10_bs10k_ps1k_gzip.parquet
creator:  parquet-mr version 1.8.2-SNAPSHOT (build 
0cfa025d6ffeee07cb0fa2125c977185b849e5c9)
extra:writer.model.name = example

file schema:  test

int32_field_required: REQUIRED INT32 R:0 D:0
int32_field_optional: OPTIONAL INT32 R:0 D:1
int32_field_repeated: REPEATED INT32 R:1 D:1

row group 1:  RC:10 TS:147 OFFSET:4

int32_field_required:  INT32 GZIP DO:0 FPO:4 SZ:65/47/0.72 VC:10 
ENC:DELTA_BINARY_PACKED
int32_field_optional:  INT32 GZIP DO:0 FPO:69 SZ:67/49/0.73 VC:10 
ENC:DELTA_BINARY_PACKED
int32_field_repeated:  INT32 GZIP DO:0 FPO:136 SZ:69/51/0.74 VC:10 
ENC:DELTA_BINARY_PACKED
{noformat}

and dump

{noformat}
[root@perfnode166 parquet-mr]# java -jar 
parquet-tools/target/parquet-tools-1.8.2-SNAPSHOT.jar dump 
/mapr/drill50.perf.lab/drill/testdata/parquet_storage/int32_10_bs10k_ps1k_gzip.parquet
row group 0

int32_field_required:  INT32 GZIP DO:0 FPO:4 SZ:65/47/0.72 VC:10 ENC:D [more]...
int32_field_optional:  INT32 GZIP DO:0 FPO:69 SZ:67/49/0.73 VC:10 ENC: [more]...
int32_field_repeated:  INT32 GZIP DO:0 FPO:136 SZ:69/51/0.74 VC:10 ENC [more]...

int32_field_required TV=10 RL=0 DL=0

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 0, max:  
[more]... VC:10

int32_field_optional TV=10 RL=0 DL=1

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 1, max:  
[more]... VC:10

int32_field_repeated TV=10 RL=1 DL=1

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 2, max:  
[more]... VC:10

INT32 int32_field_required

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:0 V:0
value 2:  R:0 D:0 V:3
value 3:  R:0 D:0 V:6
value 4:  R:0 D:0 V:9
value 5:  R:0 D:0 V:12
value 6:  R:0 D:0 V:15
value 7:  R:0 D:0 V:18
value 8:  R:0 D:0 V:21
value 9:  R:0 D:0 V:24
value 10: R:0 D:0 V:27

INT32 int32_field_optional

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:1 V:1
value 2:  R:0 D:1 V:4
value 3:  R:0 D:1 V:7
value 4:  R:0 D:1 V:10
value 5:  R:0 D:1 V:13
value 6:  R:0 D:1 V:16
value 7:  R:0 D:1 V:19
value 8:  R:0 D:1 V:22
value 9:  R:0 D:1 V:25
value 10: R:0 D:1 V:28

INT32 int32_field_repeated

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:1 V:2
value 2:  R:0 D:1 V:5
value 3:  R:0 D:1 V:8
value 4:  R:0 D:1 V:11
value 5:  R:0 D:1 V:14
value 6:  R:0 D:1 V:17
value 7:  R:0 D:1 V:20
value 8:  R:0 D:1 V:23
value 9:  R:0 D:1 V:26
value 10: R:0 D:1 V:29
{noformat}
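
For context, a file with this schema and these values can be produced with parquet-mr's 
example writer API. The sketch below is an assumption about how the test file could have 
been generated (writer settings inferred from the file name, bs10k/ps1k/gzip, and from the 
DELTA_BINARY_PACKED encoding shown in the metadata), not the exact tool invocation used:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.ParquetProperties;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroupFactory;
import org.apache.parquet.hadoop.ParquetWriter;
import org.apache.parquet.hadoop.example.ExampleParquetWriter;
import org.apache.parquet.hadoop.metadata.CompressionCodecName;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class WriteInt32Gzip {
  public static void main(String[] args) throws Exception {
    MessageType schema = MessageTypeParser.parseMessageType(
        "message test { "
            + "required int32 int32_field_required; "
            + "optional int32 int32_field_optional; "
            + "repeated int32 int32_field_repeated; }");
    SimpleGroupFactory groups = new SimpleGroupFactory(schema);
    try (ParquetWriter<Group> writer = ExampleParquetWriter
        .builder(new Path("int32_10_bs10k_ps1k_gzip.parquet"))
        .withConf(new Configuration())
        .withType(schema)
        .withCompressionCodec(CompressionCodecName.GZIP)
        .withRowGroupSize(10 * 1024)   // "bs10k" block (row group) size
        .withPageSize(1024)            // "ps1k" page size
        // v2 writer: int32 columns are typically written with DELTA_BINARY_PACKED
        .withWriterVersion(ParquetProperties.WriterVersion.PARQUET_2_0)
        .build()) {
      for (int i = 0; i < 10; i++) {
        Group g = groups.newGroup()
            .append("int32_field_required", 3 * i)
            .append("int32_field_optional", 3 * i + 1)
            .append("int32_field_repeated", 3 * i + 2);
        writer.write(g);
      }
    }
  }
}
{code}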

But querying the file through Drill, I got the following error:

{noformat}
0: jdbc:drill:schema=dfs.drillTestDir> select * from 
dfs.`drill/testdata/parquet_storage/int32_10_bs10k_ps1k_gzip.parquet`;
Error: SYSTEM ERROR: IllegalArgumentException

Fragment 0:0

[Error Id: d91ec9fe-0ce3-4d05-9e5b-d53cebb99726 on 10.10.30.169:31010] 
(state=,code=0)

0: jdbc:drill:schema=dfs.drillTestDir> select * from sys.version;
+-+---+---++-++
| version | com

[jira] [Created] (DRILL-4769) foreman spins when querying int32 data with snappy compression

2016-07-06 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4769:
-

 Summary: foreman spins when querying int32 data with snappy compression
 Key: DRILL-4769
 URL: https://issues.apache.org/jira/browse/DRILL-4769
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.7.0
Reporter: Chun Chang


Similar data structure to DRILL-4767, but with SNAPPY compression; running the same 
query, the Drill foreman just spins.

{noformat}
[root@perfnode166 parquet-mr]# java -jar 
parquet-tools/target/parquet-tools-1.8.2-SNAPSHOT.jar dump 
/mapr/drill50.perf.lab/drill/testdata/parquet_storage/int32_10_bs10k_ps1k_snappy.parquet
row group 0

int32_field_required:  INT32 SNAPPY DO:0 FPO:4 SZ:49/47/0.96 VC:10 ENC [more]...
int32_field_optional:  INT32 SNAPPY DO:0 FPO:53 SZ:51/49/0.96 VC:10 EN [more]...
int32_field_repeated:  INT32 SNAPPY DO:0 FPO:104 SZ:53/51/0.96 VC:10 E [more]...

int32_field_required TV=10 RL=0 DL=0

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 0, max:  
[more]... VC:10

int32_field_optional TV=10 RL=0 DL=1

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 1, max:  
[more]... VC:10

int32_field_repeated TV=10 RL=1 DL=1

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 2, max:  
[more]... VC:10

INT32 int32_field_required

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:0 V:0
value 2:  R:0 D:0 V:3
value 3:  R:0 D:0 V:6
value 4:  R:0 D:0 V:9
value 5:  R:0 D:0 V:12
value 6:  R:0 D:0 V:15
value 7:  R:0 D:0 V:18
value 8:  R:0 D:0 V:21
value 9:  R:0 D:0 V:24
value 10: R:0 D:0 V:27

INT32 int32_field_optional

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:1 V:1
value 2:  R:0 D:1 V:4
value 3:  R:0 D:1 V:7
value 4:  R:0 D:1 V:10
value 5:  R:0 D:1 V:13
value 6:  R:0 D:1 V:16
value 7:  R:0 D:1 V:19
value 8:  R:0 D:1 V:22
value 9:  R:0 D:1 V:25
value 10: R:0 D:1 V:28

INT32 int32_field_repeated

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:1 V:2
value 2:  R:0 D:1 V:5
value 3:  R:0 D:1 V:8
value 4:  R:0 D:1 V:11
value 5:  R:0 D:1 V:14
value 6:  R:0 D:1 V:17
value 7:  R:0 D:1 V:20
value 8:  R:0 D:1 V:23
value 9:  R:0 D:1 V:26
value 10: R:0 D:1 V:29
{noformat}

Here is the drillbit thread dump:

{noformat}
[root@perfnode169 ~]# jstack -l 7355
2016-07-06 17:25:56
Full thread dump OpenJDK 64-Bit Server VM (24.79-b02 mixed mode):

"Attach Listener" daemon prio=10 tid=0x7fbae45a0800 nid=0x2a52 waiting on 
condition [0x]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
- None

"qtp239614979-176" prio=10 tid=0x016bd000 nid=0x2329 waiting on 
condition [0x7fbab749a000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x0006f7700410> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at 
org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:389)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.idleJobPoll(QueuedThreadPool.java:513)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.access$700(QueuedThreadPool.java:48)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:569)
at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
- None

"qtp239614979-174" prio=10 tid=0x01b3c800 nid=0x2327 waiting on 
condition [0x7fbabd7e5000]
   java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for  <0x0006f7700410> (a 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at 
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
at 
org.eclipse.jetty.util.BlockingArrayQueue.poll(BlockingArrayQueue.java:389)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.idleJob

[jira] [Created] (DRILL-4770) ParquetRecordReader throws NPE querying a single int64 column file

2016-07-08 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4770:
-

 Summary: ParquetRecordReader throws NPE querying a single int64 
column file
 Key: DRILL-4770
 URL: https://issues.apache.org/jira/browse/DRILL-4770
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.8.0
Reporter: Chun Chang


I have a parquet file with a single int64 column. 

{noformat}
[root@perfnode166 parquet-mr]# java -jar 
parquet-tools/target/parquet-tools-1.8.2-SNAPSHOT.jar dump 
/mapr/drill50.perf.lab/drill/testdata/parquet_storage/int64_10_bs10k_ps1k_uncompressed.parquet
row group 0

int64_field_required:  INT64 UNCOMPRESSED DO:0 FPO:4 SZ:55/55/1.00 VC:10 
[more]...

int64_field_required TV=10 RL=0 DL=0

page 0:  DLE:RLE RLE:RLE VLE:DELTA_BINARY_PACKED ST:[min: 0, max:  
[more]... VC:10

INT64 int64_field_required

*** row group 1 of 1, values 1 to 10 ***
value 1:  R:0 D:0 V:0
value 2:  R:0 D:0 V:1
value 3:  R:0 D:0 V:2
value 4:  R:0 D:0 V:3
value 5:  R:0 D:0 V:4
value 6:  R:0 D:0 V:5
value 7:  R:0 D:0 V:6
value 8:  R:0 D:0 V:7
value 9:  R:0 D:0 V:8
value 10: R:0 D:0 V:9
{noformat}

Drill version:

{noformat}
0: jdbc:drill:schema=dfs.drillTestDir> select * from sys.version;
+-+---+-++-++
| version | commit_id | 
commit_message  
|commit_time | build_email | 
build_time |
+-+---+-++-++
| 1.8.0-SNAPSHOT  | 05c42eae79ce3e309028b3824f9449b98e329f29  | DRILL-4707: Fix 
memory leak or incorrect query result in case two column names are 
case-insensitive identical.  | 29.06.2016 @ 08:15:13 PDT  | inram...@gmail.com  
| 07.07.2016 @ 10:50:40 PDT  |
+-+---+-++-++
1 row selected (0.44 seconds)
{noformat}


Drill throws an NPE:

{noformat}
2016-07-08 11:08:55,156 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
288013c7-f122-f6be-936e-c18ebe9b92ef: select * from 
dfs.`drill/testdata/parquet_storage/int64_10_bs10k_ps1k_uncompressed.parquet`
2016-07-08 11:08:55,292 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses
2016-07-08 11:08:55,295 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 
using 1 threads. Time: 2ms total, 2.423069ms avg, 2ms max.
2016-07-08 11:08:55,295 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of 1 
using 1 threads. Earliest start: 1.347000 μs, Latest start: 1.347000 μs, 
Average start: 1.347000 μs .
2016-07-08 11:08:55,295 [288013c7-f122-f6be-936e-c18ebe9b92ef:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - Took 2 ms to read file metadata
2016-07-08 11:08:55,377 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2016-07-08 11:08:55,377 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: 
State to report: RUNNING
2016-07-08 11:08:55,386 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: 
State change requested RUNNING --> FAILED
2016-07-08 11:08:55,386 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 288013c7-f122-f6be-936e-c18ebe9b92ef:0:0: 
State change requested FAILED --> FINISHED
2016-07-08 11:08:55,387 [288013c7-f122-f6be-936e-c18ebe9b92ef:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException

Fragment 0:0

[Error Id: 21fcc35b-6151-46b6-a750-0ce6f2141a7d on 

[jira] [Created] (DRILL-4871) random JsonParseException: Illegal character ((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between tokens

2016-08-31 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4871:
-

 Summary: random JsonParseException: Illegal character ((CTRL-CHAR, 
code 0)): only regular white space (\r, \n, \t) is allowed between tokens
 Key: DRILL-4871
 URL: https://issues.apache.org/jira/browse/DRILL-4871
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.8.0
Reporter: Chun Chang


Running the functional regression suite, I hit this error occasionally. I will provide 
more info if I hit it again.

{noformat}
/root/drillAutomation/framework-master/framework/resources/Functional/metadata_caching/partition_pruning/plan/3level_sanity4.q
Query: 
explain plan for select l_quantity from l_3level where dir0=1 and ((dir1='one' 
and dir2 IN ('2015-7-12', '2015-7-13')) or (dir1='two' and dir2='2015-8-12')) 
and l_discount=0.07 order by l_orderkey limit 10
Failed with exception
java.sql.SQLException: SYSTEM ERROR: JsonParseException: Illegal character 
((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between 
tokens
 at [Source: com.mapr.fs.MapRFsDataInputStream@4f1a197d; line: 1, column: 2]


[Error Id: fae9e81d-a7d6-459d-bbd2-33d008bd8c56 on atsqa6c85.qa.lab:31010]

  (org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception 
during fragment initialization: Internal error: Error while applying rule 
DrillTableRule, args 
[rel#951416:DirPrunedEnumerableTableScan.ENUMERABLE.ANY([]).[](table=[],selection=root=maprfs:/drill/testdata/metadata_caching_pp/l_3levelfiles=[maprfs:///drill/testdata/metadata_caching_pp/l_3level/1/two/2015-8-12,maprfs:///drill/testdata/metadata_caching_pp/l_3level/1/one/2015-7-13,maprfs:///drill/testdata/metadata_caching_pp/l_3level/1/one/2015-7-12])]
org.apache.drill.exec.work.foreman.Foreman.run():281
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():744
  Caused By (java.lang.AssertionError) Internal error: Error while applying 
rule DrillTableRule, args 
[rel#951416:DirPrunedEnumerableTableScan.ENUMERABLE.ANY([]).[](table=[],selection=root=maprfs:/drill/testdata/metadata_caching_pp/l_3levelfiles=[maprfs:///drill/testdata/metadata_caching_pp/l_3level/1/two/2015-8-12,maprfs:///drill/testdata/metadata_caching_pp/l_3level/1/one/2015-7-13,maprfs:///drill/testdata/metadata_caching_pp/l_3level/1/one/2015-7-12])]
org.apache.calcite.util.Util.newInternal():792
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():251
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():808
org.apache.calcite.tools.Programs$RuleSetProgram.run():303
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():404
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():343

org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel():240

org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel():290
org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan():61
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():94
org.apache.drill.exec.work.foreman.Foreman.runSQL():1008
org.apache.drill.exec.work.foreman.Foreman.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():744
  Caused By (org.apache.drill.common.exceptions.DrillRuntimeException) Failure 
creating scan.
org.apache.drill.exec.planner.logical.DrillScanRel.():91
org.apache.drill.exec.planner.logical.DrillScanRel.():69
org.apache.drill.exec.planner.logical.DrillScanRel.():62
org.apache.drill.exec.planner.logical.DrillScanRule.onMatch():37
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch():228
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp():808
org.apache.calcite.tools.Programs$RuleSetProgram.run():303
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():404
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform():343

org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel():240

org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel():290
org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan():61
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():94
org.apache.drill.exec.work.foreman.Foreman.runSQL():1008
org.apache.drill.exec.work.foreman.Foreman.run():264
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():744
  Caused By (com.fasterxml.jackson.core.JsonParseException) Illegal character 
((CTRL-CHAR, code 0)): only regular white space (\r, \n, \t) is allowed between 
tokens
 at [

[jira] [Created] (DRILL-3146) On Suse platform, sqlline at start prints out some info on screen instead of into log file

2015-05-18 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3146:
-

 Summary: On Suse platform, sqlline at start prints out some info 
on screen instead of into log file
 Key: DRILL-3146
 URL: https://issues.apache.org/jira/browse/DRILL-3146
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)
Priority: Trivial


On SUSE, at startup, sqlline prints the following to the screen:

{code}
mfs47:~ # /opt/mapr/drill/drill-1.0.0/bin/sqlline  --maxWidth=1 -n qa1 -p 
mapr -u jdbc:drill:schema=dfs.root;zk=10.10.10.47:5181
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: backward-delete-word
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
[INFO] Unable to bind key for unsupported operation: up-history
[INFO] Unable to bind key for unsupported operation: down-history
apache drill 1.0.0
"just drill it"
0: jdbc:drill:schema=dfs.root> select * from sys.version;
+---+-++--++
| commit_id |   commit_message  
  |commit_time | build_email  | build_time |
+---+-++--++
| 74c0a13b0a501e757b0ea922c5bae9b0a4ace319  | Update sqlline version to 
drill-r5  | 18.05.2015 @ 20:01:28 EDT  | Unknown  | 18.05.2015 @ 20:08:25 
EDT  |
+---+-++--++
1 row selected (0.265 seconds)
0: jdbc:drill:schema=dfs.root>
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3469) better log message when making storage plugin configuration change

2015-07-07 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3469:
-

 Summary: better log message when making storage plugin 
configuration change
 Key: DRILL-3469
 URL: https://issues.apache.org/jira/browse/DRILL-3469
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Affects Versions: 1.1.0
Reporter: Chun Chang
Assignee: Jacques Nadeau


We barely log anything when we make storage plugin configuration changes, even 
at debug level. This is the only thing I see in drillbit.log:

{code}
perfnode166: 2015-07-07 14:19:13,536 [qtp1911262739-87] DEBUG 
o.a.drill.common.util.PathScanner - Classpath scanning took 0ms
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3066) AtomicRemainder - Tried to close remainder, but it has already been closed.

2015-07-17 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-3066.
---
Resolution: Fixed

> AtomicRemainder - Tried to close remainder, but it has already been closed.
> ---
>
> Key: DRILL-3066
> URL: https://issues.apache.org/jira/browse/DRILL-3066
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.0.0
> Environment: 21cc578b6b8c8f3ca1ebffd3dbb92e35d68bc726 
>Reporter: Khurram Faraaz
>Assignee: Khurram Faraaz
>Priority: Minor
> Fix For: 1.0.0
>
>
> I see the below stack trace in drillbit.log when I try query a corrupt 
> parquet file. Test was run on 4 node cluster on CentOS.
> AtomicRemainder - Tried to close remainder, but it has already been closed.
> {code}
> 2015-05-13 20:42:58,893 [2aac48ac-82d3-0f5a-2bac-537e82b3ac02:frag:0:0] WARN  
> o.a.d.exec.memory.AtomicRemainder - Tried to close remainder, but it has 
> already been closed
> java.lang.Exception: null
> at 
> org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:196) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at org.apache.drill.exec.memory.Accountor.close(Accountor.java:386) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.close(TopLevelAllocator.java:310)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:405)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:399) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources(FragmentExecutor.java:312)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cancel(FragmentExecutor.java:135)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.QueryManager.cancelExecutingFragments(QueryManager.java:202)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:836)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.processEvent(Foreman.java:780)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.common.EventProcessor.sendEvent(EventProcessor.java:73) 
> [drill-common-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateSwitch.moveToState(Foreman.java:782)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:891) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.access$2700(Foreman.java:107) 
> [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$StateListener.moveToState(Foreman.java:1161)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.QueryManager$1.statusUpdate(QueryManager.java:481)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:461)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:90)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:86)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:291)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:255)
>  [drill-java-exec-1.0.0-SNAPSHOT-rebuffed.jar:1.0.0-SNAPSHOT]
> at 
> o

[jira] [Resolved] (DRILL-2275) need implementations of sys tables for drill memory and threads profiles

2015-07-17 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-2275.
---
Resolution: Fixed

> need implementations of sys tables for drill memory and threads profiles
> 
>
> Key: DRILL-2275
> URL: https://issues.apache.org/jira/browse/DRILL-2275
> Project: Apache Drill
>  Issue Type: Task
>  Components: Metadata
>Reporter: Zhiyong Liu
>    Assignee: Chun Chang
>Priority: Critical
> Fix For: 0.9.0
>
> Attachments: DRILL-2275.1.patch.txt, DRILL-2275.2.patch.txt, 
> DRILL-2275.3.patch.txt, DRILL-2275.4.patch.txt, DRILL-2275.5.patch.txt, 
> DRILL-2275.6.patch.txt, DRILL-2275.7.patch.txt
>
>
> In order to check drill state information, the following tables are to be 
> implemented:
> 1. Memory: a query such as
> select * from sys.drillmemory;
> should return a result set like the following:
> {code}
> +-------------+------------------+-----------+---------------------+
> | drillbit    | total_sys_memory | heap_size | direct_alloc_memory |
> +-------------+------------------+-----------+---------------------+
> | node1:port1 | 24596676k        | 15200420k | 1012372k            |
> | node2:port2 | 24596676k        | 15200420k | 2012372k            |
> +-------------+------------------+-----------+---------------------+
> {code}
> 2. Threads:
> For each node in a cluster, we need counts of threads of the drillbits.  A 
> query like this:
> select * from sys.drillbitthreads;
> should return a result set like the following:
> {code}
> +-------------+-----------+---------------+--------------+
> | drillbit    | pool_name | total_threads | busy_threads |
> +-------------+-----------+---------------+--------------+
> | node1:port1 | pool1     | 8             | 2            |
> | node2:port2 | pool2     | 10            | 5            |
> +-------------+-----------+---------------+--------------+
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2697) Pause injections should pause indefinitely until signalled

2015-07-17 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-2697.
---
Resolution: Fixed

> Pause injections should pause indefinitely until signalled
> --
>
> Key: DRILL-2697
> URL: https://issues.apache.org/jira/browse/DRILL-2697
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Flow
>Affects Versions: 0.9.0
>Reporter: Sudheesh Katkam
>Assignee: Chun Chang
> Fix For: 1.0.0
>
> Attachments: DRILL-2697.1.patch.txt, DRILL-2697.5.patch.txt
>
>
> Currently, injected pauses make threads sleep for a specified time. This can 
> be enhanced to stop the thread indefinitely using a CountDownLatch. It is 
> quite similar to how cancellation works.
> Tasks: 
> (a) Add another message to RPC layer to signal paused remote threads to 
> resume (through ControlHandler) by counting down. Complications if the thread 
> has not reached the pause site yet.
> (b) Add resume signal (like ctrl-c) to sqlline 
> (further enhancement: another signal to trigger pause from sqlline)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2722) Query profile data not being sent/received (and web UI not updated)

2015-07-17 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-2722.
---
Resolution: Fixed

> Query profile data not being sent/received (and web UI not updated)
> ---
>
> Key: DRILL-2722
> URL: https://issues.apache.org/jira/browse/DRILL-2722
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP, Execution - Flow, Execution - RPC
>Affects Versions: 0.8.0
>Reporter: Chris Westin
>Assignee: Victoria Markman
> Fix For: 1.2.0
>
> Attachments: query1_foreman.log
>
>
> [~amansinha100] has a test case that shows that profile information is not 
> being received (or not being sent, I'm not sure which) for a long-running 
> query. The query appears to stop, even though cycles are being used, and it 
> looks like work is being done. This is becoming a problem for monitoring 
> query progress. We need to find out why the profile information isn't being 
> updated, and fix that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2793) Killing a non foreman node results in direct memory being held on

2015-07-17 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-2793.
---
Resolution: Fixed

> Killing a non foreman node results in direct memory being held on
> -
>
> Key: DRILL-2793
> URL: https://issues.apache.org/jira/browse/DRILL-2793
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Ramana Inukonda Nagaraj
>Assignee: Victoria Markman
> Fix For: 1.0.0
>
>
> Similar to DRILL-2792
> Happens for non foreman nodes as well.
> before:
> {code}
> 1/6  select * from sys.memory;
> +++--++++
> |  hostname  | user_port  | heap_current |  heap_max  | direct_current | 
> direct_max |
> +++--++++
> | atsqa6c62.qa.lab | 31010  | 72738752 | 4151836672 | 6048576
> | 34359738368 |
> | atsqa6c59.qa.lab | 31010  | 177100136| 4151836672 | 200
> | 34359738368 |
> | atsqa6c60.qa.lab | 31010  | 183895944| 4151836672 | 200
> | 34359738368 |
> | atsqa6c57.qa.lab | 31010  | 312198128| 4151836672 | 200
> | 34359738368 |
> | atsqa6c58.qa.lab | 31010  | 180850296| 4151836672 | 200
> | 34359738368 |
> | atsqa6c61.qa.lab | 31010  | 307163256| 4151836672 | 200
> | 34359738368 |
> | atsqa6c64.qa.lab | 31010  | 178883744| 4151836672 | 200
> | 34359738368 |
> | atsqa6c63.qa.lab | 31010  | 326312736| 4151836672 | 200
> | 34359738368 |
> +++--++++
> 8 rows selected (2.209 seconds)
> {code}
> After -cancellation of- killing non foreman node
> {code}
> 0: jdbc:drill:> select * from sys.memory;
> +++--++++
> |  hostname  | user_port  | heap_current |  heap_max  | direct_current | 
> direct_max |
> +++--++++
> | atsqa6c62.qa.lab | 31010  | 395684912| 4151836672 | 1745146306 
> | 34359738368 |
> | atsqa6c57.qa.lab | 31010  | 416717016| 4151836672 | 1751348355 
> | 34359738368 |
> | atsqa6c58.qa.lab | 31010  | 365235768| 4151836672 | 1713761930 
> | 34359738368 |
> | atsqa6c59.qa.lab | 31010  | 409859856| 4151836672 | 1763119827 
> | 34359738368 |
> | atsqa6c60.qa.lab | 31010  | 369571576| 4151836672 | 1759217229 
> | 34359738368 |
> | atsqa6c63.qa.lab | 31010  | 469310224| 4151836672 | 1725239747 
> | 34359738368 |
> | atsqa6c64.qa.lab | 31010  | 471814416| 4151836672 | 1735044144 
> | 34359738368 |
> +++--++++
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2867) Session level parameter drill.exec.testing.controls appears to be set even though it was not

2015-07-17 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-2867.
---
Resolution: Fixed

> Session level parameter drill.exec.testing.controls  appears to be set even 
> though it was not
> -
>
> Key: DRILL-2867
> URL: https://issues.apache.org/jira/browse/DRILL-2867
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 0.9.0
>Reporter: Victoria Markman
>Assignee: Victoria Markman
> Fix For: 1.2.0
>
>
> {code}
> 0: jdbc:drill:schema=dfs> select * from sys.options where type like 
> '%SESSION%';
> +-----------------------------+--------+---------+---------+------------+----------+-----------+
> | name                        | kind   | type    | num_val | string_val | bool_val | float_val |
> +-----------------------------+--------+---------+---------+------------+----------+-----------+
> | drill.exec.testing.controls | STRING | SESSION | null    | {}         | null     | null      |
> +-----------------------------+--------+---------+---------+------------+----------+-----------+
> 1 row selected (0.218 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3640) Drill JDBC driver support Statement.setQueryTimeout(int)

2015-08-13 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3640:
-

 Summary: Drill JDBC driver support Statement.setQueryTimeout(int)
 Key: DRILL-3640
 URL: https://issues.apache.org/jira/browse/DRILL-3640
 Project: Apache Drill
  Issue Type: New Feature
  Components: Client - JDBC
Affects Versions: 1.2.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)


It would be nice to have this implemented. Runaway queries could then be canceled 
automatically by setting a timeout.

java.sql.SQLFeatureNotSupportedException: Setting network timeout is not 
supported.
at 
org.apache.drill.jdbc.impl.DrillStatementImpl.setQueryTimeout(DrillStatementImpl.java:152)
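
For illustration, a minimal sketch of what client code would look like once the standard 
JDBC timeout is supported (the connection URL and timeout value are assumptions, not 
taken from this report); today the setQueryTimeout() call throws the exception above:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QueryTimeoutExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:drill:zk=localhost:2181");
         Statement stmt = conn.createStatement()) {
      // Desired behavior: cancel the query automatically after 60 seconds.
      stmt.setQueryTimeout(60);
      try (ResultSet rs = stmt.executeQuery("select * from sys.version")) {
        while (rs.next()) {
          System.out.println(rs.getString(1));
        }
      }
    }
  }
}
{code}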



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3753) better error message for dropping hbase/hive table

2015-09-09 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3753:
-

 Summary: better error message for dropping hbase/hive table
 Key: DRILL-3753
 URL: https://issues.apache.org/jira/browse/DRILL-3753
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.2.0
Reporter: Chun Chang
Assignee: Mehant Baid
Priority: Minor


commit_id: 0686bc23e8fbbd14fd3bf852893449ef8552439d

Drill cannot drop HBase/Hive tables yet, but the error message should be improved:

{code}
[#2] Query failed:
org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: Unable to 
create or drop tables/views. Schema [hbase] is immutable.
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3795) TextReader can't read a .tsv file containing multiple double quotes

2015-09-16 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3795:
-

 Summary: TextReader can't read a .tsv file containing multiple double quotes
 Key: DRILL-3795
 URL: https://issues.apache.org/jira/browse/DRILL-3795
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Text & CSV
Affects Versions: 1.2.0
Reporter: Chun Chang
Assignee: Steven Phillips


commit_id: 69c73af54ac3d15b8e7c21e8a3c35b4a62ebc844

I have a simple tab-delimited file that contains multiple double-quoted strings:

{noformat}
another no quote""anotherwith quote""
""another with double quotes""  no quotes
{noformat}

This causes the following error:

{noformat}
0: jdbc:drill:schema=dfs.drillTestDirDropTabl> select columns[0], columns[1] 
from dfs.tmp.`drill-3718.tsv`;
Error: SYSTEM ERROR: TextParsingException: Error processing input: Cannot use 
newline character within quoted string, line=2, char=61. Content parsed: [ ]

Fragment 0:0

[Error Id: c631eccc-038c-4d61-bda8-e7037c3677e8 on 10.10.30.166:31010] 
(state=,code=0)
{noformat}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3900) OOM with Hive native scan enabled on TPCH-100 parquet, query 05.q

2015-10-06 Thread Chun Chang (JIRA)
Chun Chang created DRILL-3900:
-

 Summary: OOM with Hive native scan enabled on TPCH-100 parquet, 
query 05.q
 Key: DRILL-3900
 URL: https://issues.apache.org/jira/browse/DRILL-3900
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Hive
Affects Versions: 1.2.0
Reporter: Chun Chang


TPCH-100 parquet dataset. Configure Hive 1.0 pointing to the parquet files as 
external tables. Enable Hive native scan.

{noformat}
alter system set `store.hive.optimize_scan_with_native_readers`=true;
{noformat}

Running TPCH query 05 through Hive, the drillbit runs out of memory, while the same 
query through dfs completes successfully. (With Hive native scan disabled, Drill also 
runs out of memory through Hive.)

We expect with hive native scan turned on, query should finish.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4113) memory leak reported while handling query or shutting down

2015-11-18 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4113:
-

 Summary: memory leak reported while handling query or shutting down
 Key: DRILL-4113
 URL: https://issues.apache.org/jira/browse/DRILL-4113
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.3.0, 1.4.0
Reporter: Chun Chang
Priority: Critical


With impersonation enabled, I've seen two memory leaks: one reported at query time and 
one at shutdown.

At query time:

{noformat}
2015-11-17 19:11:03,595 [29b413b7-958e-c1f3-9d37-c34f96e7bf6a:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
29b413b7-958e-c1f3-9d37-c34f96e7bf6a: use `dfs.window_functions`
2015-11-17 19:11:03,666 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2015-11-17 19:11:03,666 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: 
State to report: RUNNING
2015-11-17 19:11:03,669 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: 
State change requested RUNNING --> FAILED
2015-11-17 19:11:03,669 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: 
State change requested FAILED --> FAILED
2015-11-17 19:11:03,669 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 29b413b7-edbc-9722-120d-66ab3611f250:0:0: 
State change requested FAILED --> FINISHED
2015-11-17 19:11:03,674 [29b413b7-edbc-9722-120d-66ab3611f250:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: 
Failure while closing accountor.  Expected private and shared pools to be set 
to initial values.  However, one or more were not.  Stats are
zoneinitallocated   delta
private 100 738112  261888
shared  00  261888  -261888.

Fragment 0:0

[Error Id: 6df67be9-69d4-4a3b-9eae-43ab2404c6d3 on drillats1.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
IllegalStateException: Failure while closing accountor.  Expected private and 
shared pools to be set to initial values.  However, one or more were not.  
Stats are
zoneinitallocated   delta
private 100 738112  261888
shared  00  261888  -261888.

Fragment 0:0

[Error Id: 6df67be9-69d4-4a3b-9eae-43ab2404c6d3 on drillats1.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
 ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290)
 [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.lang.IllegalStateException: Failure while closing accountor.  
Expected private and shared pools to be set to initial values.  However, one or 
more were not.  Stats are
zoneinitallocated   delta
private 100 738112  261888
shared  00  261888  -261888.
at 
org.apache.drill.exec.memory.AtomicRemainder.close(AtomicRemainder.java:199) 
~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.memory.AccountorImpl.close(AccountorImpl.java:365) 
~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.close(TopLevelAllocator.java:326)
 ~[drill-memory-impl-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.OperatorContextImpl.close(OperatorContextImpl.java:123)
 ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext.suppressingClose(FragmentContext.java:437)
 ~[drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT]
at 
org.apache.drill.exec.ops.FragmentContext.close(FragmentContext.java:426) 
~[drill-java-exec-1.4.0-SNAPSHOT.j

[jira] [Created] (DRILL-4189) Support validation against parquet schemas when reading parquet files

2015-12-11 Thread Chun Chang (JIRA)
Chun Chang created DRILL-4189:
-

 Summary: Support validation against parquet schemas when reading 
parquet files
 Key: DRILL-4189
 URL: https://issues.apache.org/jira/browse/DRILL-4189
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.5.0
Reporter: Chun Chang


For schema-based file formats such as Parquet, Drill should validate against the schema 
and fail fast if validation fails. Refer to DRILL-3810 for background info.
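
As an illustration of the intended fail-fast behavior, the check amounts to comparing 
the file's footer schema against the expected schema before reading any data. The sketch 
below uses parquet-mr directly and is not Drill's reader code; the expected schema and 
the file path argument are hypothetical:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.schema.MessageType;
import org.apache.parquet.schema.MessageTypeParser;

public class FailFastSchemaCheck {
  public static void main(String[] args) throws Exception {
    // Hypothetical expected schema; a real check would take it from the query context.
    MessageType expected = MessageTypeParser.parseMessageType(
        "message test { required int64 int64_field_required; }");
    // Read only the footer metadata of the file passed as the first argument.
    MessageType actual = ParquetFileReader
        .readFooter(new Configuration(), new Path(args[0]))
        .getFileMetaData().getSchema();
    if (!expected.equals(actual)) {
      // Fail fast with a clear message instead of failing deep inside the reader.
      throw new IllegalStateException(
          "Parquet schema mismatch.\nExpected:\n" + expected + "\nFound:\n" + actual);
    }
  }
}
{code}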



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-5150) JDBC connections cause drillbit leaks resources and eventually JVM crashes

2016-12-21 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5150:
-

 Summary: JDBC connections cause drillbit leaks resources and 
eventually JVM crashes
 Key: DRILL-5150
 URL: https://issues.apache.org/jira/browse/DRILL-5150
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Affects Versions: 1.9.0
Reporter: Chun Chang


Stress testing JDBC connections by repeatedly connecting and disconnecting makes the 
drillbit crash very quickly due to resource leaks. This was observed with the Apache 
Drill JDBC driver; testing with a third-party driver did not cause the crash. I will 
upload the JVM dump.
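
The stress pattern is simply open-then-close in a loop. A minimal sketch (the URL and 
iteration count are assumptions, not taken from the actual test) looks like this:

{code}
import java.sql.Connection;
import java.sql.DriverManager;

public class ConnectionChurn {
  public static void main(String[] args) throws Exception {
    final String url = "jdbc:drill:zk=localhost:2181";  // illustrative URL
    for (int i = 1; i <= 100_000; i++) {
      // Open a connection and close it immediately; per the report, no query
      // is needed, connect/disconnect alone exercises the leak.
      try (Connection conn = DriverManager.getConnection(url)) {
        // intentionally empty
      }
      if (i % 1_000 == 0) {
        System.out.println("connections opened and closed: " + i);
      }
    }
  }
}
{code}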



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-5327) Hash aggregate can return empty batch which can cause schema change exception

2017-03-07 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5327:
-

 Summary: Hash aggregate can return empty batch which can cause 
schema change exception
 Key: DRILL-5327
 URL: https://issues.apache.org/jira/browse/DRILL-5327
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.10.0
Reporter: Chun Chang
Priority: Blocker


Hash aggregate can return empty batches, which causes Drill to throw a schema change 
exception (Drill does not handle this type of schema change). This is not a new bug, 
but a recent hash function change (a theoretically correct change) may have increased 
the chance of hitting it. I don't have scientific data to support that claim (in fact 
I don't believe it is the case), but a regression run that used to pass now fails due 
to this bug. My concern is that existing Drill users out there may have queries that 
used to work but fail now; it will be difficult to explain why the new release is 
better for them. I am filing this bug as a blocker so we can discuss it before 
releasing 1.10.

{noformat}
/root/drillAutomation/framework-master/framework/resources/Advanced/tpcds/tpcds_sf1/original/text/query66.sql
Query: 
-- start query 66 in stream 0 using template query66.tpl 
SELECT w_warehouse_name, 
   w_warehouse_sq_ft, 
   w_city, 
   w_county, 
   w_state, 
   w_country, 
   ship_carriers, 
   year1,
   Sum(jan_sales) AS jan_sales, 
   Sum(feb_sales) AS feb_sales, 
   Sum(mar_sales) AS mar_sales, 
   Sum(apr_sales) AS apr_sales, 
   Sum(may_sales) AS may_sales, 
   Sum(jun_sales) AS jun_sales, 
   Sum(jul_sales) AS jul_sales, 
   Sum(aug_sales) AS aug_sales, 
   Sum(sep_sales) AS sep_sales, 
   Sum(oct_sales) AS oct_sales, 
   Sum(nov_sales) AS nov_sales, 
   Sum(dec_sales) AS dec_sales, 
   Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, 
   Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, 
   Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, 
   Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, 
   Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, 
   Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, 
   Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, 
   Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, 
   Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, 
   Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, 
   Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, 
   Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, 
   Sum(jan_net)   AS jan_net, 
   Sum(feb_net)   AS feb_net, 
   Sum(mar_net)   AS mar_net, 
   Sum(apr_net)   AS apr_net, 
   Sum(may_net)   AS may_net, 
   Sum(jun_net)   AS jun_net, 
   Sum(jul_net)   AS jul_net, 
   Sum(aug_net)   AS aug_net, 
   Sum(sep_net)   AS sep_net, 
   Sum(oct_net)   AS oct_net, 
   Sum(nov_net)   AS nov_net, 
   Sum(dec_net)   AS dec_net 
FROM   (SELECT w_warehouse_name, 
   w_warehouse_sq_ft, 
   w_city, 
   w_county, 
   w_state, 
   w_country, 
   'ZOUROS' 
   || ',' 
   || 'ZHOU' AS ship_carriers, 
   d_yearAS year1, 
   Sum(CASE 
 WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS jan_sales, 
   Sum(CASE 
 WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS feb_sales, 
   Sum(CASE 
 WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity 
 ELSE 0 
   END)  AS mar_sales, 
   Sum(CASE 
 WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity 
 

[jira] [Created] (DRILL-5365) FileNotFoundException when reading a parquet file

2017-03-17 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5365:
-

 Summary: FileNotFoundException when reading a parquet file
 Key: DRILL-5365
 URL: https://issues.apache.org/jira/browse/DRILL-5365
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive
Affects Versions: 1.10.0
Reporter: Chun Chang
Assignee: Chunhui Shi


The parquet file is generated through the following CTAS.

To reproduce the issue:
1) use a cluster of two or more nodes;
2) enable impersonation;
3) set "fs.default.name": "file:///" in the hive storage plugin;
4) restart the drillbits;
5) as a regular user, on node A, drop the table/file;
6) CTAS from a large enough hive table as the source to recreate the table/file;
7) querying the table from node A should work;
8) querying from node B as the same user should reproduce the issue.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5617) spill filename conflict

2017-06-28 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5617:
-

 Summary: spill filename conflict
 Key: DRILL-5617
 URL: https://issues.apache.org/jira/browse/DRILL-5617
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.11.0
Reporter: Chun Chang
Assignee: Boaz Ben-Zvi


The spill location can be configured to be written to a distributed file system such as HDFS or MapR-FS, for example:

  hashagg: {
# The partitions divide the work inside the hashagg, to ease
# handling spilling. This initial figure is tuned down when
# memory is limited.
#  Setting this option to 1 disables spilling !
num_partitions: 32,
spill: {
# The 2 options below override the common ones
# they should be deprecated in the future
directories : [ "/tmp/drill/spill" ],
fs : "maprfs:///"
 }
  }

However, this can cause spill filename conflicts, since the naming convention does not 
include the node name.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5694) hash agg spill to disk, second phase OOM

2017-07-27 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5694:
-

 Summary: hash agg spill to disk, second phase OOM
 Key: DRILL-5694
 URL: https://issues.apache.org/jira/browse/DRILL-5694
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.11.0
Reporter: Chun Chang
Assignee: Boaz Ben-Zvi


| 1.11.0-SNAPSHOT  | d622f76ee6336d97c9189fc589befa7b0f4189d6  | DRILL-5165: 
For limit all case, no need to push down limit to scan  | 21.07.2017 @ 10:36:29 
PDT

The second-phase aggregation ran out of memory, which is not supposed to happen. The 
test data is currently only accessible locally.

/root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg15.q
Query:
select row_count, sum(row_count), avg(double_field), max(double_rand), 
count(float_rand) from parquet_500m_v1 group by row_count order by row_count 
limit 30
Failed with exception
java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
while executing the query.

HT was: 534773760 OOM at Second Phase. Partitions: 32. Estimated batch size: 
4849664. Planned batches: 0. Rows spilled so far: 6459928 Memory limit: 
536870912 so far allocated: 534773760.
Fragment 1:6

[Error Id: a193babd-f783-43da-a476-bb8dd4382420 on 10.10.30.168:31010]

  (org.apache.drill.exec.exception.OutOfMemoryException) HT was: 534773760 OOM 
at Second Phase. Partitions: 32. Estimated batch size: 4849664. Planned 
batches: 0. Rows spilled so far: 6459928 Memory limit: 536870912 so far 
allocated: 534773760.

org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1175
org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():105

org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
org.apache.drill.exec.physical.impl.BaseRootExec.next():95
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():415
org.apache.hadoop.security.UserGroupInformation.doAs():1595
org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745
  Caused By (org.apache.drill.exec.exception.OutOfMemoryException) Unable to 
allocate buffer of size 4194304 due to memory limit. Current allocation: 
534773760
org.apache.drill.exec.memory.BaseAllocator.buffer():238
org.apache.drill.exec.memory.BaseAllocator.buffer():213
org.apache.drill.exec.vector.IntVector.allocateBytes():231
org.apache.drill.exec.vector.IntVector.allocateNew():211

org.apache.drill.exec.test.generated.HashTableGen2141.allocMetadataVector():778

org.apache.drill.exec.test.generated.HashTableGen2141.resizeAndRehashIfNeeded():717
org.apache.drill.exec.test.generated.HashTableGen2141.insertEntry():643
org.apache.drill.exec.test.generated.HashTableGen2141.put():618

org.apache.drill.exec.test.generated.HashAggregatorGen1823.checkGroupAndAggrValues():1173
org.apache.drill.exec.test.generated.HashAggregatorGen1823.doWork():539
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
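
For context on the numbers above: the memory limit is 536,870,912 bytes (512 MB) and 
534,773,760 bytes were already allocated, leaving 536,870,912 - 534,773,760 = 2,097,152 
bytes (2 MB) of headroom, so the 4,194,304-byte buffer requested while resizing the hash 
table (shown in the Caused By portion of the trace) could not be satisfied.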

[jira] [Created] (DRILL-5740) hash agg fail to read spill file

2017-08-25 Thread Chun Chang (JIRA)
Chun Chang created DRILL-5740:
-

 Summary: hash agg fail to read spill file
 Key: DRILL-5740
 URL: https://issues.apache.org/jira/browse/DRILL-5740
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.12.0
Reporter: Chun Chang
Assignee: Boaz Ben-Zvi
Priority: Critical


-Build: | 1.12.0-SNAPSHOT  | 11008d029bafa36279e3045c4ed1a64366080620
-Multi-node drill cluster

Running a query that causes the hash aggregate to spill fails with the following error; 
this seems to be a regression.

{noformat}
Execution Failures:
/root/drill-test-framework/framework/resources/Advanced/hash-agg/spill/hagg5.q
Query:
select gby_date, gby_int32_rand, sum(int32_field), avg(float_field), 
min(boolean_field), count(double_rand) from 
dfs.`/drill/testdata/hagg/PARQUET-500M.parquet` group by gby_date, 
gby_int32_rand order by gby_date, gby_int32_rand limit 30
Failed with exception
java.sql.SQLException: SYSTEM ERROR: FileNotFoundException: File 
/tmp/drill/spill/10.10.30.168-31010/265f91f9-78d2-78a6-68ad-4709674efe0a_HashAgg_1-4-34/spill3
 does not exist

Fragment 1:34

[Error Id: 291a79f8-9b7a-485d-9404-e7b7fe1d8f1e on 10.10.30.168:31010]

  (java.lang.RuntimeException) java.io.FileNotFoundException: File 
/tmp/drill/spill/10.10.30.168-31010/265f91f9-78d2-78a6-68ad-4709674efe0a_HashAgg_1-4-34/spill3
 does not exist
org.apache.drill.exec.physical.impl.aggregate.SpilledRecordbatch.():67

org.apache.drill.exec.test.generated.HashAggregatorGen1891.outputCurrentBatch():980
org.apache.drill.exec.test.generated.HashAggregatorGen1891.doWork():617
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():168
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():133
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext():191
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():93
org.apache.drill.exec.record.AbstractRecordBatch.next():164
org.apache.drill.exec.physical.impl.BaseRootExec.next():105

org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
org.apache.drill.exec.physical.impl.BaseRootExec.next():95
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():234
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():227
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():415
org.apache.hadoop.security.UserGroupInformation.doAs():1595
org.apache.drill.exec.work.fragment.FragmentExecutor.run():227
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():745
{noformat}
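
For anyone trying to reproduce this on a smaller machine, here is a minimal sketch (not from the original report): constraining per-query memory should force the hash aggregate to spill much earlier. The memory value below is an assumption and needs tuning to the data volume.

{code}
-- Sketch only: shrink per-query memory so the hash aggregate spills, then
-- rerun the failing query. 104857600 (100 MB) is an arbitrary assumption.
alter session set `planner.memory.max_query_memory_per_node` = 104857600;

select gby_date, gby_int32_rand, sum(int32_field), avg(float_field),
       min(boolean_field), count(double_rand)
from dfs.`/drill/testdata/hagg/PARQUET-500M.parquet`
group by gby_date, gby_int32_rand
order by gby_date, gby_int32_rand
limit 30;
{code}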



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-6132) HashPartitionSender leaks memory

2018-02-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-6132:
-

 Summary: HashPartitionSender leaks memory
 Key: DRILL-6132
 URL: https://issues.apache.org/jira/browse/DRILL-6132
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.12.0
Reporter: Chun Chang
Assignee: Timothy Farkas


With assertions enabled (-ea), I noticed that HashPartitionSender leaks memory if an 
aggregation query fails due to OOM.
{noformat}
message: "SYSTEM ERROR: IllegalStateException: 
Allocator[op:2:1:0:HashPartitionSender] closed with outstanding buffers 
allocated (1).\nAllocator(op:2:1:0:HashPartitionSender) 
100/8192/3300352/100 (res/actual/peak/limit)\n child allocators: 
0\n ledgers: 1\n ledger[703835] allocator: op:2:1:0:HashPartitionSender), 
isOwning: true, size: 8192, references: 1, life: 6329151004063490..0, 
allocatorManager: [688058, life: 6329151004058252..0] holds 1 buffers. \n 
DrillBuf[777552], udle: [688059 0..8192]: , \n reservations: 0\n\n\nFragment 
2:1\n\n[Error Id: c7cc9d37-8881-4db1-8123-2651628c4081 on 
10.10.30.168:31010]\n\n (java.lang.IllegalStateException) 
Allocator[op:2:1:0:HashPartitionSender] closed with outstanding buffers 
allocated (1).\nAllocator(op:2:1:0:HashPartitionSender) 
100/8192/3300352/100 (res/actual/peak/limit)\n child allocators: 
0\n ledgers: 1\n ledger[703835] allocator: op:2:1:0:HashPartitionSender), 
isOwning: true, size: 8192, references: 1, life: 6329151004063490..0, 
allocatorManager: [688058, life: 6329151004058252..0] holds 1 buffers. \n 
DrillBuf[777552], udle: [688059 0..8192]: , \n reservations: 0\n\n 
org.apache.drill.exec.memory.BaseAllocator.close():504\n 
org.apache.drill.exec.ops.BaseOperatorContext.close():157\n 
org.apache.drill.exec.ops.OperatorContextImpl.close():79\n 
org.apache.drill.exec.ops.FragmentContext.suppressingClose():429\n 
org.apache.drill.exec.ops.FragmentContext.close():418\n 
org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324\n 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155\n 
org.apache.drill.exec.work.fragment.FragmentExecutor.run():267\n 
org.apache.drill.common.SelfCleaningRunnable.run():38\n 
java.util.concurrent.ThreadPoolExecutor.runWorker():1149\n 
java.util.concurrent.ThreadPoolExecutor$Worker.run():624\n 
java.lang.Thread.run():748\n"
exception {
exception_class: "java.lang.IllegalStateException"
message: "Allocator[op:2:1:0:HashPartitionSender] closed with outstanding 
buffers allocated (1).\nAllocator(op:2:1:0:HashPartitionSender) 
100/8192/3300352/100 (res/actual/peak/limit)\n child allocators: 
0\n ledgers: 1\n ledger[703835] allocator: op:2:1:0:HashPartitionSender), 
isOwning: true, size: 8192, references: 1, life: 6329151004063490..0, 
allocatorManager: [688058, life: 6329151004058252..0] holds 1 buffers. \n 
DrillBuf[777552], udle: [688059 0..8192]: , \n reservations: 0\n"
stack_trace {
class_name: "org.apache.drill.exec.memory.BaseAllocator"
file_name: "BaseAllocator.java"
line_number: 504
method_name: "close"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.ops.BaseOperatorContext"
file_name: "BaseOperatorContext.java"
line_number: 157
method_name: "close"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.ops.OperatorContextImpl"
file_name: "OperatorContextImpl.java"
line_number: 79
method_name: "close"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.ops.FragmentContext"
file_name: "FragmentContext.java"
line_number: 429
method_name: "suppressingClose"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.ops.FragmentContext"
file_name: "FragmentContext.java"
line_number: 418
method_name: "close"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.work.fragment.FragmentExecutor"
file_name: "FragmentExecutor.java"
line_number: 324
method_name: "closeOutResources"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.work.fragment.FragmentExecutor"
file_name: "FragmentExecutor.java"
line_number: 155
method_name: "cleanup"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.exec.work.fragment.FragmentExecutor"
file_name: "FragmentExecutor.java"
line_number: 267
method_name: "run"
is_native_method: false
}
stack_trace {
class_name: "org.apache.drill.common.SelfCleaningRunnable"
file_name: "SelfCleaningRunnable.java"
line_number: 38
method_name: "run"
is_native_method: false
}
stack_trace {
class_name: "..."
line_number: 0
method_name: "..."
is_native_method: false
}
}
}{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6275) drillbit direct_current memory usage is not populated/updated

2018-03-19 Thread Chun Chang (JIRA)
Chun Chang created DRILL-6275:
-

 Summary: drillbit direct_current memory usage is not 
populated/updated
 Key: DRILL-6275
 URL: https://issues.apache.org/jira/browse/DRILL-6275
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata
Affects Versions: 1.13.0
Reporter: Chun Chang


We used to keep track of Drill memory usage in sys.memory, and it was useful for 
detecting memory leaks. This feature seems broken: the direct_current memory 
usage is not populated or updated.

{noformat}
0: jdbc:drill:zk=10.10.30.166:5181> select * from sys.memory;
+--------------+-----------+--------------+------------+----------------+--------------------+-------------+
| hostname     | user_port | heap_current | heap_max   | direct_current | jvm_direct_current | direct_max  |
+--------------+-----------+--------------+------------+----------------+--------------------+-------------+
| 10.10.30.168 | 31010     | 1162636800   | 2147483648 | 0              | 22096              | 10737418240 |
| 10.10.30.169 | 31010     | 1301175040   | 2147483648 | 0              | 22096              | 10737418240 |
| 10.10.30.166 | 31010     | 989448872    | 2147483648 | 0              | 22096              | 10737418240 |
| 10.10.30.167 | 31010     | 1767205312   | 2147483648 | 0              | 22096              | 10737418240 |
+--------------+-----------+--------------+------------+----------------+--------------------+-------------+
4 rows selected (1.564 seconds)
0: jdbc:drill:zk=10.10.30.166:5181> select * from sys.version;
+-----------------+------------------------------------------+------------------------------------------------------+---------------------------+--------------------+---------------------------+
| version         | commit_id                                | commit_message                                       | commit_time               | build_email        | build_time                |
+-----------------+------------------------------------------+------------------------------------------------------+---------------------------+--------------------+---------------------------+
| 1.13.0-SNAPSHOT | 534212456cc25a49272838cba91c223f63df7fd2 | Cleanup when closing, and cleanup spill after a kill | 07.03.2018 @ 16:18:27 PST | inram...@gmail.com | 08.03.2018 @ 10:09:28 PST |
+-----------------+------------------------------------------+------------------------------------------------------+---------------------------+--------------------+---------------------------+
{noformat}
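
For reference, this is the kind of leak check the feature supported; a sketch that uses only 
the columns shown above and should report non-zero, changing values once direct_current 
is populated again:

{code}
-- Leak-watch sketch: run between test iterations and compare per-drillbit
-- direct memory. Uses only columns visible in the sys.memory output above.
select hostname, direct_current, jvm_direct_current, direct_max
from sys.memory
order by direct_current desc;
{code}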



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6495) Fragment error message profile dumped into log file. Why?

2018-06-13 Thread Chun Chang (JIRA)
Chun Chang created DRILL-6495:
-

 Summary: Fragment error message profile dumped into log file. Why?
 Key: DRILL-6495
 URL: https://issues.apache.org/jira/browse/DRILL-6495
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.13.0
Reporter: Chun Chang


When a query fails for some reason, we dump the following gigantic JSON profile 
into drillbit.log. Why do we do this? Has anyone found this information useful? It 
completely clutters the log file. If the profile contains crucial information, 
I recommend finding an alternative way to capture it.

{noformat}
2018-06-13 14:47:31,094 [24de6f10-0e19-1769-a136-76b15ce5b832:frag:2:7] INFO 
o.a.d.e.w.fragment.FragmentExecutor - 24de6f10-0e19-1769-a136-76b15ce5b832:2:7: 
State change requested CANCELLATION_REQUESTED --> FINISHED
2018-06-13 14:47:31,094 [24de6f10-0e19-1769-a136-76b15ce5b832:frag:2:7] INFO 
o.a.d.e.w.f.FragmentStatusReporter - 24de6f10-0e19-1769-a136-76b15ce5b832:2:7: 
State to report: CANCELLED
2018-06-13 14:47:31,095 [24de6f10-0e19-1769-a136-76b15ce5b832:frag:2:7] WARN 
o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was 
no registered listener for that message: profile {
state: CANCELLED
minor_fragment_id: 7
operator_profile {
input_profile {
records: 4096
batches: 1
schemas: 1
}
operator_id: 10
operator_type: 29
setup_nanos: 0
process_nanos: 2786560384
peak_local_memory_allocated: 54050816
wait_nanos: 47930232
}
operator_profile {
input_profile {
records: 4096
batches: 1
schemas: 1
}
operator_id: 8
operator_type: 10
setup_nanos: 1705434
process_nanos: 379744084
peak_local_memory_allocated: 105226240
wait_nanos: 0
}
operator_profile {
input_profile {
records: 0
batches: 0
schemas: 0
}
operator_id: 12
operator_type: 42
...
...

{noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-1796) display a map as value caused ExecutionSetupException

2014-12-01 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1796:
-

 Summary: display a map as value caused ExecutionSetupException
 Key: DRILL-1796
 URL: https://issues.apache.org/jira/browse/DRILL-1796
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Reporter: Chun Chang


#Mon Dec 01 11:15:02 PST 2014
git.commit.id.abbrev=a60e1db

I have Parquet data that contains the following:

0: jdbc:drill:schema=dfs.drillTestDir> select t.fields.map[0] from 
`data.parquet` as t limit 2;
++
|   EXPR$0   |
++
| {"key":"ho_wf2gp","value":{"member2":0}} |
| {"key":"ho_wf2gp","value":{"member2":0}} |
++

The following query displaying 'key' works:

0: jdbc:drill:schema=dfs.drillTestDir> select t.fields.map[0].`key` from 
`data.parquet` as t limit 2;
++
|   EXPR$0   |
++
| ho_wf2gp   |
| ho_wf2gp   |
++

But if the field to display is itself a map, I get the following exception:

0: jdbc:drill:schema=dfs.drillTestDir> select t.fields.map[0].`value` from 
`data.parquet` as t limit 2;
Query failed: Query failed: Failure while running fragment., You tried to write 
a VarChar type when you are using a ValueWriter of type 
NullableVarCharWriterImpl. [ 5494307b-7c11-421e-be49-d0bf043a4ee1 on 
qa-node120.qa.lab:31010 ]
[ 5494307b-7c11-421e-be49-d0bf043a4ee1 on qa-node120.qa.lab:31010 ]
Error: exception while executing query: Failure while executing query. 
(state=,code=0)

Here is the stack:

12:23:09.249 [2b8331d1-db0c-1150-f58e-7884c00ea314:frag:0:0] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
org.apache.drill.common.exceptions.ExecutionSetupException: 
java.lang.IllegalArgumentException: You tried to write a VarChar type when you 
are using a ValueWriter of type NullableVarCharWriterImpl.
at 
org.apache.drill.exec.store.parquet2.DrillParquetReader.setup(DrillParquetReader.java:252)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:97) 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:147)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetScanBatchCreator.getBatch(ParquetScanBatchCreator.java:53)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ImplCreator.visitOp(ImplCreator.java:62) 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ImplCreator.visitOp(ImplCreator.java:39) 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.base.AbstractPhysicalVisitor.visitSubScan(AbstractPhysicalVisitor.java:125)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetRowGroupScan.accept(ParquetRowGroupScan.java:107)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:74)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ImplCreator.visitOp(ImplCreator.java:62) 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ImplCreator.visitOp(ImplCreator.java:39) 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1798) write a BigInt type when you are using a ValueWriter of type NullableVarCharWriterImpl

2014-12-01 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1798:
-

 Summary: write a BigInt type when you are using a ValueWriter of 
type NullableVarCharWriterImpl
 Key: DRILL-1798
 URL: https://issues.apache.org/jira/browse/DRILL-1798
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang


#Mon Dec 01 11:15:02 PST 2014
git.commit.id.abbrev=a60e1db

My data looks like this:

0: jdbc:drill:schema=dfs.drillTestDir> select 
t.service_provider_set_1d.`map`[0] from `es/dpa-nested` as t limit 10;
++
|   EXPR$0   |
++
| {"key":"WhatA","value":1} |
| {} |
| {"key":"WhatB","value":2} |
| {} |
| {"key":"WhatB","value":2} |
| {} |
| {"key":"WhatA","value":2} |
| {} |
| {"key":"WhatC","value":2} |
| {"key":"WhatC","value":2} |
++

0: jdbc:drill:schema=dfs.drillTestDir> select 
t.service_provider_set_1d.`map`[0].`value` from `es/dpa-nested` as t limit 1;
Query failed: Query failed: Failure while running fragment.[ 
ec75d174-ffa5-4e6e-98ad-6c99979b9fa5 on qa-node120.qa.lab:31010 ]
[ ec75d174-ffa5-4e6e-98ad-6c99979b9fa5 on qa-node120.qa.lab:31010 ]


Error: exception while executing query: Failure while executing query. 
(state=,code=0)

stack:

16:48:57.682 [2b82f385-b5bc-a2af-83ee-1f54c1c24ae5:frag:0:0] ERROR 
o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
java.lang.IllegalArgumentException: You tried to write a BigInt type when you 
are using a ValueWriter of type NullableVarCharWriterImpl.
at 
org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.fail(AbstractFieldWriter.java:625)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.write(AbstractFieldWriter.java:185)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.NullableVarCharWriterImpl.write(NullableVarCharWriterImpl.java:88)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet2.DrillParquetGroupConverter$DrillBigIntConverter.addLong(DrillParquetGroupConverter.java:330)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
parquet.column.impl.ColumnReaderImpl$2$4.writeValue(ColumnReaderImpl.java:258) 
~[parquet-column-1.5.1-drill-r5.jar:na]
at 
parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:354)
 ~[parquet-column-1.5.1-drill-r5.jar:na]
at 
parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:400) 
~[parquet-column-1.5.1-drill-r5.jar:na]
at 
org.apache.drill.exec.store.parquet2.DrillParquetReader.next(DrillParquetReader.java:288)
 
~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:191) 
[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.buildSchema(ScanBatch.java:125) 
[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema(IteratorValidatorBatchIterator.java:80)
 
[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1799) kvgen() leaking memory?

2014-12-01 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1799:
-

 Summary: kvgen() leaking memory?
 Key: DRILL-1799
 URL: https://issues.apache.org/jira/browse/DRILL-1799
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 0.7.0
Reporter: Chun Chang


#Mon Dec 01 11:15:02 PST 2014
git.commit.id.abbrev=a60e1db

Running the following query, which contains kvgen(), appears to run out of memory 
and fails. There appears to be a 50-row threshold: if I add a limit to the query, 
LIMIT 50 or less works, while LIMIT 51 fails.

0: jdbc:drill:schema=dfs.drillTestDir> select kvgen(sub.esr) from `es/esr-part` 
sub;
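
For reference, the threshold described above expressed as two concrete probes (a sketch 
based on the description, not re-run here):

{code}
-- Probe the reported boundary (sketch only):
select kvgen(sub.esr) from `es/esr-part` sub limit 50;  -- reported to work
select kvgen(sub.esr) from `es/esr-part` sub limit 51;  -- reported to fail
{code}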

The unlimited query above returns results but fails in the middle of displaying them. 
There is no other error message except the following stack:

18:06:08.158 [2b82e176-0aeb-9b05-15f2-520eea269a1c:frag:0:0] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.IllegalStateException: Attempted to close accountor with 151 
buffer(s) still allocatedfor QueryId: 2b82e176-0aeb-9b05-15f2-520eea269a1c, 
MajorFragmentId: 0, MinorFragmentId: 0.


Total 1 allocation(s) of byte size(s): 4096, at stack location:

org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:212)

org.apache.drill.exec.vector.UInt1Vector.allocateNewSafe(UInt1Vector.java:137)

org.apache.drill.exec.vector.NullableBigIntVector.allocateNewSafe(NullableBigIntVector.java:173)

org.apache.drill.exec.vector.complex.MapVector.allocateNewSafe(MapVector.java:165)

org.apache.drill.exec.vector.complex.RepeatedMapVector.allocateNewSafe(RepeatedMapVector.java:236)

org.apache.drill.exec.vector.complex.MapVector.allocateNewSafe(MapVector.java:165)

org.apache.drill.exec.vector.complex.MapVector.allocateNewSafe(MapVector.java:165)

org.apache.drill.exec.vector.complex.impl.SingleMapWriter.allocate(SingleMapWriter.java:135)

org.apache.drill.exec.vector.complex.impl.RepeatedMapWriter.allocate(RepeatedMapWriter.java:137)

org.apache.drill.exec.vector.complex.impl.SingleListWriter.allocate(SingleListWriter.java:116)

org.apache.drill.exec.vector.complex.impl.ComplexWriterImpl.allocate(ComplexWriterImpl.java:157)

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doAlloc(ProjectRecordBatch.java:217)

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:144)

org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:89)

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)

org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:106)

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:124)

org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:86)

org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:76)

org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:52)

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:129)

org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:106)

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:124)

org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67)







--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1801) Regression: Mondrian query5843.q failed with run time exception

2014-12-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1801:
-

 Summary: Regression: Mondrian query5843.q failed with run time 
exception
 Key: DRILL-1801
 URL: https://issues.apache.org/jira/browse/DRILL-1801
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang
Priority: Blocker


#Tue Dec 02 14:38:34 EST 2014
git.commit.id.abbrev=757e9a2

Mondrian query5843.q used to work but failed with the following stack:

2014-12-02 16:55:39,696 [2b81a073-d825-bd6d-85c8-022726952867:frag:0:0] WARN  
o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
`T9¦¦*`, returning null instance.
2014-12-02 16:55:39,771 [2b81a073-d825-bd6d-85c8-022726952867:frag:0:0] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.RuntimeException: Only COUNT aggregate function supported for Boolean 
type
  at 
org.apache.drill.exec.test.generated.HashAggregatorGen91048$BatchHolder.setupInterior(HashAggTemplate.java:72)
 ~[na:na]
  at 
org.apache.drill.exec.test.generated.HashAggregatorGen91048$BatchHolder.setup(HashAggTemplate.java:150)
 ~[na:na]
  at 
org.apache.drill.exec.test.generated.HashAggregatorGen91048$BatchHolder.access$600(HashAggTemplate.java:117)
 ~[na:na]
  at 
org.apache.drill.exec.test.generated.HashAggregatorGen91048.addBatchHolder(HashAggTemplate.java:445)
 ~[na:na]
  at 
org.apache.drill.exec.test.generated.HashAggregatorGen91048.setup(HashAggTemplate.java:260)
 ~[na:na]
  at 
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.createAggregatorInternal(HashAggBatch.java:263)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.createAggregator(HashAggBatch.java:189)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:97)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:130)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:97)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:112)
 ~[drill

[jira] [Created] (DRILL-1802) Lots of ForemanException in drillbit.log

2014-12-02 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1802:
-

 Summary: Lots of ForemanException in drillbit.log
 Key: DRILL-1802
 URL: https://issues.apache.org/jira/browse/DRILL-1802
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang


#Tue Dec 02 14:38:34 EST 2014
git.commit.id.abbrev=757e9a2

While running Mondrian queries, I saw lots of the following exceptions in 
drillbit.log, even though the queries were successful:

2014-12-02 14:30:42,940 [2b81c26d-4109-6df8-018b-c32616ca359c:frag:0:0] INFO  
o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED 
state as query is already at FAILED state (which is terminal).
2014-12-02 14:35:23,392 [2b81c153-a391-0045-d9eb-8801b7d08159:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Dropping request to move to FAILED state 
as query is already at COMPLETED state (which is terminal).
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
during fragment initialization.
  at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
 [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
  at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. 
Failure while accessing Zookeeper
  at 
org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  ... 4 common frames omitted
Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
  at 
org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  ... 10 common frames omitted
Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: 
KeeperErrorCode = NodeExists for 
/drill/running/2b81c153-a391-0045-d9eb-8801b7d08159
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) 
~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) 
~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660)
 ~[curator-framework-2.5.0.jar:na]
  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) 
~[curator-client-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:51) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1804) random failures while running large number of queries

2014-12-03 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1804:
-

 Summary: random failures while running large number of queries
 Key: DRILL-1804
 URL: https://issues.apache.org/jira/browse/DRILL-1804
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 0.7.0
Reporter: Chun Chang


#Tue Dec 02 14:38:34 EST 2014
git.commit.id.abbrev=757e9a2

When running the Mondrian regression tests (over 6000 queries), I sometimes get 
one or two random failures. Here is the stack when it happens:

2014-12-02 17:49:32,271 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:foreman] ERROR 
o.a.drill.exec.work.foreman.Foreman - Error 
aeae057b-ed0a-43aa-902d-fe3a41531511: Query failed: Unexpected exception during 
fragment initialization.
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
during fragment initialization.
  at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
 [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
  at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
  at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. 
Failure while accessing Zookeeper
  at 
org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  ... 4 common frames omitted
Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
  at 
org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  at 
org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  ... 10 common frames omitted
Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: 
KeeperErrorCode = NodeExists for 
/drill/running/2b8193d3-f0ca-aa7c-094a-d8234d76d068
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) 
~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
  at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) 
~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660)
 ~[curator-framework-2.5.0.jar:na]
  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) 
~[curator-client-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:44)
 ~[curator-framework-2.5.0.jar:na]
  at 
org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:51) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
  ... 11 common frames omitted
2014-12-02 17:49:32,287 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:frag:0:0] WARN  
o.a.d.e.p.impl.SendingAccountor - Failure while waiting for send complete.
java.lang.InterruptedException: null
  at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301)
 ~[na:1.7.0_45]

[jira] [Created] (DRILL-1860) When a 'key' is missing in a map, it should not contribute to the count() function

2014-12-12 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1860:
-

 Summary: When a 'key' is missing in a map, it should not 
contribute to the count() function
 Key: DRILL-1860
 URL: https://issues.apache.org/jira/browse/DRILL-1860
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 0.7.0
Reporter: Chun Chang


#Fri Dec 12 11:47:55 EST 2014
git.commit.id.abbrev=d925eab

I have the following JSON data:

{code}
0: jdbc:drill:schema=dfs.drillTestDir> select t.soa from `complex.json` t limit 
10;
++
|soa |
++
| [{"in":1},{"in":1,"fl":1.12345},{"in":1,"fl":10.12345,"nul":"not 
null"},{"in":1,"fl":10.6789,"nul":"not null","bool":true,"str":"here is a 
string at row 1"}] |
| [{"in":2},{"in":2,"fl":2.12345},{"in":2,"fl":20.12345,"nul":"not 
null"},{"in":2,"fl":20.6789,"bool":false,"str":"here is a string at row 2"}] |
| [{"in":3},{"in":3,"fl":3.12345},{"in":3,"fl":30.12345,"nul":"not 
null"},{"in":3,"fl":30.6789,"nul":"not null","bool":true,"str":"here is a 
string at row 3"}] |
| [{"in":4},{"in":4,"fl":4.12345},{"in":4,"fl":40.12345,"nul":"not 
null"},{"in":4,"fl":40.6789,"bool":true,"str":"here is a string at row 4"}] |
| [{"in":5},{"in":5,"fl":5.12345},{"in":5,"fl":50.12345,"nul":"not 
null"},{"in":5,"fl":50.6789,"nul":"not null","bool":true,"str":"here is a 
string at row 5"}] |
| [{"in":6},{"in":6,"fl":6.12345},{"in":6,"fl":60.12345,"nul":"not 
null"},{"in":6,"fl":60.6789,"nul":"not null","bool":true,"str":"here is a 
string at row 6"}] |
| 
[{"in":7},{"in":7,"fl":7.12345},{"in":7,"fl":70.12345},{"in":7,"fl":70.6789,"nul":"not
 null","bool":false,"str":"here is a string at row 7"}] |
| 
[{"in":8},{"in":8,"fl":8.12345},{"in":8,"fl":80.12345},{"in":8,"fl":80.6789,"bool":true,"str":"here
 is a string at row 8"}] |
| [{"in":9},{"in":9,"fl":9.12345},{"in":9,"fl":90.12345,"nul":"not 
null"},{"in":9,"fl":90.6789,"nul":"not null","bool":true,"str":"here is a 
string at row 9"}] |
| 
[{"in":10},{"in":10,"fl":10.12345},{"in":10,"fl":100.12345},{"in":10,"fl":100.6789,"bool":false,"str":"here
 is a string at row 10"}] |
++
{code}

For some of the rows, the key named 'nul' is missing, so the expression returns 
null:

{code}
0: jdbc:drill:schema=dfs.drillTestDir> select t.soa[2].nul, t.soa[3].bool from 
`complex.json` t limit 10;
+++
|   EXPR$0   |   EXPR$1   |
+++
| not null   | true   |
| not null   | false  |
| not null   | true   |
| not null   | true   |
| not null   | true   |
| not null   | true   |
| null   | false  |
| null   | true   |
| not null   | true   |
| null   | false  |
+++
{code}

But when I do a count on that, the null values are still counted:

{code}
0: jdbc:drill:schema=dfs.drillTestDir> select count(t.soa[2].nul), 
count(t.soa[3].bool) from `complex.json` t limit 10;
+++
|   EXPR$0   |   EXPR$1   |
+++
| 100| 100|
+++
{code}
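
A cross-check sketch, not from the original report: COUNT(expr) is defined to skip NULLs, 
so a CASE-based count over the same expression should come out lower than 100 here, and 
the two columns below should agree once the bug is fixed.

{code}
-- Cross-check sketch: count(expr) skips NULLs, so cnt_count and cnt_case
-- should match when the missing keys really surface as NULL.
select count(t.soa[2].nul)                                       as cnt_count,
       sum(case when t.soa[2].nul is not null then 1 else 0 end) as cnt_case
from `complex.json` t;
{code}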

Here is the physical plan:

{code}
0: jdbc:drill:schema=dfs.drillTestDir> explain plan for select 
count(t.soa[2].nul), count(t.soa[3].bool) from `complex.json` t;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(EXPR$0=[$0], EXPR$1=[$0])
00-02StreamAgg(group=[{}], EXPR$0=[$SUM0($0)])
00-03  StreamAgg(group=[{}], EXPR$0=[COUNT()])
00-04Project($f0=[ITEM(ITEM($0, 2), 'nul')], $f1=[ITEM(ITEM($0, 3), 
'bool')])
00-05

[jira] [Created] (DRILL-1872) empty map returned with order by on large dataset

2014-12-15 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1872:
-

 Summary: empty map returned with order by on large dataset
 Key: DRILL-1872
 URL: https://issues.apache.org/jira/browse/DRILL-1872
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang


#Mon Dec 15 11:37:23 EST 2014
git.commit.id.abbrev=3b0ff5d

I have a JSON file containing 1 million records. The following query without 
ORDER BY gives me the correct result:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.oooi from 
`complex.json` t limit 5;
+++
| id |oooi|
+++
| 1  | {"oa":{"oab":{"oabc":1}}} |
| 2  | {"oa":{"oab":{"oabc":2}}} |
| 3  | {"oa":{"oab":{"oabc":3}}} |
| 4  | {"oa":{"oab":{"oabc":4}}} |
| 5  | {"oa":{"oab":{"oabc":5}}} |
{code}

Adding ORDER BY gives me empty maps:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.oooi from 
`complex.json` t order by t.id limit 5;
+++
| id |oooi|
+++
| 1  | {} |
| 2  | {} |
| 3  | {} |
| 4  | {} |
| 5  | {} |
+++
{code}

The same query with ORDER BY against a smaller dataset works. Here is a sample record:

{code}
{
"id": 1,
"gbyi": 0,
"gbyt": "soa",
"fl": 1.6789,
"nul": "not null",
"bool": false,
"str": "This is row 1",
"sia": [
1,
11,
101,
1001
],
"sfa": [
0,
1.01,
10.222,
10.0006789
],
"sba": [
-1,
-9.8766,
null,
true,
"text row 1"
],
"soa": [
{
"in": 1
},
{
"in": 1,
"fl": 1.12345
},
{
"in": 1,
"fl": 10.12345,
"nul": "not null"
},
{
"in": 1,
"fl": 10.6789,
"nul": "not null",
"bool": true,
"str": "here is a string at row 1"
}
],
"ooa": [
{
"in": 1
},
{
"fl": {
"f1": 1.6789,
"f2": 54331
},
"in": 1
},
{
"a": {
"aa": {
"aaa": "aaa 1"
}
},
"b": {
"bb": {
"bbb": "bbb 1"
},
"c": {
"cc": "ccc 1"
}
}
}
],
"aaa": [
[
[
"aa0 1"
],
[
"ab0 1"
]
],
[
[
"ba0 1"
],
[
"bb0 1"
]
],
[
[
"ca0 1",
"ca1 1"
],
[
"cb0 1",
"cb1 1",
"cb2 1"
]
]
],
"saa": [
-1,
[
-10,
-9.3211
],
[
1,
[
10.12345,
"not null"
],
[
1,
1.6789,
"not null",
true
],
[
-1,
6779,
"not null",
false,
"this is a short string 1"
]
]
],
"oooi": {
"oa": {
"oab": {
"oabc": 1
}
}
},
"ooof": {
"oa": {
"oab": {
"oabc": 1.5678
}
}
},
"ooos": {
"oa": {
"oab": {
"oabc": "ooos string 1"
}
}
},
  

[jira] [Created] (DRILL-1880) Query with three where conditions returned wrong results

2014-12-16 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1880:
-

 Summary: Query with three where conditions returned wrong results
 Key: DRILL-1880
 URL: https://issues.apache.org/jira/browse/DRILL-1880
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang


#Mon Dec 15 11:37:23 EST 2014
git.commit.id.abbrev=3b0ff5d

The following query containing three WHERE conditions returned wrong results. 
(The data is too big to include here.)

{code}
SELECT t.gbyi, 
   Count(t.id) 
FROM   `complex.json` t 
WHERE  t.gbyi <= 5 
OR t.gbyi >= 11 
   AND t.gbyt <> 'ooof' 
GROUP  BY t.gbyi 
ORDER  BY t.gbyi; 
{code}

Wrong result; the count column is mostly wrong:
{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.gbyi, count(t.id) from 
`complex.json` t where t.gbyi <= 5 or t.gbyi >= 11 and t.gbyt <> 'ooof' group 
by t.gbyi order by t.gbyi;
+++
|gbyi|   EXPR$1   |
+++
| 0  | 66943  |
| 1  | 66318  |
| 2  | 66994  |
| 3  | 66683  |
| 4  | 66638  |
| 5  | 66439  |
| 11 | 63172  |
| 12 | 63008  |
| 13 | 62685  |
| 14 | 62970  |
+++
{code}
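
Note (consistent with the Filter shown in the plan below and with the later Invalid 
resolution): standard SQL precedence binds AND tighter than OR, so the filter above is 
evaluated as t.gbyi <= 5 OR (t.gbyi >= 11 AND t.gbyt <> 'ooof'). A sketch with explicit 
parentheses, in case the intent was to apply the gbyt predicate to both ranges:

{code}
-- Sketch only: explicit grouping, needed if the gbyt predicate is meant to
-- apply to both ranges (the unparenthesized form applies it to the second
-- range only).
SELECT t.gbyi,
       Count(t.id)
FROM   `complex.json` t
WHERE  (t.gbyi <= 5
        OR t.gbyi >= 11)
   AND t.gbyt <> 'ooof'
GROUP  BY t.gbyi
ORDER  BY t.gbyi;
{code}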

Reducing the where condition to just two gives the correct result:

{code}
SELECT t.gbyi, 
   Count(t.id) 
FROM   `complex.json` t 
WHERE  t.gbyi <= 5 
   AND t.gbyt <> 'ooof' 
GROUP  BY t.gbyi 
ORDER  BY t.gbyi; 
{code}

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.gbyi, count(t.id) from 
`complex.json` t where t.gbyi <= 5 and t.gbyt <> 'ooof' group by t.gbyi order 
by t.gbyi;
+++
|gbyi|   EXPR$1   |
+++
| 0  | 63305  |
| 1  | 62671  |
| 2  | 63249  |
| 3  | 63070  |
| 4  | 62967  |
| 5  | 62737  |
+++
{code}

Physical plan for the query that returned the wrong result:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select t.gbyi, 
count(t.id) from `complex.json` t where t.gbyi <= 5 or t.gbyi >= 11 and t.gbyt 
<> 'ooof' group by t.gbyi order by t.gbyi;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(gbyi=[$0], EXPR$1=[$1])
00-02SelectionVectorRemover
00-03  Sort(sort0=[$0], dir0=[ASC])
00-04HashAgg(group=[{0}], EXPR$1=[$SUM0($1)])
00-05  HashAgg(group=[{0}], EXPR$1=[COUNT($1)])
00-06Project(gbyi=[$0], id=[$2])
00-07  SelectionVectorRemover
00-08Filter(condition=[OR(<=($0, 5), AND(>=($0, 11), <>($1, 
'ooof')))])
00-09  Project(gbyi=[$1], gbyt=[$2], id=[$0])
00-10Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`gbyi`, `gbyt`, `id`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
 | {
  "head" : {
"version" : 1,
"generator" : {
  "type" : "ExplainHandler",
  "info" : ""
},
"type" : "APACHE_DRILL_PHYSICAL",
"options" : [ ],
"queue" : 0,
"resultMode" : "EXEC"
  },
  "graph" : [ {
"pop" : "fs-scan",
"@id" : 10,
"files" : [ "maprfs:/drill/testdata/complex_type/json/complex.json" ],
"storage" : {
  "type" : "file",
  "enabled" : true,
  "connection" : "maprfs:///",
  "workspaces" : {
"root" : {
  "location" : "/",
  "writable" : false,
  "defaultInputFormat" : null
},
"tmp" : {
  "location" : "/tmp",
  "writable" : true,
  "defaultInputFormat" : "csv"
},
"drillTestDir" : {
  "location" : "/drill/testdata/",
  "writable" : true,
  "defaultInputFormat" : "parquet"
},
"drillTestDirComplexJson" : {
  "location" : "/drill/testdata/complex_type/json",
  "writable" : true,
  "defaultInputFormat" : "json"
},
"drillTestDirAmplab" : {
  "location" : "/dri

[jira] [Resolved] (DRILL-1880) Query with three where conditions returned wrong results

2014-12-16 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-1880.
---
Resolution: Invalid

> Query with three where conditions returned wrong results
> 
>
> Key: DRILL-1880
> URL: https://issues.apache.org/jira/browse/DRILL-1880
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.7.0
>    Reporter: Chun Chang
>
> #Mon Dec 15 11:37:23 EST 2014
> git.commit.id.abbrev=3b0ff5d
> The following query containing three where conditions returned wrong results. 
> (data is too big so not included here.)
> {code}
> SELECT t.gbyi, 
>Count(t.id) 
> FROM   `complex.json` t 
> WHERE  t.gbyi <= 5 
> OR t.gbyi >= 11 
>AND t.gbyt <> 'ooof' 
> GROUP  BY t.gbyi 
> ORDER  BY t.gbyi; 
> {code}
> Wrong result, the count column is mostly wrong:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.gbyi, count(t.id) 
> from `complex.json` t where t.gbyi <= 5 or t.gbyi >= 11 and t.gbyt <> 'ooof' 
> group by t.gbyi order by t.gbyi;
> +++
> |gbyi|   EXPR$1   |
> +++
> | 0  | 66943  |
> | 1  | 66318  |
> | 2  | 66994  |
> | 3  | 66683  |
> | 4  | 66638  |
> | 5  | 66439  |
> | 11 | 63172  |
> | 12 | 63008  |
> | 13 | 62685  |
> | 14 | 62970  |
> +++
> {code}
> Reduce the where condition to just two gives the correct result:
> {code}
> SELECT t.gbyi, 
>Count(t.id) 
> FROM   `complex.json` t 
> WHERE  t.gbyi <= 5 
>AND t.gbyt <> 'ooof' 
> GROUP  BY t.gbyi 
> ORDER  BY t.gbyi; 
> {code}
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.gbyi, count(t.id) 
> from `complex.json` t where t.gbyi <= 5 and t.gbyt <> 'ooof' group by t.gbyi 
> order by t.gbyi;
> +++
> |gbyi|   EXPR$1   |
> +++
> | 0  | 63305  |
> | 1  | 62671  |
> | 2  | 63249  |
> | 3  | 63070  |
> | 4  | 62967  |
> | 5  | 62737  |
> +++
> {code}
> physical plan for the query returned wrong result:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select 
> t.gbyi, count(t.id) from `complex.json` t where t.gbyi <= 5 or t.gbyi >= 11 
> and t.gbyt <> 'ooof' group by t.gbyi order by t.gbyi;
> +++
> |text|json|
> +++
> | 00-00Screen
> 00-01  Project(gbyi=[$0], EXPR$1=[$1])
> 00-02SelectionVectorRemover
> 00-03  Sort(sort0=[$0], dir0=[ASC])
> 00-04HashAgg(group=[{0}], EXPR$1=[$SUM0($1)])
> 00-05  HashAgg(group=[{0}], EXPR$1=[COUNT($1)])
> 00-06Project(gbyi=[$0], id=[$2])
> 00-07  SelectionVectorRemover
> 00-08Filter(condition=[OR(<=($0, 5), AND(>=($0, 11), 
> <>($1, 'ooof')))])
> 00-09  Project(gbyi=[$1], gbyt=[$2], id=[$0])
> 00-10Scan(groupscan=[EasyGroupScan 
> [selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
> columns=[`gbyi`, `gbyt`, `id`], 
> files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
>  | {
>   "head" : {
> "version" : 1,
> "generator" : {
>   "type" : "ExplainHandler",
>   "info" : ""
> },
> "type" : "APACHE_DRILL_PHYSICAL",
> "options" : [ ],
> "queue" : 0,
> "resultMode" : "EXEC"
>   },
>   "graph" : [ {
> "pop" : "fs-scan",
> "@id" : 10,
> "files" : [ "maprfs:/drill/testdata/complex_type/json/complex.json" ],
> "storage" : {
>   "type" : "file",
>   "enabled" : true,
>   "connection" : "maprfs:///",
>   "workspaces" : {
> "root" : {
>   "location" : "/",
>   "writable" : false,
>   "defaultInputForm

[jira] [Created] (DRILL-1893) VectorContainer.add(VectorContainer.java:188) Assert

2014-12-18 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1893:
-

 Summary: VectorContainer.add(VectorContainer.java:188) Assert
 Key: DRILL-1893
 URL: https://issues.apache.org/jira/browse/DRILL-1893
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang


#Tue Dec 16 13:28:01 EST 2014
git.commit.id.abbrev=3b0ff5d

I have a JSON record that looks like this (the actual dataset is too big to attach here):

{code}
{  "id":2,
"ooos": {
"oa": {
"oab": {
"oabc": "ooos string 2"
}
}
}
}
{code}

The following query causes an assertion:

{code}
SELECT t.id, 
   t.ooos, 
   t.ooos.oa.oab.oabc 
FROM   `complex.json` t 
WHERE  Length(t.ooos.oa.oab.oabc) < 14 
OR Length(t.ooos.oa.oab.oabc) > 16 
ORDER  BY t.ooos.oa.oab.oabc 
LIMIT  50; 
{code}

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.ooos, 
t.ooos.oa.oab.oabc from `complex.json` t where length(t.ooos.oa.oab.oabc) < 14 
or length(t.ooos.oa.oab.oabc) > 16 order by t.ooos.oa.oab.oabc limit 50;
++++
| id |ooos|   EXPR$2   |
++++
Query failed: Query failed: Failure while running fragment.[ 
d4b0530c-1f06-4b72-9836-07e181adaef1 on qa-node119.qa.lab:31010 ]
[ d4b0530c-1f06-4b72-9836-07e181adaef1 on qa-node119.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)
{code}

Here is the stack trace:

{code}
2014-12-18 14:40:02,615 [2b6ca84c-100a-9bb2-4d32-6798494f13ec:frag:1:3] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.AssertionError: null
at 
org.apache.drill.exec.record.VectorContainer.add(VectorContainer.java:188) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.record.VectorContainer.addHyperList(VectorContainer.java:81)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build(SortRecordBatchBuilder.java:196)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.TopN.TopNBatch.purge(TopNBatch.java:299) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.TopN.TopNBatch.innerNext(TopNBatch.java:228)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:96)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:97)
 ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.imp

[jira] [Created] (DRILL-1894) Complex JSON cause NPE

2014-12-18 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1894:
-

 Summary: Complex JSON cause NPE
 Key: DRILL-1894
 URL: https://issues.apache.org/jira/browse/DRILL-1894
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.7.0
Reporter: Chun Chang


#Tue Dec 16 13:28:01 EST 2014
git.commit.id.abbrev=3b0ff5d

I have the following JSON record (the actual dataset is too big to attach):

{code}
{
  "id": 2,
  "oooa": {
    "oa": {
      "oab": {
        "oabc": [
          {
            "rowId": 2
          },
          {
            "rowValue1": 2,
            "rowValue2": 2
          }
        ]
      }
    }
  }
}
{code}

The following query caused an NPE:

{code}
SELECT   t.id, 
 t.oooa.oa.oab.oabc, 
 t.oooa.oa.oab.oabc[1].rowvalue2 
FROM `complex.json` t 
ORDER BY t.oooa.oa.oab.oabc[1].rowvalue2 limit 50;
{code}

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.oooa.oa.oab.oabc, 
t.oooa.oa.oab.oabc[1].rowValue2 from `complex.json` t order by 
t.oooa.oa.oab.oabc[1].rowValue2 limit 50;
Query failed: Query failed: Failure while running fragment.[ 
8a2ee7e8-8c7b-4881-883e-7924884a0878 on qa-node117.qa.lab:31010 ]
[ 8a2ee7e8-8c7b-4881-883e-7924884a0878 on qa-node117.qa.lab:31010 ]


Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

stack trace:

{code}
2014-12-18 15:11:54,916 [2b6ca0c5-5b7d-3832-091f-67b37d4e3e6c:frag:1:2] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.NullPointerException: null
2014-12-18 15:11:54,916 [2b6ca0c5-5b7d-3832-091f-67b37d4e3e6c:frag:1:2] ERROR 
o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
java.lang.NullPointerException: null
2014-12-18 15:11:54,916 [2b6ca0c5-5b7d-3832-091f-67b37d4e3e6c:frag:1:2] ERROR 
o.a.d.e.w.f.AbstractStatusReporter - Error 
798d65b7-9cfb-4276-a50a-e9bae311a7ec: Failure while running fragment.
java.lang.NullPointerException: null
2014-12-18 15:11:54,920 [2b6ca0c5-5b7d-3832-091f-67b37d4e3e6c:frag:2:0] ERROR 
o.a.d.e.p.i.p.StatusHandler - Failure while sending data to user.
org.apache.drill.exec.rpc.RpcException: Interrupted while trying to get sending 
semaphore.
at 
org.apache.drill.exec.rpc.data.DataTunnel.sendRecordBatch(DataTunnel.java:52) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.test.generated.PartitionerGen609$OutgoingRecordBatch.flush(PartitionerTemplate.java:320)
 [na:na]
at 
org.apache.drill.exec.test.generated.PartitionerGen609.flushOutgoingBatches(PartitionerTemplate.java:134)
 [na:na]
at 
org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:176)
 [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:114)
 [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
 [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_45]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_45]
at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
Caused by: java.lang.InterruptedException: null
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1301)
 ~[na:1.7.0_45]
at java.util.concurrent.Semaphore.acquire(Semaphore.java:317) 
~[na:1.7.0_45]
at 
org.apache.drill.exec.rpc.data.DataTunnel.sendRecordBatch(DataTunnel.java:49) 
[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
... 9 common frames omitted
{code}

physical plan:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select t.id, 
t.oooa.oa.oab.oabc, t.oooa.oa.oab.oabc[1].rowValue2 from `complex.json` t order 
by t.oooa.oa.oab.oabc[1].rowValue2 limit 50;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(id=[$0], EXPR$1=[$1], EXPR$2=[$2])
00-02SelectionVectorRemover
00-03  Limit(fetch=[50])
00-04SingleMergeExchange(sort0=[2 ASC])
01-01  SelectionVectorRemover
01-02TopN(limit=[50])
01-03  HashToRandomExchange(dist0=[[$2]])
02-0

[jira] [Resolved] (DRILL-1700) caught a memory assertion on an order by

2014-12-30 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-1700.
---
Resolution: Fixed

> caught a memory assertion on an order by
> 
>
> Key: DRILL-1700
> URL: https://issues.apache.org/jira/browse/DRILL-1700
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.7.0
>Reporter: Chun Chang
>Assignee: Parth Chandra
>Priority: Blocker
> Attachments: DRILL-1700.patch
>
>
> #Wed Nov 12 13:06:45 EST 2014
> git.commit.id.abbrev=1e21045
> The following query on a 1 million row data caused a memory assertion.
> 0: jdbc:drill:schema=dfs> select columns[0], columns[1], columns[2] from 
> `aggregate_1m.csv` order by cast(columns[0] as int);
> ++++
> |   EXPR$0   |   EXPR$1   |   EXPR$2   |
> ++++
> Query failed: Failure while running fragment.[ 
> 7bc42b6c-c00a-41c2-af43-331655e5a178 on qa-node119.qa.lab:31010 ]
> java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
> query.
>   at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
>   at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
>   at sqlline.SqlLine.print(SqlLine.java:1809)
> The same query adding a limit (say limit 100) worked. Here is the assertion 
> stack in drill bit log:
> 2014-11-12 15:16:22,864 [f89183d7-f94d-4ece-818f-8e597859d529:frag:0:0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
> fragment
> java.lang.AssertionError: null
> at 
> org.apache.drill.exec.memory.AtomicRemainder.get(AtomicRemainder.java:126) 
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.AtomicRemainder.forceGet(AtomicRemainder.java:85)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.Accountor.forceAdditionalReservation(Accountor.java:142)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.memory.Accountor.transferTo(Accountor.java:111) 
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at io.netty.buffer.DrillBuf.transferAccounting(DrillBuf.java:182) 
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:4.0.24.Final]
> at 
> org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.takeOwnership(TopLevelAllocator.java:192)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.takeOwnership(ExternalSortBatch.java:464)
>  
> ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1945) max() of all 0.0 value gives wrong results

2015-01-06 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1945:
-

 Summary: max() of all 0.0 value gives wrong results
 Key: DRILL-1945
 URL: https://issues.apache.org/jira/browse/DRILL-1945
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Aman Sinha


#Fri Jan 02 21:20:47 EST 2015
git.commit.id.abbrev=b491cdb

Test data can be accessed from 
https://s3.amazonaws.com/apache-drill/files/complex.json.gz

Using the above dataset, the t.sfa[0] field is all 0.0.

{code}
0: jdbc:drill:schema=dfs.drillTestDirMondrian> select t.sfa[0] from 
`complex.json` t limit 10;
++
|   EXPR$0   |
++
| 0.0|
| 0.0|
| 0.0|
| 0.0|
| 0.0|
| 0.0|
| 0.0|
| 0.0|
| 0.0|
| 0.0|
++
10 rows selected (0.083 seconds)
0: jdbc:drill:schema=dfs.drillTestDirMondrian> select t.sfa[0] from 
`complex.json` t where t.sfa[0] <> 0.0;
++
|   EXPR$0   |
++
++
No rows selected (15.616 seconds)
{code}

The max value of this field in a group by query gives wrong results.

{code}
0: jdbc:drill:schema=dfs.drillTestDirMondrian> select mod(trunc(t.sfa[1]), 10) 
sfamod, max(t.sfa[0]) sfamax from `complex.json` t group by 
mod(trunc(t.sfa[1]), 10);
+++
|   sfamod   |   sfamax   |
+++
| 1.0| 4.9E-324   |
| 2.0| 4.9E-324   |
| 3.0| 4.9E-324   |
| 4.0| 4.9E-324   |
| 5.0| 4.9E-324   |
| 6.0| 4.9E-324   |
| 7.0| 4.9E-324   |
| 8.0| 4.9E-324   |
| 9.0| 4.9E-324   |
| 0.0| 4.9E-324   |
+++
10 rows selected (14.474 seconds)
{code}

A simple max() on the field also gives the same wrong result.

{code}
0: jdbc:drill:schema=dfs.drillTestDirMondrian> select max(t.sfa[0]) sfamax from 
`complex.json` t;
++
|   sfamax   |
++
| 4.9E-324   |
++
1 row selected (15.666 seconds)
{code}
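
For reference, 4.9E-324 is Double.MIN_VALUE in Java, the smallest positive 
double, which suggests the max aggregate may be seeded with that value instead 
of negative infinity and then never updated when every input is 0.0. A minimal 
standalone sketch of that hypothesis (plain Java, not Drill's generated 
aggregate code):

{code}
// Hypothetical reproduction of the symptom, not Drill's actual aggregate code.
// Seeding a MAX accumulator with Double.MIN_VALUE (4.9E-324, the smallest
// positive double) instead of Double.NEGATIVE_INFINITY leaves the seed in
// place whenever every input value is 0.0 or negative.
public class MaxSeedDemo {
    public static void main(String[] args) {
        double[] inputs = {0.0, 0.0, 0.0};             // mirrors t.sfa[0]
        double wrongMax = Double.MIN_VALUE;            // suspected seed
        double correctMax = Double.NEGATIVE_INFINITY;  // proper identity for MAX
        for (double v : inputs) {
            wrongMax = Math.max(wrongMax, v);
            correctMax = Math.max(correctMax, v);
        }
        System.out.println(wrongMax);   // 4.9E-324, matching the reported sfamax
        System.out.println(correctMax); // 0.0, the expected result
    }
}
{code}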



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1952) Inconsistent result with function mod() on float

2015-01-07 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1952:
-

 Summary: Inconsistent result with function mod() on float
 Key: DRILL-1952
 URL: https://issues.apache.org/jira/browse/DRILL-1952
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Aman Sinha


#Fri Jan 02 21:20:47 EST 2015
git.commit.id.abbrev=b491cdb

The mod() operation on float gives inconsistent results. Test data can be accessed at 
https://s3.amazonaws.com/apache-drill/files/complex.json.gz

{code}
0: jdbc:drill:schema=dfs.drillTestDirMondrian> select t.sfa[1], mod(t.sfa[1], 
10) sfamod from `complex.json` t limit 20;
+++
|   EXPR$0   |   sfamod   |
+++
| 1.01   | 1.01   |
| 2.01   | 2.01   |
| 3.01   | 3.01   |
| 4.01   | 4.01   |
| 5.01   | 5.01   |
| 6.01   | 6.01   |
| 7.01   | 7.01   |
| 8.01   | 8.01   |
| 9.01   | 9.01   |
| 10.01  | 0.009787 |
| 11.01  | 1.0098 |
| 12.01  | 2.01   |
| 13.01  | 3.01   |
| 14.01  | 4.01   |
| 15.01  | 5.01   |
| 16.01  | 6.012 |
| 17.01  | 7.012 |
| 18.01  | 8.012 |
| 19.01  | 9.012 |
| 20.01  | 0.011563 |
+++
20 rows selected (0.112 seconds)
{code}

physical plan

{code}
0: jdbc:drill:schema=dfs.drillTestDirMondrian> explain plan for select 
t.sfa[1], mod(t.sfa[1], 10) sfamod from `complex.json` t limit 20;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(EXPR$0=[$0], sfamod=[$1])
00-02SelectionVectorRemover
00-03  Limit(fetch=[20])
00-04Project(EXPR$0=[ITEM($0, 1)], sfamod=[MOD(ITEM($0, 1), 10)])
00-05  Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`sfa`[1]], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}
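
The ragged remainders above look like ordinary binary floating-point behavior 
rather than a planning issue: 10.01 has no exact double representation, so 
10.01 % 10 comes out slightly below 0.01. A standalone Java check, independent 
of Drill:

{code}
// Standalone check of double remainder behavior; not Drill code.
public class FloatModDemo {
    public static void main(String[] args) {
        // The closest double to 10.01 is slightly below 10.01, so the
        // remainder after dividing by 10 is slightly below 0.01.
        System.out.println(10.01 % 10);              // 0.009999999999999787
        // Printing the operand at full precision shows why.
        System.out.println(new java.math.BigDecimal(10.01));
    }
}
{code}

Whether Drill should round or normalize such results for float/double mod is a 
separate question, but the values above appear consistent with plain IEEE 754 
double arithmetic.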



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1962) accessing nested array from multiple files causing IndexOutOfBoundException

2015-01-08 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1962:
-

 Summary: accessing nested array from multiple files causing 
IndexOutOfBoundException
 Key: DRILL-1962
 URL: https://issues.apache.org/jira/browse/DRILL-1962
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Operators
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin


#Wed Jan 07 18:54:07 EST 2015
git.commit.id.abbrev=35a350f

If the dataset contains a nested array of arrays, and the data is split across 
more than one file, accessing the second nested array causes an 
IndexOutOfBoundsException. For example, with the following dataset:

{code}
{
"id": 2,
"oooa": {
"oa": {
"oab": {
"oabc": [
{
"rowId": 2
},
{
"rowValue1": [{"rv1":1, "rv2":2}, {"rva1":3, "rva2":4}],
"rowValue2": [{"rw1":1, "rw2":2}, {"rwa1":3, "rwa2":4}]
}
]
}
}
}
}
{code}

If you put it in two separate files in the same directory, a query that uses a 
wildcard to access both files at the second array level causes the exception.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select 
t.oooa.oa.oab.oabc[1].rowValue1 from `jira2file/jira*.json` t;
++
|   EXPR$0   |
++
| [{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}] |
Query failed: Query failed: Failure while running fragment., index: -4, length: 
4 (expected: range(0, 16384)) [ 78235243-4f01-4ee3-9675-fc18bd1e66e3 on 
qa-node120.qa.lab:31010 ]
[ 78235243-4f01-4ee3-9675-fc18bd1e66e3 on qa-node120.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ>
{code}

The same query on a single file works.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select 
t.oooa.oa.oab.oabc[1].rowValue1 from `jira2file/jira1.json` t;
++
|   EXPR$0   |
++
| [{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}] |
++
{code}

stack trace:

{code}
2015-01-08 14:00:16,127 [2b51020e-daab-a903-ef7c-ef6f9bd606c7:frag:0:0] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.IndexOutOfBoundsException: index: -4, length: 4 (expected: range(0, 
16384))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:156) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:178) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
at io.netty.buffer.DrillBuf.getInt(DrillBuf.java:447) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
at 
org.apache.drill.exec.vector.UInt4Vector$Accessor.get(UInt4Vector.java:297) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.RepeatedMapVector$RepeatedMapAccessor.get(RepeatedMapVector.java:542)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.RepeatedMapReaderImpl.setPosition(RepeatedMapReaderImpl.java:90)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.RepeatedMapReaderImpl.setChildrenPosition(RepeatedMapReaderImpl.java:45)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.RepeatedMapReaderImpl.setPosition(RepeatedMapReaderImpl.java:96)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.test.generated.ProjectorGen4732.doEval(ProjectorTemplate.java:30)
 ~[na:na]
at 
org.apache.drill.exec.test.generated.ProjectorGen4732.projectRecords(ProjectorTemplate.java:64)
 ~[na:na]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:172)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerN

[jira] [Created] (DRILL-1964) Missing key elements in returned array of maps

2015-01-08 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1964:
-

 Summary: Missing key elements in returned array of maps
 Key: DRILL-1964
 URL: https://issues.apache.org/jira/browse/DRILL-1964
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Hanifi Gunes
Priority: Minor


#Wed Jan 07 18:54:07 EST 2015
git.commit.id.abbrev=35a350f

For an array of maps where the schema of each map is not identical, today's 
implementation is supposed to display each map with all elements (keys) from 
all maps. This is not happening. For example, I have the following data:

{code}
{
"id": 2,
"oooa": {
"oa": {
"oab": {
"oabc": [
{
"rowId": 2
},
{
"rowValue1": [{"rv1":1, "rv2":2}, {"rva1":3, "rva2":4}],
"rowValue2": [{"rw1":1, "rw2":2}, {"rwa1":3, "rwa2":4}]
}
]
}
}
}
}
{code}

The following query gives:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.oooa.oa.oab.oabc from 
`jira2file/jira1.json` t;
++
|   EXPR$0   |
++
| 
[{"rowId":2,"rowValue1":[],"rowValue2":[]},{"rowValue1":[{"rv1":1,"rv2":2},{"rva1":3,"rva2":4}],"rowValue2":[{"rw1":1,"rw2":2},{"rwa1":3,"rwa2":4}]}]
 |
++
{code}

The returned result in nicely formatted JSON form:

{code}
[
{
"rowId": 2,
"rowValue1": [],
"rowValue2": []
},
{
"rowValue1": [
{
"rv1": 1,
"rv2": 2
},
{
"rva1": 3,
"rva2": 4
}
],
"rowValue2": [
{
"rw1": 1,
"rw2": 2
},
{
"rwa1": 3,
"rwa2": 4
}
]
}
]
{code}

Notice that the first map includes all keys from all maps, but the second map 
is missing the "rowId" key.
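
A minimal sketch of the expected normalization, i.e. taking the union of keys 
across all maps in the array and filling the missing ones with empty values, is 
shown below. It uses plain Java collections, not Drill internals, and the class 
name and placeholder values are illustrative only:

{code}
import java.util.*;

// Illustrative only, not Drill internals: union the keys of every map in the
// array, then emit each map with the full key set, defaulting missing keys to
// an empty list -- the behavior described above as expected.
public class UnionKeysDemo {
    public static void main(String[] args) {
        Map<String, Object> first = new LinkedHashMap<String, Object>();
        first.put("rowId", 2);
        Map<String, Object> second = new LinkedHashMap<String, Object>();
        second.put("rowValue1", Arrays.asList("{...}", "{...}"));
        second.put("rowValue2", Arrays.asList("{...}", "{...}"));
        List<Map<String, Object>> oabc = new ArrayList<Map<String, Object>>();
        oabc.add(first);
        oabc.add(second);

        Set<String> allKeys = new LinkedHashSet<String>();
        for (Map<String, Object> m : oabc) allKeys.addAll(m.keySet());

        for (Map<String, Object> m : oabc) {
            Map<String, Object> full = new LinkedHashMap<String, Object>();
            for (String k : allKeys) {
                full.put(k, m.containsKey(k) ? m.get(k) : Collections.emptyList());
            }
            // Each emitted map now carries rowId, rowValue1, and rowValue2.
            System.out.println(full);
        }
    }
}
{code}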



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-1987) join with tons of duplicates hangs with hash join

2015-01-12 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1987:
-

 Summary: join with tons of duplicates hangs with hash join
 Key: DRILL-1987
 URL: https://issues.apache.org/jira/browse/DRILL-1987
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin


#Fri Jan 09 20:39:31 EST 2015
git.commit.id.abbrev=487d98e

With hash join enabled (the default), the following join query hangs (it has 
been running for about 30 minutes now). The join condition has mostly duplicate 
values, and each table has 1 million rows. Data can be downloaded here:

https://s3.amazonaws.com/apache-drill/files/complex.json.gz

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_mergejoin` = false;
+++
| ok |  summary   |
+++
| true   | planner.enable_mergejoin updated. |
+++
1 row selected (0.025 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_hashjoin` = true;
+++
| ok |  summary   |
+++
| true   | planner.enable_hashjoin updated. |
+++
1 row selected (0.045 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id, b.gbyi, a.str from 
`complex.json` a inner join `complex.json` b on a.gbyi=b.gbyi order by a.id 
limit 20;
++++
| id |gbyi|str |
++++
{code}
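
One thing worth noting, shown in the rough estimate below: if gbyi takes only a 
small number of distinct values over the 1 million rows (the group-by output in 
DRILL-2309 below shows gbyi taking 15 values, 0 through 14), a self-join on it 
is combinatorially huge before the TopN(20) is applied, so the "hang" may 
simply be the hash join emitting tens of billions of rows. Back-of-the-envelope 
arithmetic, assuming a uniform key distribution:

{code}
// Rough cardinality estimate only; the 15-distinct-value figure for gbyi is an
// assumption taken from group-by output on the same dataset in DRILL-2309, and
// a uniform distribution is assumed.
public class SelfJoinEstimate {
    public static void main(String[] args) {
        long rows = 1000000L;
        long distinctKeys = 15;                       // assumed
        long rowsPerKey = rows / distinctKeys;        // ~66,666
        long joinedRows = distinctKeys * rowsPerKey * rowsPerKey;
        System.out.println(joinedRows);               // ~6.7e10 rows before the limit
    }
}
{code}

If that estimate is in the right ballpark, the fact that merge join finishes 
within a minute is itself suspicious; see DRILL-2010 below, where merge join 
returns too few rows on a large dataset.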

physical plan:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select a.id, 
b.gbyi, a.str from `complex.json` a inner join `complex.json` b on 
a.gbyi=b.gbyi order by a.id limit 20;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(id=[$0], gbyi=[$1], str=[$2])
00-02SelectionVectorRemover
00-03  Limit(fetch=[20])
00-04SingleMergeExchange(sort0=[0 ASC])
01-01  SelectionVectorRemover
01-02TopN(limit=[20])
01-03  HashToRandomExchange(dist0=[[$0]])
02-01Project(id=[$1], gbyi=[$3], str=[$2])
02-02  HashJoin(condition=[=($0, $3)], joinType=[inner])
02-04HashToRandomExchange(dist0=[[$0]])
03-01  Project(gbyi=[$0], id=[$2], str=[$1])
03-02Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`gbyi`, `id`, `str`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
02-03Project(gbyi0=[$0])
02-05  HashToRandomExchange(dist0=[[$0]])
04-01Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`gbyi`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}

If I turn merge join on, the query finishes quickly, within about a minute.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_hashjoin` = false;
+++
| ok |  summary   |
+++
| true   | planner.enable_hashjoin updated. |
+++
1 row selected (0.026 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_mergejoin` = true;
+++
| ok |  summary   |
+++
| true   | planner.enable_mergejoin updated. |
+++
1 row selected (0.024 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select a.id, 
b.gbyi, a.str from `complex.json` a inner join `complex.json` b on 
a.gbyi=b.gbyi order by a.id limit 20;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(id=[$0], gbyi=[$1], str=[$2])
00-02SelectionVectorRemover
00-03  Limit(fetch=[20])
00-04SingleMergeExchange(sort0=[0 ASC])
01-01  SelectionVectorRemover
01-02TopN(limit=[20])
01-03  HashToRandomExchange(dist0=[[$0]])
02-01Project(id=[$1], gbyi=[$3], str=[$2])
02-02  MergeJoin(condition=[=($0, $3)], joinType=[inner])
02-04SelectionVectorRemover
02-06  Sort(sort0=[$0], dir0=[ASC])
02-08HashToRandomExchange(dist0=[[$0]])
03-01  Project(gbyi=[$0], id=[$2], str=[$1])
03-02Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`gbyi`, `id`, `str`], 

[jira] [Created] (DRILL-1988) join returned maps are all empty

2015-01-12 Thread Chun Chang (JIRA)
Chun Chang created DRILL-1988:
-

 Summary: join returned maps are all empty
 Key: DRILL-1988
 URL: https://issues.apache.org/jira/browse/DRILL-1988
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin


#Fri Jan 09 20:39:31 EST 2015
git.commit.id.abbrev=487d98e

For the complex JSON type, a join query returned all maps as empty. The actual 
data has empty maps for some rows, but most rows have values. Data can be 
downloaded from:

https://s3.amazonaws.com/apache-drill/files/complex.json.gz

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id, a.soa[3].str, 
b.soa[3].str, a.ooa[1].fl from `complex.json` a inner join `complex.json` b on 
a.soa[3].str=b.soa[3].str order by a.id limit 10;
+++++
| id |   EXPR$1   |   EXPR$2   |   EXPR$3   |
+++++
| 1  | here is a string at row 1 | here is a string at row 1 | {}   
  |
| 2  | here is a string at row 2 | here is a string at row 2 | {}   
  |
| 3  | here is a string at row 3 | here is a string at row 3 | {}   
  |
| 4  | here is a string at row 4 | here is a string at row 4 | {}   
  |
| 5  | here is a string at row 5 | here is a string at row 5 | {}   
  |
| 6  | here is a string at row 6 | here is a string at row 6 | {}   
  |
| 7  | here is a string at row 7 | here is a string at row 7 | {}   
  |
| 8  | here is a string at row 8 | here is a string at row 8 | {}   
  |
| 9  | here is a string at row 9 | here is a string at row 9 | {}   
  |
| 10 | here is a string at row 10 | here is a string at row 10 | {} 
|
+++++
{code}

As you can see from the following query, the map is not empty for most of the 
row IDs.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id, a.ooa[1].fl from 
`complex.json` a limit 10;
+++
| id |   EXPR$1   |
+++
| 1  | {"f1":1.6789,"f2":54331.0} |
| 2  | {} |
| 3  | {"f1":3.6789,"f2":54351.0} |
| 4  | {"f1":4.6789,"f2":54361.0} |
| 5  | {"f1":5.6789,"f2":54371.0} |
| 6  | {} |
| 7  | {"f1":7.6789,"f2":54391.0} |
| 8  | {} |
| 9  | {"f1":9.6789,"f2":54411.0} |
| 10 | {"f1":10.6789,"f2":54421.0} |
+++
{code}

physical plan:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select a.id, 
a.soa[3].str, b.soa[3].str, a.ooa[1].fl from `complex.json` a inner join 
`complex.json` b on a.soa[3].str=b.soa[3].str order by a.id limit 10;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(id=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02SelectionVectorRemover
00-03  Limit(fetch=[10])
00-04SingleMergeExchange(sort0=[0 ASC])
01-01  SelectionVectorRemover
01-02TopN(limit=[10])
01-03  HashToRandomExchange(dist0=[[$0]])
02-01Project(id=[$0], EXPR$1=[$2], EXPR$2=[$5], EXPR$3=[$3])
02-02  HashJoin(condition=[=($1, $4)], joinType=[inner])
02-04HashToRandomExchange(dist0=[[$1]])
03-01  Project(id=[$2], $f4=[ITEM(ITEM($1, 3), 'str')], 
ITEM=[ITEM(ITEM($1, 3), 'str')], ITEM3=[ITEM(ITEM($0, 1), 'fl')])
03-02Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`id`, `soa`[3].`str`, `ooa`[1].`fl`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
02-03Project($f40=[$0], ITEM0=[$1])
02-05  HashToRandomExchange(dist0=[[$0]])
04-01Project($f4=[ITEM(ITEM($0, 3), 'str')], 
ITEM=[ITEM(ITEM($0, 3), 'str')])
04-02  Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`soa`[3].`str`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2010) merge join returns wrong number of rows with large dataset

2015-01-13 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2010:
-

 Summary: merge join returns wrong number of rows with large dataset
 Key: DRILL-2010
 URL: https://issues.apache.org/jira/browse/DRILL-2010
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Operators
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin


#Mon Jan 12 18:19:31 EST 2015
git.commit.id.abbrev=5b012bf

When the data set is big enough (larger than one batch size), merge join does 
not return the correct number of rows, while hash join does. Data can be 
downloaded from:

https://s3.amazonaws.com/apache-drill/files/complex100k.json.gz

With this dataset, the following query should return 10,000,000. 

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_mergejoin` = true;
+++
| ok |  summary   |
+++
| true   | planner.enable_mergejoin updated. |
+++
1 row selected (0.024 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_hashjoin` = false;
+++
| ok |  summary   |
+++
| true   | planner.enable_hashjoin updated. |
+++
1 row selected (0.024 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(a.id) from 
`complex100k.json` a inner join `complex100k.json` b on a.gbyi=b.gbyi;
++
|   EXPR$0   |
++
| 9046760|
++
1 row selected (6.205 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_mergejoin` = false;
+++
| ok |  summary   |
+++
| true   | planner.enable_mergejoin updated. |
+++
1 row selected (0.026 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> alter session set 
`planner.enable_hashjoin` = true;
+++
| ok |  summary   |
+++
| true   | planner.enable_hashjoin updated. |
+++
1 row selected (0.024 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(a.id) from 
`complex100k.json` a inner join `complex100k.json` b on a.gbyi=b.gbyi;
++
|   EXPR$0   |
++
| 1000   |
++
1 row selected (4.453 seconds)
{code}

With a smaller dataset, both merge and hash join return the same correct number.
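
A way to cross-check the expected inner-join row count outside of the join 
operators is to sum, over the join keys, the product of the per-key row counts 
on the two sides; for this self-join that is the sum of the squared per-key 
counts. A minimal sketch (plain Java; the per-key counts would have to be 
exported first, e.g. from a group-by count, so the map below is a placeholder):

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch of an external cross-check for the self-join above: the expected row
// count is the sum over join keys of count(key) squared. The map is a
// placeholder and would be filled with the per-gbyi row counts from
// complex100k.json.
public class JoinCountCheck {
    public static void main(String[] args) {
        Map<Integer, Long> perKeyCounts = new HashMap<Integer, Long>();
        // ... populate with per-gbyi row counts ...
        long expected = 0;
        for (long c : perKeyCounts.values()) {
            expected += c * c;
        }
        // Compare against the 9,046,760 (merge join) and 10,000,000 (hash join)
        // figures above.
        System.out.println(expected);
    }
}
{code}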

physical plan for merge join:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select 
count(a.id) from `complex100k.json` a inner join `complex100k.json` b on 
a.gbyi=b.gbyi;
+++
|text|json|
+++
| 00-00Screen
00-01  StreamAgg(group=[{}], EXPR$0=[COUNT($0)])
00-02Project(id=[$1])
00-03  MergeJoin(condition=[=($0, $2)], joinType=[inner])
00-05SelectionVectorRemover
00-07  Sort(sort0=[$0], dir0=[ASC])
00-09Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex100k.json, numFiles=1, 
columns=[`gbyi`, `id`], 
files=[maprfs:/drill/testdata/complex_type/json/complex100k.json]]])
00-04Project(gbyi0=[$0])
00-06  SelectionVectorRemover
00-08Sort(sort0=[$0], dir0=[ASC])
00-10  Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex100k.json, numFiles=1, 
columns=[`gbyi`], 
files=[maprfs:/drill/testdata/complex_type/json/complex100k.json]]])
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2050) remote rpc exception

2015-01-21 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2050:
-

 Summary: remote rpc exception
 Key: DRILL-2050
 URL: https://issues.apache.org/jira/browse/DRILL-2050
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - RPC
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Jacques Nadeau


git.branch=8d1e1affe86a5adca3bc17eeaf7520f0d379a393
git.commit.time=20.01.2015 @ 23\:02\:03 PST

The following tpcds-impala-sf1 automation query caused a RemoteRpcException. 
The query works most of the time but fails on average once in every four or 
five tries. Test data can be downloaded from

http://apache-drill.s3.amazonaws.com/files/tpcds-sf1-parquet.tgz

{code}
0: jdbc:drill:schema=dfs.drillTestDir> select
. . . . . . . . . . . . . . . . . . .>   *
. . . . . . . . . . . . . . . . . . .> from
. . . . . . . . . . . . . . . . . . .>   (select
. . . . . . . . . . . . . . . . . . .> i.i_manufact_id as imid,
. . . . . . . . . . . . . . . . . . .> sum(ss.ss_sales_price) sum_sales
. . . . . . . . . . . . . . . . . . .> -- avg(sum(ss.ss_sales_price)) over 
(partition by i.i_manufact_id) avg_quarterly_sales
. . . . . . . . . . . . . . . . . . .>   from
. . . . . . . . . . . . . . . . . . .> item as i,
. . . . . . . . . . . . . . . . . . .> store_sales as ss,
. . . . . . . . . . . . . . . . . . .> date_dim as d,
. . . . . . . . . . . . . . . . . . .> store as s
. . . . . . . . . . . . . . . . . . .>   where
. . . . . . . . . . . . . . . . . . .> ss.ss_item_sk = i.i_item_sk
. . . . . . . . . . . . . . . . . . .> and ss.ss_sold_date_sk = d.d_date_sk
. . . . . . . . . . . . . . . . . . .> and ss.ss_store_sk = s.s_store_sk
. . . . . . . . . . . . . . . . . . .> and d.d_month_seq in (1212, 1212 + 
1, 1212 + 2, 1212 + 3, 1212 + 4, 1212 + 5, 1212 + 6, 1212 + 7, 1212 + 8, 1212 + 
9, 1212 + 10, 1212 + 11)
. . . . . . . . . . . . . . . . . . .> and ((i.i_category in ('Books', 
'Children', 'Electronics')
. . . . . . . . . . . . . . . . . . .>   and i.i_class in ('personal', 
'portable', 'reference', 'self-help')
. . . . . . . . . . . . . . . . . . .>   and i.i_brand in 
('scholaramalgamalg #14', 'scholaramalgamalg #7', 'exportiunivamalg #9', 
'scholaramalgamalg #9'))
. . . . . . . . . . . . . . . . . . .> or (i.i_category in ('Women', 
'Music', 'Men')
. . . . . . . . . . . . . . . . . . .>   and i.i_class in ('accessories', 
'classical', 'fragrances', 'pants')
. . . . . . . . . . . . . . . . . . .>   and i.i_brand in ('amalgimporto 
#1', 'edu packscholar #1', 'exportiimporto #1', 'importoamalg #1')))
. . . . . . . . . . . . . . . . . . .> and ss.ss_sold_date_sk between 
2451911 and 2452275 -- partition key filter
. . . . . . . . . . . . . . . . . . .>   group by
. . . . . . . . . . . . . . . . . . .> i.i_manufact_id,
. . . . . . . . . . . . . . . . . . .> d.d_qoy
. . . . . . . . . . . . . . . . . . .>   ) tmp1
. . . . . . . . . . . . . . . . . . .> -- where
. . . . . . . . . . . . . . . . . . .> --   case when avg_quarterly_sales > 0 
then abs (sum_sales - avg_quarterly_sales) / avg_quarterly_sales else null end 
> 0.1
. . . . . . . . . . . . . . . . . . .> order by
. . . . . . . . . . . . . . . . . . .>   -- avg_quarterly_sales,
. . . . . . . . . . . . . . . . . . .>   sum_sales,
. . . . . . . . . . . . . . . . . . .>   tmp1.imid
. . . . . . . . . . . . . . . . . . .> limit 100;
+++
|imid| sum_sales  |
+++
Query failed: RemoteRpcException: Failure while running fragment., Attempted to 
close accountor with 1 buffer(s) still allocatedfor QueryId: 
2b401913-4292-26d4-18b0-f105afe06121, MajorFragmentId: 2, MinorFragmentId: 0.


Total 1 allocation(s) of byte size(s): 771, at stack location:

org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.takeOwnership(TopLevelAllocator.java:197)

org.apache.drill.exec.rpc.data.DataServer.handle(DataServer.java:119)

org.apache.drill.exec.rpc.data.DataServer.handle(DataServer.java:48)

org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:194)

org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:173)

io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)

io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)

io

[jira] [Resolved] (DRILL-1804) random failures while running large number of queries

2015-01-21 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-1804.
---
Resolution: Fixed

This appears to be fixed by various commits. Automation passes now.

> random failures while running large number of queries
> -
>
> Key: DRILL-1804
> URL: https://issues.apache.org/jira/browse/DRILL-1804
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.7.0
>Reporter: Chun Chang
>Assignee: Chris Westin
>Priority: Blocker
> Fix For: 0.8.0
>
>
> #Tue Dec 02 14:38:34 EST 2014
> git.commit.id.abbrev=757e9a2
> Running Mondrian regression tests, out of over 6000 queries, sometimes I get 
> one or two random failures. Here is the stack when it happens:
> 2014-12-02 17:49:32,271 [2b8193d3-f0ca-aa7c-094a-d8234d76d068:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - Error 
> aeae057b-ed0a-43aa-902d-fe3a41531511: Query failed: Unexpected exception 
> during fragment initialization.
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization.
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
>  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. 
> Failure while accessing Zookeeper
>   at 
> org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 4 common frames omitted
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>   at 
> org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) 
> ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 10 common frames omitted
> Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: 
> KeeperErrorCode = NodeExists for 
> /drill/running/2b8193d3-f0ca-aa7c-094a-d8234d76d068
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) 
> ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) 
> ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676)
>  ~[curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660)
>  ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) 
> ~[curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656)
>  ~[curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441)
>  ~[curator-framework-2.5.0.jar:na]
>

[jira] [Created] (DRILL-2082) nested arrays of strings returned wrong results

2015-01-27 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2082:
-

 Summary: nested arrays of strings returned wrong results
 Key: DRILL-2082
 URL: https://issues.apache.org/jira/browse/DRILL-2082
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill/MapR)
Priority: Critical


#Mon Jan 26 14:10:51 PST 2015
git.commit.id.abbrev=3c6d0ef

Querying a nested array of strings (complex JSON data type) returned wrong 
results when the data size is large (1 million rows). A smaller data size (a 
few rows) returned correct results. Test data can be accessed at 
http://apache-drill.s3.amazonaws.com/files/complex.json.gz

For the small data size, I got correct results:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.aaa from 
`aaa.json` t;
+++
| id |aaa |
+++
| 1  | [[["aa0 1"],["ab0 1"]],[["ba0 1"],["bb0 1"]],[["ca0 1","ca1 
1"],["cb0 1","cb1 1","cb2 1"]]] |
| 2  | [[["aa0 2"],["ab0 2"]],[["ba0 2"],["bb0 2"]],[["ca0 2","ca1 
2"],["cb0 2","cb1 2","cb2 2"]]] |
+++
{code}

But the large data size returned wrong results:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id, t.aaa from 
`complex.json` t where t.id=1 limit 1;
+++
| id |aaa |
+++
| 1  | [[["ba0 56"],["bb0 56"],["ca0 56","ca1 56"],["cb0 56","cb1 
56","cb2 56"],["aa0 91"],["ab0 91"],["aa0 125"],["ab0 125"],["aa0 140"],["ab0 
140"],["aa0 142"],["ab0 142"],["aa0 146"],["ab0 146"],["ba0 402"],["bb0 
402"],["ca0 402","ca1 402"],["cb0 402","cb1 402","cb2 402"],["aa0 403"],["ab0 
403"],["ba0 403"],["bb0 403"],["ca0 403","ca1 403"],["cb0 403","cb1 403","cb2 
403"],["aa0 404"],["ab0 404"],["ba0 404"],["bb0 404"],["ca0 404","ca1 
404"],["cb0 404","cb1 404","cb2 404"],["aa0 405"],["ab0 405"],["ba0 405"],["bb0 
405"],["ca0 405","ca1 405"],["cb0 405","cb1 405","cb2 405"],["aa0 437"],["ab0 
437"],["aa0 485"],["ab0 485"],["aa0 503"],["ab0 503"],["aa0 569"],["ab0 
569"],["aa0 581"],["ab0 581"],["aa0 620"],["ab0 620"],["aa0 632"],["ab0 
632"],["aa0 640"],["ab0 640"],["aa0 650"],["ab0 650"],["aa0 669"],["ab0 
669"],["aa0 671"],["ab0 671"],["aa0 728"],["ab0 728"],["aa0 735"],["ab0 
735"],["aa0 772"],["ab0 772"],["aa0 784"],["ab0 784"],["aa0 811"],["ab0 
811"],["aa0 817"],["ab0 817"],["aa0 836"],["ab0 836"],["aa0 881"],["ab0 
881"],["aa0 891"],["ab0 891"],["aa0 924"],["ab0 924"],["aa0 1005"],["ab0 
1005"],["aa0 1057"],["ab0 1057"],["aa0 1086"],["ab0 1086"],["aa0 1089"],["ab0 
1089"],["aa0 1097"],["ab0 1097"],["aa0 1133"],["ab0 1133"],["aa0 1136"],["ab0 
1136"],["aa0 1146"],["ab0 1146"],["aa0 1169"],["ab0 1169"],["aa0 1178"],["ab0 
1178"],["aa0 1184"],["ab0 1184"],["aa0 1189"],["ab0 1189"],["aa0 1223"],["ab0 
1223"],["aa0 1275"],["ab0 1275"],["aa0 1290"],["ab0 1290"],["aa0 1295"],["ab0 
1295"],["aa0 1320"],["ab0 1320"],["aa0 1343"],["ab0 1343"],["aa0 1400"],["ab0 
1400"],["aa0 1426"],["ab0 1426"],["aa0 1442"],["ab0 1442"],["aa0 1455"],["ab0 
1455"],["aa0 1499"],["ab0 1499"],["aa0 1521"],["ab0 1521"],["aa0 1541"],["ab0 
1541"],["aa0 1557"],["ab0 1557"],["aa0 1578"

[jira] [Created] (DRILL-2083) order by on large dataset returns wrong results

2015-01-27 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2083:
-

 Summary: order by on large dataset returns wrong results
 Key: DRILL-2083
 URL: https://issues.apache.org/jira/browse/DRILL-2083
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Operators
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin
Priority: Critical


#Mon Jan 26 14:10:51 PST 2015
git.commit.id.abbrev=3c6d0ef

Test data has 1 million rows and can be accessed at 

http://apache-drill.s3.amazonaws.com/files/complex.json.gz

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count (t.id) from 
`complex.json` t;
++
|   EXPR$0   |
++
| 100|
++
{code}

But the order by query returned 30 extra rows.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.id from `complex.json` 
t order by t.id;

| 97 |
| 98 |
| 99 |
| 100|
++
1,000,030 rows selected (19.449 seconds)
{code}

physical plan

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select t.id 
from `complex.json` t order by t.id;
+++
|text|json|
+++
| 00-00Screen
00-01  SingleMergeExchange(sort0=[0 ASC])
01-01SelectionVectorRemover
01-02  Sort(sort0=[$0], dir0=[ASC])
01-03HashToRandomExchange(dist0=[[$0]])
02-01  Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`id`], files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2166) left join with complex type throw ClassTransformationException

2015-02-04 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2166:
-

 Summary: left join with complex type throw 
ClassTransformationException
 Key: DRILL-2166
 URL: https://issues.apache.org/jira/browse/DRILL-2166
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin


#Thu Jan 29 18:00:57 EST 2015
git.commit.id.abbrev=09f7fb2

Dataset can be downloaded from 
https://s3.amazonaws.com/apache-drill/files/complex.json.gz

The following query caused the exception:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id, a.soa, b.sfa[0], 
b.soa[1] from `complex.json` a left outer join `complex.json` b on 
a.sia[0]=b.sia[0] order by a.id limit 20;
Query failed: RemoteRpcException: Failure while running fragment., Line 35, 
Column 32: No applicable constructor/method found for actual parameters "int, 
int, org.apache.drill.exec.vector.complex.MapVector"; candidates are: "public 
void org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
org.apache.drill.exec.vector.NullableTinyIntVector)", "public void 
org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
org.apache.drill.exec.vector.TinyIntVector)" [ 
fbf47be8-b5fe-4d56-9488-15d45d4224e4 on qa-node117.qa.lab:31010 ]
[ fbf47be8-b5fe-4d56-9488-15d45d4224e4 on qa-node117.qa.lab:31010 ]


Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

Stack trace from drillbit.log:

{code}
2015-02-04 13:37:22,117 [2b2d6eee-105b-5544-9111-83a3a356285d:frag:2:6] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
org.apache.drill.common.exceptions.DrillRuntimeException: 
org.apache.drill.exec.exception.SchemaChangeException: 
org.apache.drill.exec.exception.ClassTransformationException: 
java.util.concurrent.ExecutionException: 
org.apache.drill.exec.exception.ClassTransformationException: Failure 
generating transformation classes for value:

package org.apache.drill.exec.test.generated;

import org.apache.drill.exec.exception.SchemaChangeException;
import org.apache.drill.exec.ops.FragmentContext;
import org.apache.drill.exec.record.RecordBatch;
import org.apache.drill.exec.record.VectorContainer;
import org.apache.drill.exec.vector.NullableBigIntVector;
import org.apache.drill.exec.vector.NullableFloat8Vector;
import org.apache.drill.exec.vector.NullableTinyIntVector;
import org.apache.drill.exec.vector.complex.MapVector;
import org.apache.drill.exec.vector.complex.RepeatedMapVector;
{code}

From the foreman drillbit.log:
{code}
2015-02-04 13:37:22,189 [BitServer-5] ERROR o.a.d.exec.rpc.RpcExceptionHandler 
- Exception in pipeline.  Closing channel between local /10.10.100.117:31012 
and remote /10.10.100.120:56250
io.netty.handler.codec.DecoderException: java.lang.NullPointerException
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
 [netty-codec-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
 [netty-codec-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:161)
 [netty-codec-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
 [netty-transport-4.0.24.Final.jar:4.0.24.Final]
at 
io.netty.channel.Defau

[jira] [Created] (DRILL-2197) hit no applicable constructor error running outer join involving list of maps

2015-02-09 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2197:
-

 Summary: hit no applicable constructor error running outer join 
involving list of maps
 Key: DRILL-2197
 URL: https://issues.apache.org/jira/browse/DRILL-2197
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Chris Westin


#Mon Feb 09 15:58:57 EST 2015
git.commit.id.abbrev=3d863b5

Dataset can be downloaded from 
https://s3.amazonaws.com/apache-drill/files/complex.json.gz

The following query caused the error:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id, b.oooi.oa.oab.oabc, 
b.ooof.oa.oab from `complex.json` a left outer join `complex.json` b on 
a.soa[2].fl=b.soa[2].fl order by a.id limit 20;
Query failed: RemoteRpcException: Failure while running fragment., Line 109, 
Column 32: No applicable constructor/method found for actual parameters "int, 
int, org.apache.drill.exec.vector.complex.MapVector"; candidates are: "public 
void org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
org.apache.drill.exec.vector.NullableTinyIntVector)", "public void 
org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
org.apache.drill.exec.vector.TinyIntVector)" [ 
73387a1a-1c50-4b1e-862e-e03772e7a291 on qa-node119.qa.lab:31010 ]
[ 73387a1a-1c50-4b1e-862e-e03772e7a291 on qa-node119.qa.lab:31010 ]


Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

physical plan:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select a.id, 
b.oooi.oa.oab.oabc, b.ooof.oa.oab from `complex.json` a left outer join 
`complex.json` b on a.soa[2].fl=b.soa[2].fl order by a.id limit 20;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(id=[$0], EXPR$1=[$1], EXPR$2=[$2])
00-02SelectionVectorRemover
00-03  Limit(fetch=[20])
00-04SingleMergeExchange(sort0=[0 ASC])
01-01  SelectionVectorRemover
01-02TopN(limit=[20])
01-03  HashToRandomExchange(dist0=[[$0]])
02-01Project(id=[$0], EXPR$1=[$3], EXPR$2=[$4])
02-02  HashJoin(condition=[=($1, $2)], joinType=[left])
02-04HashToRandomExchange(dist0=[[$1]])
03-01  Project(id=[$1], $f5=[ITEM(ITEM($0, 2), 'fl')])
03-02Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`id`, `soa`[2].`fl`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
02-03Project($f50=[$0], ITEM=[$1], ITEM2=[$2])
02-05  HashToRandomExchange(dist0=[[$0]])
04-01Project($f5=[ITEM(ITEM($0, 2), 'fl')], 
ITEM=[ITEM(ITEM(ITEM($2, 'oa'), 'oab'), 'oabc')], ITEM2=[ITEM(ITEM($1, 'oa'), 
'oab')])
04-02  Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`soa`[2].`fl`, `oooi`.`oa`.`oab`.`oabc`, `ooof`.`oa`.`oab`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}

log:

{code}
2015-02-09 17:16:17,736 [BitServer-1] INFO  o.a.drill.exec.work.foreman.Foreman 
- State change requested.  RUNNING --> FAILED
org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., 
Line 109, Column 32: No applicable constructor/method found for actual 
parameters "int, int, org.apache.drill.exec.vector.complex.MapVector"; 
candidates are: "public void 
org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
org.apache.drill.exec.vector.NullableTinyIntVector)", "public void 
org.apache.drill.exec.vector.NullableTinyIntVector.copyFromSafe(int, int, 
org.apache.drill.exec.vector.TinyIntVector)" [ 
73387a1a-1c50-4b1e-862e-e03772e7a291 on qa-node119.qa.lab:31010 ]
[ 73387a1a-1c50-4b1e-862e-e03772e7a291 on qa-node119.qa.lab:31010 ]

at 
org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:95)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.control.WorkEventBus.status(WorkEventBus.java:75) 
[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.work.batch.ControlHandlerImpl.handle(ControlHandlerImpl.java:82)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.rpc.control.ControlServer.handle(ControlServer.java:60) 
[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.

[jira] [Created] (DRILL-2201) clear error message on join on complex type

2015-02-10 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2201:
-

 Summary: clear error message on join on complex type
 Key: DRILL-2201
 URL: https://issues.apache.org/jira/browse/DRILL-2201
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill/MapR)
Priority: Minor


#Mon Feb 09 15:58:57 EST 2015
git.commit.id.abbrev=3d863b5

Dataset can be downloaded from 
https://s3.amazonaws.com/apache-drill/files/complex.json.gz

We do not support join conditions on complex types, but the error message is 
not clear to the end user.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select a.id from `complex.json` 
a left outer join `complex.json` b on a.oooa=b.oooa;
Query failed: RemoteRpcException: Failure while running fragment., Failure 
while trying to materialize incoming schema.  Errors:

Error in expression at index 0.  Error: Missing function implementation: 
[hash(MAP-REQUIRED)].  Full expression: null.. [ 
6a61d61f-670f-4ddc-bb1d-09a47f49f38e on qa-node120.qa.lab:31010 ]
[ 6a61d61f-670f-4ddc-bb1d-09a47f49f38e on qa-node120.qa.lab:31010 ]


Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

Here oooa is a complex type:

{code}
{
"oooa": {
"oa": {
"oab": {
"oabc": [
{
"rowId": 1
},
{
"rowValue1": 1,
"rowValue2": 1
}
]
}
}
}
}
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2237) IllegalStateException with subquery on columns containing nulls

2015-02-12 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2237:
-

 Summary: IllegalStateException with subquery on columns containing 
nulls
 Key: DRILL-2237
 URL: https://issues.apache.org/jira/browse/DRILL-2237
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill/MapR)


#Mon Feb 09 15:58:57 EST 2015
git.commit.id.abbrev=3d863b5

The 'nul' column contains either null or string values.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.nul from `complex.json` 
t limit 5;
++
|nul |
++
| not null   |
| null   |
| not null   |
| not null   |
| not null   |
++
{code}

The following query caused the exception.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select 
t.nul from `complex.json` t) tt;
++
|   EXPR$0   |
++
Query failed: RemoteRpcException: Failure while running fragment., You tried to 
do a batch data read operation when you were in a state of STOP.  You can only 
do this type of operation when you are in a state of OK or OK_NEW_SCHEMA. [ 
a95efdb4-4ef9-4497-baaf-6401a3ec3a4e on qa-node119.qa.lab:31010 ]
[ a95efdb4-4ef9-4497-baaf-6401a3ec3a4e on qa-node119.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)
{code}

log
{code}
2015-02-12 15:26:30,714 [2b22c958-ff5a-94d7-2f82-04221e8ea55c:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
RUNNING
2015-02-12 15:26:30,749 [2b22c958-ff5a-94d7-2f82-04221e8ea55c:frag:0:0] ERROR 
o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
java.lang.IllegalStateException: Needed to be in state INIT or IN_FLOAT8 but in 
mode IN_BIGINT
at 
org.apache.drill.exec.vector.complex.impl.SingleListWriter.float8(SingleListWriter.java:427)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:418)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:256)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch(JsonReader.java:208)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector(JsonReader.java:182)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.write(JsonReader.java:156) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:125)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:165) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.Abstr

[jira] [Created] (DRILL-2290) Very slow performance for a query involving nested map

2015-02-24 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2290:
-

 Summary: Very slow performance for a query involving nested map
 Key: DRILL-2290
 URL: https://issues.apache.org/jira/browse/DRILL-2290
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)


#Thu Feb 19 18:40:10 EST 2015
git.commit.id.abbrev=1ceddff

This query took 17 minutes to complete, which is too long. I think this 
regression appeared after the fix dealing with nested maps.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select b.id, a.ooa[1].fl.f1, 
b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner join `complex.json` b on 
a.ooa[1].fl.f1=b.ooa[1].fl.f1 order by b.id limit 20;
+++++
| id |   EXPR$1   |oooi|   EXPR$3   |
+++++
| 1  | 1.6789 | {"oa":{"oab":{"oabc":1}}} | 1.5678 |
| 3  | 3.6789 | {"oa":{"oab":{"oabc":3}}} | 3.5678 |
| 4  | 4.6789 | {"oa":{"oab":{"oabc":4}}} | 4.5678 |
| 5  | 5.6789 | {"oa":{"oab":{"oabc":5}}} | 5.5678 |
| 7  | 7.6789 | {"oa":{"oab":{"oabc":7}}} | 7.5678 |
| 9  | 9.6789 | {"oa":{"oab":{"oabc":9}}} | 9.5678 |
| 10 | 10.6789| {"oa":{"oab":{"oabc":10}}} | 10.5678|
| 11 | 11.6789| {"oa":{"oab":{"oabc":11}}} | 11.5678|
| 13 | 13.6789| {"oa":{"oab":{"oabc":13}}} | 13.5678|
| 14 | 14.6789| {"oa":{"oab":{"oabc":14}}} | 14.5678|
| 15 | 15.6789| {"oa":{"oab":{"oabc":15}}} | 15.5678|
| 16 | 16.6789| {"oa":{"oab":{"oabc":16}}} | 16.5678|
| 17 | 17.6789| {"oa":{"oab":{"oabc":17}}} | 17.5678|
| 18 | 18.6789| {"oa":{"oab":{"oabc":18}}} | 18.5678|
| 19 | 19.6789| {"oa":{"oab":{"oabc":19}}} | 19.5678|
| 20 | 20.6789| {"oa":{"oab":{"oabc":20}}} | 20.5678|
| 21 | 21.6789| {"oa":{"oab":{"oabc":21}}} | 21.5678|
| 22 | 22.6789| {"oa":{"oab":{"oabc":22}}} | 22.5678|
| 24 | 24.6789| {"oa":{"oab":{"oabc":24}}} | 24.5678|
| 25 | 25.6789| {"oa":{"oab":{"oabc":25}}} | 25.5678|
+++++
20 rows selected (1020.036 seconds)
{code}

The query deals with just under 1 million records (the count query below 
produced 900,190 joined rows in about 700 seconds, roughly 0.8 ms per row), so 
it should not be this slow.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select 
b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner 
join `complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1) c;
++
|   EXPR$0   |
++
| 900190 |
++
1 row selected (700.516 seconds)
{code}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2309) 'null' is counted with subquery

2015-02-24 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2309:
-

 Summary: 'null' is counted with subquery
 Key: DRILL-2309
 URL: https://issues.apache.org/jira/browse/DRILL-2309
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)
Priority: Critical


#Thu Feb 19 18:40:10 EST 2015
git.commit.id.abbrev=1ceddff

The following query returns the correct count for a column that contains null 
values.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select tt.gbyi, count(tt.nul) 
from (select t.id, t.gbyi, t.fl, t.nul from `complex.json` t) tt group by 
tt.gbyi order by tt.gbyi;
+++
|gbyi|   EXPR$1   |
+++
| 0  | 33580  |
| 1  | 33317  |
| 2  | 33438  |
| 3  | 33535  |
| 4  | 33369  |
| 5  | 32990  |
| 6  | 33661  |
| 7  | 33130  |
| 8  | 33362  |
| 9  | 33364  |
| 10 | 33229  |
| 11 | 33567  |
| 12 | 33379  |
| 13 | 33045  |
| 14 | 33305  |
+++
{code}

But if you add more aggregations to the query, the returned count is wrong 
(pay attention to the last column).

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select tt.gbyi, sum(tt.id), 
avg(tt.fl), count(tt.nul) from (select t.id, t.gbyi, t.fl, t.nul from 
`complex.json` t) tt group by tt.gbyi order by tt.gbyi;
+++++
|gbyi|   EXPR$1   |   EXPR$2   |   EXPR$3   |
+++++
| 0  | 33445554017 | 499613.0956877819 | 66943  |
| 1  | 33209358334 | 500760.0252919893 | 66318  |
| 2  | 33369118041 | 498091.82200273 | 66994  |
| 3  | 33254533860 | 498696.5063226428 | 66683  |
| 4  | 33393965595 | 501125.64656145993 | 66638  |
| 5  | 33216885506 | 499961.32710397616 | 66439  |
| 6  | 33380205950 | 498875.3923256599 | 66911  |
| 7  | 33405849390 | 501093.43067788356 | 6  |
| 8  | 33136951190 | 498458.1044031481 | 66479  |
| 9  | 33319291474 | 499967.5392457864 | 66643  |
| 10 | 937 | 499190.47462408233 | 66787  |
| 11 | 33571590550 | 502095.86682194035 | 66863  |
| 12 | 33437342090 | 501708.8141502653 | 66647  |
| 13 | 33071800925 | 498896.453904129 | 66290  |
| 14 | 33448664191 | 501487.4206955959 | 66699  |
+++++
{code}

Physical plan for the query that returned the wrong result (note that in this 
plan EXPR$3 is computed as COUNT(), a count of all rows, rather than a count 
over the nul column):

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> explain plan for select tt.gbyi, 
sum(tt.id), avg(tt.fl), count(tt.nul) from (select t.id, t.gbyi, t.fl, t.nul 
from `complex.json` t) tt group by tt.gbyi order by tt.gbyi;
+++
|text|json|
+++
| 00-00Screen
00-01  Project(gbyi=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3])
00-02SingleMergeExchange(sort0=[0 ASC])
01-01  SelectionVectorRemover
01-02Sort(sort0=[$0], dir0=[ASC])
01-03  Project(gbyi=[$0], EXPR$1=[CASE(=($2, 0), null, $1)], 
EXPR$2=[CAST(/(CastHigh(CASE(=($4, 0), null, $3)), $4)):ANY], EXPR$3=[$5])
01-04HashAgg(group=[{0}], agg#0=[$SUM0($1)], agg#1=[$SUM0($2)], 
agg#2=[$SUM0($3)], agg#3=[$SUM0($4)], EXPR$3=[$SUM0($5)])
01-05  HashToRandomExchange(dist0=[[$0]])
02-01HashAgg(group=[{0}], agg#0=[$SUM0($1)], 
agg#1=[COUNT($1)], agg#2=[$SUM0($2)], agg#3=[COUNT($2)], EXPR$3=[COUNT()])
02-02  Project(gbyi=[$3], id=[$2], fl=[$1], nul=[$0])
02-03Scan(groupscan=[EasyGroupScan 
[selectionRoot=/drill/testdata/complex_type/json/complex.json, numFiles=1, 
columns=[`gbyi`, `id`, `fl`, `nul`], 
files=[maprfs:/drill/testdata/complex_type/json/complex.json]]])
{code}
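
Reading the plan above, the leaf HashAgg (02-01) computes EXPR$3=[COUNT()], a 
count of every row in the group, rather than a count of the non-null `nul` 
values. That lines up with the inflated numbers: each group appears to hold 
roughly 66-67K rows, while only about half have a non-null `nul`. A minimal 
illustration of the count(col) vs count(*) distinction, written in generic 
ANSI SQL and not verified on Drill 0.8:

{code}
-- count(col) skips NULLs, count(*) counts every row (generic ANSI SQL sketch)
with t(nul) as (values (cast(null as integer)), (1), (2))
select count(nul) as cnt_non_null, count(*) as cnt_all_rows from t;
-- expected: cnt_non_null = 2, cnt_all_rows = 3
{code}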




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2340) count(*) fails with subquery containing limit

2015-02-27 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2340:
-

 Summary: count(*) fails with subquery containing limit
 Key: DRILL-2340
 URL: https://issues.apache.org/jira/browse/DRILL-2340
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)


#Wed Feb 25 17:07:31 EST 2015
git.commit.id.abbrev=f7ef5ec

count(*) with a subquery containing a limit works fine:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select 
t.soa[0] soa0, t.soa[1] soa1, t.soa[2] soa2 from `complex.json` t limit 20) 
tt;
++
|   EXPR$0   |
++
| 20 |
++
{code}

But if I remove the limit, the query fails with an IllegalStateException:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select 
t.soa[0] soa0, t.soa[1] soa1, t.soa[2] soa2 from `complex.json` t) tt;
++
|   EXPR$0   |
++
Query failed: RemoteRpcException: Failure while running fragment., You tried to 
do a batch data read operation when you were in a state of STOP.  You can only 
do this type of operation when you are in a state of OK or OK_NEW_SCHEMA. [ 
d3226020-a2b0-4497-948f-34ea2309ddb7 on qa-node120.qa.lab:31010 ]
[ d3226020-a2b0-4497-948f-34ea2309ddb7 on qa-node120.qa.lab:31010 ]


java.lang.RuntimeException: java.sql.SQLException: Failure while executing 
query.
at sqlline.SqlLine$IncrementalRows.hasNext(SqlLine.java:2514)
at sqlline.SqlLine$TableOutputFormat.print(SqlLine.java:2148)
at sqlline.SqlLine.print(SqlLine.java:1809)
at sqlline.SqlLine$Commands.execute(SqlLine.java:3766)
at sqlline.SqlLine$Commands.sql(SqlLine.java:3663)
at sqlline.SqlLine.dispatch(SqlLine.java:889)
at sqlline.SqlLine.begin(SqlLine.java:763)
at sqlline.SqlLine.start(SqlLine.java:498)
at sqlline.SqlLine.main(SqlLine.java:460)
{code}

Here is the exception in drillbit.log:

{code}
2015-02-27 14:17:32,247 [2b0f1303-61ec-2350-4b62-b6b29d11c534:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
RUNNING
2015-02-27 14:17:32,267 [2b0f1303-61ec-2350-4b62-b6b29d11c534:frag:0:0] ERROR 
o.a.drill.exec.ops.FragmentContext - Fragment Context received failure.
java.lang.IllegalStateException: Needed to be in state INIT or IN_FLOAT8 but in 
mode IN_BIGINT
at 
org.apache.drill.exec.vector.complex.impl.SingleListWriter.float8(SingleListWriter.java:427)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:418)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:256)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch(JsonReader.java:208)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector(JsonReader.java:182)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.fn.JsonReader.write(JsonReader.java:156) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:125)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:165) 
~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.
{code}
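
The state names in the exception (INIT/IN_FLOAT8 vs IN_BIGINT) suggest the list 
writer hit an array whose numeric elements switch representation partway through 
the file, and that with limit 20 the reader stops before reaching the switch. 
That reading is an assumption, not something confirmed here. A hypothetical 
reproduction sketch; the file name and data are invented for illustration:

{code}
-- hypothetical mixed_soa.json (NOT the original complex.json):
--   {"soa": [1, 2, 3]}
--   ... many records later ...
--   {"soa": [1.5, 2.5, 3.5]}
select count(*) from (select t.soa[0] soa0, t.soa[1] soa1
  from `mixed_soa.json` t) tt;
{code}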

[jira] [Created] (DRILL-2348) 'null' is not treated correctly when compared with int

2015-02-27 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2348:
-

 Summary: 'null' is not treated correctly when compared with int
 Key: DRILL-2348
 URL: https://issues.apache.org/jira/browse/DRILL-2348
 Project: Apache Drill
  Issue Type: Bug
Reporter: Chun Chang
Priority: Critical


#Wed Feb 25 17:07:31 EST 2015
git.commit.id.abbrev=f7ef5ec

Dataset can be downloaded from 
https://s3.amazonaws.com/apache-drill/files/complex.json.gz

The following three query results do not add up.

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(tt.gbyi) from 
(select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
`complex.json` t) tt where tt.ooa0.`in` <> tt.ooa1.`in`;
++
|   EXPR$0   |
++
++
No rows selected (22.952 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(tt.gbyi) from 
(select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
`complex.json` t) tt where tt.ooa0.`in` = tt.ooa1.`in`;
++
|   EXPR$0   |
++
| 949954 |
++
1 row selected (23.053 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(tt.gbyi) from 
(select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
`complex.json` t) tt;
++
|   EXPR$0   |
++
| 100|
++
1 row selected (13.242 seconds)
{code}

Without any comparison condition, the total count is 1,000,000, which is 
correct. But the two query results with <> and = do not add up to the total. 
I am not sure whether this has anything to do with subqueries over complex 
types. Will investigate more.
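
One explanation consistent with these numbers is standard three-valued logic: 
when either tt.ooa0.`in` or tt.ooa1.`in` is NULL, both the = and the <> 
predicate evaluate to NULL and the row is excluded from both counts. A hedged 
way to check where the remaining 1,000,000 - 949,954 = 50,046 rows go (a 
suggestion only, not run for this report):

{code}
-- count the rows excluded by both the = and the <> predicate
select count(tt.gbyi) from
(select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
`complex.json` t) tt where tt.ooa0.`in` is null or tt.ooa1.`in` is null;
-- if three-valued logic is the whole story, this should return 50046
{code}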



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2348) 'null' is not treated correctly when compared with int

2015-02-27 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang resolved DRILL-2348.
---
Resolution: Not a Problem

> 'null' is not treated correctly when compared with int
> --
>
> Key: DRILL-2348
> URL: https://issues.apache.org/jira/browse/DRILL-2348
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Chun Chang
>Assignee: Chun Chang
>Priority: Critical
>
> #Wed Feb 25 17:07:31 EST 2015
> git.commit.id.abbrev=f7ef5ec
> Dataset can be downloaded from 
> https://s3.amazonaws.com/apache-drill/files/complex.json.gz
> The following three query results do not add up.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(tt.gbyi) from 
> (select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
> `complex.json` t) tt where tt.ooa0.`in` <> tt.ooa1.`in`;
> ++
> |   EXPR$0   |
> ++
> ++
> No rows selected (22.952 seconds)
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(tt.gbyi) from 
> (select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
> `complex.json` t) tt where tt.ooa0.`in` = tt.ooa1.`in`;
> ++
> |   EXPR$0   |
> ++
> | 949954 |
> ++
> 1 row selected (23.053 seconds)
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(tt.gbyi) from 
> (select t.gbyi gbyi, t.ooa[0] ooa0, t.ooa[1] ooa1, t.ooa[2] ooa2 from 
> `complex.json` t) tt;
> ++
> |   EXPR$0   |
> ++
> | 100|
> ++
> 1 row selected (13.242 seconds)
> {code}
> Without any comparison condition, the total count is 1,000,000, which is 
> correct. But the two query results with <> and = do not add up to the 
> total. I am not sure whether this has anything to do with subqueries over 
> complex types. Will investigate more.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2349) null comparison behavior difference between drill and postgres

2015-02-27 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2349:
-

 Summary: null comparison behavior difference between drill and 
postgres
 Key: DRILL-2349
 URL: https://issues.apache.org/jira/browse/DRILL-2349
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)


It appears Postgres returns null for the following comparisons:

{code}
foodmart=# select null=1;
 ?column?
--

(1 row)

foodmart=# select null <> 1;
 ?column?
--

(1 row)

foodmart=# select null;
 ?column?
--

(1 row)
{code}

Drill gives a SqlValidatorException:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select null=1 from sys.drillbits;
Query failed: SqlValidatorException: Cannot apply '=' to arguments of type 
' = '. Supported form(s): ' = '

Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}
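
For reference, the Postgres behavior is standard three-valued logic: the 
comparison evaluates to NULL rather than FALSE. A short sketch making that 
explicit, plus a typed NULL literal of the kind validators generally expect; 
whether Drill's validator accepts the CAST form is an assumption that was not 
verified here:

{code}
-- Postgres: the comparison yields NULL, not FALSE
select (null = 1) is null;           -- returns true
-- typed NULL literal (generic SQL; acceptance by Drill 0.8 not verified)
select cast(null as integer) = 1;    -- returns NULL
{code}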



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2366) Query nested lists causing IllegalArgumentException

2015-03-03 Thread Chun Chang (JIRA)
Chun Chang created DRILL-2366:
-

 Summary: Query nested lists causing IllegalArgumentException
 Key: DRILL-2366
 URL: https://issues.apache.org/jira/browse/DRILL-2366
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.8.0
Reporter: Chun Chang
Assignee: Daniel Barclay (Drill)


#Wed Feb 25 17:07:31 EST 2015
git.commit.id.abbrev=f7ef5ec

I have a nested array in a JSON file that looks like this:

{code}
 "aaa":[[["aa0 1"], ["ab0 1"]], [["ba0 1"], ["bb0 1"]],[["ca0 1", "ca1 
1"],["cb0 1", "cb1 1", "cb2 1"]]]

0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.aaa from `complex.json` 
t limit 1;
++
|aaa |
++
| [[["aa0 1"],["ab0 1"]],[["ba0 1"],["bb0 1"]],[["ca0 1","ca1 1"],["cb0 1","cb1 
1","cb2 1"]]] |
++
1 row selected (0.251 seconds)
0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select t.aaa[0] from 
`complex.json` t limit 1;
++
|   EXPR$0   |
++
| [["aa0 1"],["ab0 1"]] |
++
{code}

When I drill down further, I get the following error:

{code}
0: jdbc:drill:schema=dfs.drillTestDirComplexJ>  select t.aaa[0][0] from 
`complex.json` t limit 1;
Query failed: RemoteRpcException: Failure while running fragment., You tried to 
read a [CopyAsValueList] type when you are using a field reader of type 
[RepeatedVarCharReaderImpl]. [ 32a45935-dd76-4044-9924-1ec0c5e22c2e on 
qa-node120.qa.lab:31010 ]
[ 32a45935-dd76-4044-9924-1ec0c5e22c2e on qa-node120.qa.lab:31010 ]


Error: exception while executing query: Failure while executing query. 
(state=,code=0)
{code}

drillbit.log
{code}
2015-03-03 14:49:04,925 [2b09c59e-b3e7-2bc1-cca3-a398a805119b:frag:0:0] WARN  
o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
fragment
java.lang.IllegalArgumentException: You tried to read a [CopyAsValueList] type 
when you are using a field reader of type [RepeatedVarCharReaderImpl].
at 
org.apache.drill.exec.vector.complex.impl.AbstractFieldReader.fail(AbstractFieldReader.java:806)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.vector.complex.impl.AbstractFieldReader.copyAsValue(AbstractFieldReader.java:269)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.test.generated.ProjectorGen79382.doEval(ProjectorTemplate.java:82)
 ~[na:na]
at 
org.apache.drill.exec.test.generated.ProjectorGen79382.projectRecords(ProjectorTemplate.java:62)
 ~[na:na]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.doWork(ProjectRecordBatch.java:174)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:113)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
at 
org
{code}
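
A hedged workaround sketch, assuming FLATTEN is available in this build and 
that indexing one list level at a time avoids the reader-type mismatch; the 
alias names are invented and this was not run against this build:

{code}
-- flatten the outer list, then index the remaining nested list (untested)
select f.lvl1[0]
from (select flatten(t.aaa) lvl1 from `complex.json` t) f
limit 1;
-- expected, if it works: ["aa0 1"]
{code}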
