[jira] [Created] (DRILL-4663) FileSystem properties Config block from filesystem plugin are not being applied for file writers

2016-05-10 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4663:
--

 Summary: FileSystem properties Config block from filesystem plugin 
are not being applied for file writers
 Key: DRILL-4663
 URL: https://issues.apache.org/jira/browse/DRILL-4663
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse


All of the record writers currently create their own empty filesystem 
configuration upon initialization. They do not apply the custom configuration 
options included in the plugin configuration, which prevents users from setting 
custom properties on the write path. If possible, this configuration should be 
shared with the readers. If there is a need to isolate it from the 
configuration used for the readers, we should still apply the configuration 
options from the storage plugin config.
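
The sketch below shows one possible shape of a fix, assuming the filesystem 
plugin config exposes its custom properties as a simple Map; the class and 
method names here are placeholders, not Drill's actual API:

{code}
// Hypothetical sketch: seed the Hadoop Configuration used by a record writer
// from the storage plugin's property map instead of starting from an empty
// Configuration. `pluginProperties` is an assumed accessor, not Drill's API.
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class WriterConfigSketch {
  public static Configuration writerConf(Map<String, String> pluginProperties) {
    Configuration conf = new Configuration();
    if (pluginProperties != null) {
      for (Map.Entry<String, String> e : pluginProperties.entrySet()) {
        conf.set(e.getKey(), e.getValue());   // e.g. fs.s3a.* credentials
      }
    }
    return conf;
  }

  public static FileSystem writerFileSystem(Map<String, String> pluginProperties)
      throws Exception {
    // The writers would open their FileSystem with the merged configuration.
    return FileSystem.get(writerConf(pluginProperties));
  }
}
{code}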





[jira] [Resolved] (DRILL-4445) Remove extra code to work around mixture of arrays and Lists used in Logical and Physical query plan nodes

2016-04-20 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4445.

Resolution: Fixed

Fixed in d24205d4e795a1aab54b64708dde1e7deeca668b

> Remove extra code to work around mixture of arrays and Lists used in Logical 
> and Physical query plan nodes
> --
>
> Key: DRILL-4445
> URL: https://issues.apache.org/jira/browse/DRILL-4445
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
>
> The physical plan node classes for all of the operators currently use a mix 
> of arrays and Lists to refer to lists of incoming operators, expressions, and 
> other operator properties. This has led to the introduction of several 
> utility methods for translating between the two representations; examples can 
> be seen in common/logical/data/Abstractbuilder.
> This isn't a major problem, but the new operator test framework uses these 
> classes as a primary interface for setting up the tests. It seemed worthwhile 
> to refactor the classes to be consistent so that the tests would all be 
> similar. There are a few changes to execution code, but they are all just 
> trivial changes to use the list-based interfaces (size() instead of length, 
> set() instead of arr[i] = foo, etc.), since Jackson transparently handles 
> both types the same way (which is why this hasn't really been a problem).
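> A generic illustration of the kind of trivial change involved; the class and
> field names below are invented for the example and are not Drill's actual
> plan node classes:
> {code}
> import java.util.ArrayList;
> import java.util.List;
>
> class OrderingSketch {
>   // before: private final String[] exprs;  used as exprs.length, exprs[i] = e
>   private final List<String> exprs = new ArrayList<>();
>
>   int exprCount() {
>     return exprs.size();          // was exprs.length
>   }
>
>   void replaceExpr(int i, String e) {
>     exprs.set(i, e);              // was exprs[i] = e
>   }
> }
> {code}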





[jira] [Resolved] (DRILL-4437) Implement framework for testing operators in isolation

2016-04-20 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4437.

Resolution: Fixed

Fixed in d93a3633815ed1c7efd6660eae62b7351a2c9739

> Implement framework for testing operators in isolation
> --
>
> Key: DRILL-4437
> URL: https://issues.apache.org/jira/browse/DRILL-4437
> Project: Apache Drill
>  Issue Type: Test
>  Components: Tools, Build & Test
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
> Fix For: 1.7.0
>
>
> Most of the tests written for Drill are end-to-end. We spin up a full 
> instance of the server, submit one or more SQL queries and check the results.
> While integration tests like this are useful for ensuring that features do 
> not break end-user functionality, overuse of this approach has caused a 
> number of pain points.
> Overall, the tests end up running a lot of the exact same code, parsing and 
> planning many similar queries.
> Creating consistent reproductions of issues, especially edge cases found in 
> clustered environments, can be extremely difficult. Even the simpler case of 
> verifying that operators can handle a particular series of incoming batches 
> of records has required hacks like generating files large enough that the 
> scanners happen to break them up into separate batches. These tests are 
> brittle, as they make assumptions about how the scanners will work in the 
> future. As an example of how this could break: a performance evaluation might 
> show that we should be producing larger batches in some cases. Existing tests 
> that try to test multiple batches by producing a few more records than the 
> current threshold for batch size would then no longer exercise the same code 
> paths.
> We need to make more parts of the system testable without initializing the 
> entire Drill server, as well as making the different internal settings and 
> state of the server configurable for tests.
> This is a first effort to enable testing the physical operators in Drill by 
> mocking the components of the system necessary to enable operators to 
> initialize and execute.





[jira] [Created] (DRILL-4551) Add some missing functions that are generated by Tableau (cot, regex_matches, split_part, isdate)

2016-03-29 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4551:
--

 Summary: Add some missing functions that are generated by Tableau 
(cot, regex_matches, split_part, isdate)
 Key: DRILL-4551
 URL: https://issues.apache.org/jira/browse/DRILL-4551
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Assignee: Jason Altekruse


Several of these functions do not appear to be standard SQL functions, but they 
are available in other popular databases such as SQL Server, Oracle, and 
Postgres.
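
As an illustration, here is a sketch of how one of these, cot, could be added 
using Drill's simple-function UDF interface; this is a sketch of the approach, 
not the actual patch for this issue:

{code}
import org.apache.drill.exec.expr.DrillSimpleFunc;
import org.apache.drill.exec.expr.annotations.FunctionTemplate;
import org.apache.drill.exec.expr.annotations.Output;
import org.apache.drill.exec.expr.annotations.Param;
import org.apache.drill.exec.expr.holders.Float8Holder;

// Sketch only: cot(x) = 1 / tan(x), registered as a simple scalar function.
@FunctionTemplate(name = "cot", scope = FunctionTemplate.FunctionScope.SIMPLE,
    nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
public class CotFunction implements DrillSimpleFunc {

  @Param Float8Holder in;
  @Output Float8Holder out;

  public void setup() {
  }

  public void eval() {
    out.value = 1.0 / Math.tan(in.value);
  }
}
{code}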





[jira] [Resolved] (DRILL-4482) Avro no longer selects data correctly from a sub-structure

2016-03-09 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4482.

Resolution: Fixed
  Assignee: Jason Altekruse  (was: Stefán Baxter)

Fixed in 64ab0a8ec9d98bf96f4d69274dddc180b8efe263

> Avro no longer selects data correctly from a sub-structure
> --
>
> Key: DRILL-4482
> URL: https://issues.apache.org/jira/browse/DRILL-4482
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Avro
>Affects Versions: 1.6.0
>Reporter: Stefán Baxter
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.6.0
>
>
> Parquet:
> 0: jdbc:drill:zk=local> select s.client_ip.ip from 
> dfs.asa.`/processed/<>/transactions` as s limit 1;
> +----------------+
> |     EXPR$0     |
> +----------------+
> | 87.55.171.210  |
> +----------------+
> 1 row selected (1.184 seconds)
> Avro:
> 0: jdbc:drill:zk=local> select s.client_ip.ip from 
> dfs.asa.`/streaming/<>/transactions` as s limit 1;
> +---------+
> | EXPR$0  |
> +---------+
> | null    |
> +---------+
> 1 row selected (0.29 seconds)





[jira] [Created] (DRILL-4492) TestMergeJoinWithSchemaChanges depends on the order in which files in a directory are read, should be refactored

2016-03-08 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4492:
--

 Summary: TestMergeJoinWithSchemaChanges depends on the order in 
which files in a directory are read, should be refactored
 Key: DRILL-4492
 URL: https://issues.apache.org/jira/browse/DRILL-4492
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse
Assignee: amit hadke


I was running unit tests and saw a failure that seemed unrelated to the changes 
I was making. The test runs fine in isolation, both from IntelliJ and from the 
maven command line (with -Dtest=TestMergeJoinWithSchemaChanges in the java-exec 
module).

I'm not sure what about this particular test run changed the order in which the 
files were read, but we cannot rely on any particular system reading the files 
in a given order. The test should be updated to remove this assumption.

This is the error I received on one run of the full unit tests:
{code}
testMissingAndNewColumns(TestMergeJoinWithSchemaChanges.java:265)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: 
UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with 
changing schemas

Fragment 0:0

[Error Id: bf84bffb-f643-493b-9ed5-720eb18d55f2 on 10.1.10.225:31010]

  (org.apache.drill.exec.exception.SchemaChangeException) Sort currently only 
supports a single schema.
org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.build():146
org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext():442
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.RecordIterator.nextBatch():97
org.apache.drill.exec.record.RecordIterator.next():183
org.apache.drill.exec.record.RecordIterator.prepare():167
org.apache.drill.exec.physical.impl.join.JoinStatus.prepare():87
org.apache.drill.exec.physical.impl.join.MergeJoinBatch.innerNext():162
org.apache.drill.exec.record.AbstractRecordBatch.next():162

org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next():215
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51

org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162

{code}





[jira] [Resolved] (DRILL-4332) tests in TestFrameworkTest fail in Java 8

2016-03-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4332.

   Resolution: Fixed
Fix Version/s: (was: Future)
   1.6.0

Fixed in 447b093cd2b05bfeae001844a7e3573935e84389

> tests in TestFrameworkTest fail in Java 8
> -
>
> Key: DRILL-4332
> URL: https://issues.apache.org/jira/browse/DRILL-4332
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
>Assignee: Laurent Goujon
> Fix For: 1.6.0
>
>
> the following unit tests fail in Java 8:
> {noformat}
> TestFrameworkTest.testRepeatedColumnMatching
> TestFrameworkTest.testCSVVerificationOfOrder_checkFailure
> {noformat}
> The tests expect the query to fail with a specific error message. The message 
> generated by DrillTestWrapper.compareMergedVectors assumes a specific order 
> in a map keySet (which we shouldn't rely on). In Java 8 the order appears to 
> have changed, which causes a slightly different error message.
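> One way to make the message independent of keySet() order is to sort the
> column names before building it; the sketch below is illustrative and does
> not reproduce the actual DrillTestWrapper code:
> {code}
> import java.util.Map;
> import java.util.TreeSet;
>
> class MessageSketch {
>   static String describeColumns(Map<String, ?> mergedVectors) {
>     StringBuilder sb = new StringBuilder("Columns: ");
>     for (String name : new TreeSet<>(mergedVectors.keySet())) {  // deterministic order
>       sb.append(name).append(' ');
>     }
>     return sb.toString().trim();
>   }
> }
> {code}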





[jira] [Resolved] (DRILL-4486) Expression serializer incorrectly serializes escaped characters

2016-03-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4486.

   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in 80316f3f8bef866720f99e609fe758ec8e0c4612

> Expression serializer incorrectly serializes escaped characters
> ---
>
> Key: DRILL-4486
> URL: https://issues.apache.org/jira/browse/DRILL-4486
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Steven Phillips
>Assignee: Steven Phillips
> Fix For: 1.6.0
>
>
> The Drill expression parser requires backslashes to be escaped, but the 
> ExpressionStringBuilder is not properly escaping them. This causes problems, 
> especially in the case of regex expressions run with parallel execution.
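> The gist of the fix is to escape backslashes (and embedded quotes) when
> writing string literals back out; the sketch below is a generic illustration,
> not the actual ExpressionStringBuilder change:
> {code}
> class EscapeSketch {
>   static String escapeLiteral(String s) {
>     return s.replace("\\", "\\\\")   // escape backslashes first
>             .replace("'", "\\'");    // then embedded quotes
>   }
>
>   public static void main(String[] args) {
>     // A regex literal like \d+ must round-trip as \\d+ in the serialized form.
>     System.out.println(escapeLiteral("\\d+"));  // prints \\d+
>   }
> }
> {code}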





[jira] [Resolved] (DRILL-4375) Fix the maven release profile, broken by jdbc jar size enforcer added in DRILL-4291

2016-03-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4375.

   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in 1f29914fc5c7d1e36651ac28167804c4012501fe

> Fix the maven release profile, broken by jdbc jar size enforcer added in 
> DRILL-4291
> ---
>
> Key: DRILL-4375
> URL: https://issues.apache.org/jira/browse/DRILL-4375
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
> Fix For: 1.6.0
>
>






[jira] [Created] (DRILL-4471) Add unit test for the Drill Web UI

2016-03-03 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4471:
--

 Summary: Add unit test for the Drill Web UI
 Key: DRILL-4471
 URL: https://issues.apache.org/jira/browse/DRILL-4471
 Project: Apache Drill
  Issue Type: Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse


While the Web UI isn't under very active development, changes to the Drill 
build or to internal parts of the server have broken parts of the Web UI a few 
times.

As the Web UI is a primary interface for viewing cluster information, 
cancelling queries, configuring storage, and other tasks, we really should add 
automated tests for it.





[jira] [Created] (DRILL-4451) Improve operator unit tests to allow for direct inspection of the sequence of result batches

2016-02-26 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4451:
--

 Summary: Improve operator unit tests to allow for direct 
inspection of the sequence of result batches
 Key: DRILL-4451
 URL: https://issues.apache.org/jira/browse/DRILL-4451
 Project: Apache Drill
  Issue Type: Test
  Components: Tools, Build & Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse


The first version of the operator test framework allows for comparison of the 
result set with a baseline, but does not give a way to specify the expected 
batch boundaries. All of the batches are combined together before they are 
compared (sharing code with the existing test infrastructure for complete SQL 
queries).

The framework should also include a way to directly inspect SV2 and SV4 batches 
that are produced by operators like filter and sort. These structures are used 
to store a view into the incoming data (an SV2 is a bitmask for everything that 
matched the filter and an SV4 is used to represent cross-batch pointers to 
reflect the sorted order of a series of batches without rewriting them). 
Currently the test just follows the pointers to iterate over the values as they 
would appear after a rewrite of the data (by the SelectionVectorRemover 
operator).
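
Purely as an illustration of what such direct inspection might look like, the 
sketch below assumes the accessors on Drill's SelectionVector2 and 
SelectionVector4 classes noted in the comments; it is not part of the framework:

{code}
import org.apache.drill.exec.record.selection.SelectionVector2;
import org.apache.drill.exec.record.selection.SelectionVector4;

class SelectionVectorInspectionSketch {
  // Assumes SelectionVector2 exposes getCount() and getIndex(int).
  static int[] sv2Indices(SelectionVector2 sv2) {
    int[] indices = new int[sv2.getCount()];
    for (int i = 0; i < indices.length; i++) {
      indices[i] = sv2.getIndex(i);   // offset into the single underlying batch
    }
    return indices;
  }

  // Assumes SelectionVector4 exposes getCount() and get(int).
  static int[] sv4Indices(SelectionVector4 sv4) {
    int[] indices = new int[sv4.getCount()];
    for (int i = 0; i < indices.length; i++) {
      indices[i] = sv4.get(i);        // encodes (batch, record) for cross-batch order
    }
    return indices;
  }
}
{code}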





[jira] [Created] (DRILL-4450) Improve operator unit tests to allow for setting custom options on a test

2016-02-26 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4450:
--

 Summary: Improve operator unit tests to allow for setting custom 
options on a test
 Key: DRILL-4450
 URL: https://issues.apache.org/jira/browse/DRILL-4450
 Project: Apache Drill
  Issue Type: Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse


The initial work on the operator test framework included mocking of the 
system/session options that was just complete enough to get the first ~10 
operators to execute a single query. These values are currently shared across 
all tests. To test all code paths we will need a way to set options from 
individual tests.





[jira] [Created] (DRILL-4448) Specification of Ordering (ASC, DESC) on a sort plan node uses Strings for its construction, should also allow for use of the corresponding Calcite Enums

2016-02-26 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4448:
--

 Summary: Specification of Ordering (ASC, DESC) on a sort plan node 
uses Strings for its construction, should also allow for use of the 
corresponding Calcite Enums
 Key: DRILL-4448
 URL: https://issues.apache.org/jira/browse/DRILL-4448
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse








[jira] [Created] (DRILL-4439) Improve the new operator unit tests to handle operators that expect RawBatchBuffers off of the wire, such as the UnorderedReceiver and MergingReceiver

2016-02-25 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4439:
--

 Summary: Improve the new operator unit tests to handle operators that 
expect RawBatchBuffers off of the wire, such as the UnorderedReceiver and 
MergingReceiver
 Key: DRILL-4439
 URL: https://issues.apache.org/jira/browse/DRILL-4439
 Project: Apache Drill
  Issue Type: Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse








[jira] [Created] (DRILL-4437) Implement framework for testing operators in isolation

2016-02-25 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4437:
--

 Summary: Implement framework for testing operators in isolation
 Key: DRILL-4437
 URL: https://issues.apache.org/jira/browse/DRILL-4437
 Project: Apache Drill
  Issue Type: Test
  Components: Tools, Build & Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.6.0


Most of the tests written for Drill are end-to-end. We spin up a full instance 
of the server, submit one or more SQL queries and check the results.

While integration tests like this are useful for ensuring that features do not 
break end-user functionality, overuse of this approach has caused a number of 
pain points.

Overall, the tests end up running a lot of the exact same code, parsing and 
planning many similar queries.

Creating consistent reproductions of issues, especially edge cases found in 
clustered environments, can be extremely difficult. Even the simpler case of 
verifying that operators can handle a particular series of incoming batches of 
records has required hacks like generating files large enough that the scanners 
happen to break them up into separate batches. These tests are brittle, as they 
make assumptions about how the scanners will work in the future. As an example 
of how this could break: a performance evaluation might show that we should be 
producing larger batches in some cases. Existing tests that try to test 
multiple batches by producing a few more records than the current threshold for 
batch size would then no longer exercise the same code paths.

We need to make more parts of the system testable without initializing the 
entire Drill server, as well as making the different internal settings and 
state of the server configurable for tests.

This is a first effort to enable testing the physical operators in Drill by 
mocking the components of the system necessary to enable operators to 
initialize and execute.
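
As a toy illustration of the idea (not the framework's actual API): the test, 
rather than a file scanner, decides where the batch boundaries fall, and the 
"operator" is exercised as a plain function over that sequence of batches.

{code}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy sketch only: a stand-in "filter operator" applied to an explicit
// sequence of input batches whose boundaries the test controls.
class OperatorTestSketch {
  static List<Integer> filterGreaterThanTen(List<Integer> batch) {
    List<Integer> out = new ArrayList<>();
    for (Integer v : batch) {
      if (v > 10) {
        out.add(v);
      }
    }
    return out;
  }

  public static void main(String[] args) {
    // The test chooses the batch boundaries explicitly.
    List<List<Integer>> incoming = Arrays.asList(
        Arrays.asList(5, 20, 7),
        Arrays.asList(15, 3));

    List<List<Integer>> results = new ArrayList<>();
    for (List<Integer> batch : incoming) {
      results.add(filterGreaterThanTen(batch));
    }

    System.out.println(results);   // [[20], [15]]
  }
}
{code}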





[jira] [Created] (DRILL-4438) Fix out of memory failure identified by new operator unit tests

2016-02-25 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4438:
--

 Summary: Fix out of memory failure identified by new operator unit 
tests
 Key: DRILL-4438
 URL: https://issues.apache.org/jira/browse/DRILL-4438
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical








[jira] [Resolved] (DRILL-3930) Remove direct references to TopLevelAllocator from unit tests

2016-02-25 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3930.

   Resolution: Fixed
 Assignee: (was: Chris Westin)
Fix Version/s: 1.3.0

> Remove direct references to TopLevelAllocator from unit tests
> -
>
> Key: DRILL-3930
> URL: https://issues.apache.org/jira/browse/DRILL-3930
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Chris Westin
> Fix For: 1.3.0
>
>
> The RootAllocatorFactory should be used throughout the code to allow us to 
> change allocators via configuration or other software choices. Some unit 
> tests still reference TopLevelAllocator directly. We also need to do a better 
> job of handling the exceptions that result from close()ing an allocator 
> that isn't in the proper state (remaining open child allocators, outstanding 
> buffers, etc.).
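> The preferred pattern in tests would look roughly like the sketch below,
> assuming the RootAllocatorFactory and DrillConfig APIs referenced here:
> {code}
> import org.apache.drill.common.config.DrillConfig;
> import org.apache.drill.exec.memory.BufferAllocator;
> import org.apache.drill.exec.memory.RootAllocatorFactory;
>
> class AllocatorUsageSketch {
>   void run() throws Exception {
>     DrillConfig config = DrillConfig.create();
>     // Obtain the allocator through the factory instead of TopLevelAllocator.
>     BufferAllocator allocator = RootAllocatorFactory.newRoot(config);
>     try {
>       // ... allocate buffers / child allocators for the test ...
>     } finally {
>       allocator.close();  // should flag open child allocators or outstanding buffers
>     }
>   }
> }
> {code}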





[jira] [Resolved] (DRILL-4394) Can’t build the custom functions for Apache Drill 1.5.0

2016-02-24 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4394.

Resolution: Fixed
  Assignee: Jason Altekruse

> Can’t build the custom functions for Apache Drill 1.5.0
> ---
>
> Key: DRILL-4394
> URL: https://issues.apache.org/jira/browse/DRILL-4394
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.5.0
>Reporter: Kumiko Yada
>Assignee: Jason Altekruse
>Priority: Critical
>
> I tried to build the custom functions for Drill 1.5.0, but I got the below 
> error:
> Failure to find org.apache.drill.exec:drill-java-exec:jar:1.5.0 in 
> http://repo.maven.apache.org/maven2 was cached in the local repository.





[jira] [Created] (DRILL-4435) Add YARN jar required for running Drill on cluster with Kerberos

2016-02-24 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4435:
--

 Summary: Add YARN jar required for running Drill on cluster with 
Kerberos
 Key: DRILL-4435
 URL: https://issues.apache.org/jira/browse/DRILL-4435
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Assignee: Jason Altekruse


As described in the post below, Drill currently requires adding a YARN jar to 
the classpath to run with Kerberos. If it doesn't conflict with any jars 
currently included with Drill, we should just include it in the distribution so 
that this works out of the box.

http://www.dremio.com/blog/securing-sql-on-hadoop-part-2-installing-and-configuring-drill/





[jira] [Resolved] (DRILL-3229) Create a new EmbeddedVector

2016-02-24 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3229.

   Resolution: Fixed
Fix Version/s: (was: Future)
   1.4.0

> Create a new EmbeddedVector
> ---
>
> Key: DRILL-3229
> URL: https://issues.apache.org/jira/browse/DRILL-3229
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Execution - Codegen, Execution - Data Types, Execution - 
> Relational Operators, Functions - Drill
>Reporter: Jacques Nadeau
>Assignee: Steven Phillips
> Fix For: 1.4.0
>
>
> Embedded Vector will leverage a binary encoding for holding information about 
> type for each individual field.





[jira] [Resolved] (DRILL-284) Publish artifacts to maven for Drill

2016-02-24 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-284.
---
   Resolution: Fixed
Fix Version/s: (was: Future)
   1.1.0

> Publish artifacts to maven for Drill
> 
>
> Key: DRILL-284
> URL: https://issues.apache.org/jira/browse/DRILL-284
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Timothy Chen
> Fix For: 1.1.0
>
>
> We need to publish our artifacts and version to maven so other dependencies 
> (Whirr, or other ones that wants maven include) can use.





[jira] [Created] (DRILL-4426) Review storage and format plugins like parquet, JSON, Avro, Hive, etc. to ensure they fail with useful error messages including filename, column, etc.

2016-02-23 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4426:
--

 Summary: Review storage and format plugins like parquet, JSON, 
Avro, Hive, etc. to ensure they fail with useful error messages including 
filename, column, etc.
 Key: DRILL-4426
 URL: https://issues.apache.org/jira/browse/DRILL-4426
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Assignee: Jason Altekruse


A number of these issues have been fixed individually in the past, but we 
should review any remaining cases where a failure does not produce an error 
message with as much useful context information as possible. The filename 
should always be included; the column or record/line number would be good 
where available.

One such case, a low-level Parquet failure, was reported here:

http://search-hadoop.com/m/qRVAX48ao4xTDne/drill+Query+Return+Error+because+of+a+single+file=Query+Return+Error+because+of+a+single+file
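
The pattern to push toward is sketched below, assuming Drill's UserException 
builder API; the method and field names in the sketch are illustrative:

{code}
import org.apache.drill.common.exceptions.UserException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch: attach filename and position context to read failures instead of
// rethrowing the raw exception.
class ReaderErrorSketch {
  private static final Logger logger = LoggerFactory.getLogger(ReaderErrorSketch.class);

  void handleAndRaise(String message, Exception e, String filePath, long recordNumber) {
    throw UserException.dataReadError(e)
        .message(message)
        .addContext("File", filePath)                        // always available
        .addContext("Record", String.valueOf(recordNumber))  // when the reader tracks it
        .build(logger);
  }
}
{code}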





[jira] [Created] (DRILL-4383) Allow passing custom configuration options to a file system through the storage plugin config

2016-02-11 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4383:
--

 Summary: Allow passing custom configuration options to a file 
system through the storage plugin config
 Key: DRILL-4383
 URL: https://issues.apache.org/jira/browse/DRILL-4383
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Other
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.6.0


A similar feature already exists in the Hive and HBase plugins; it simply 
provides a key/value map for passing custom configuration options to the 
underlying storage system.

This would be useful in the filesystem plugin, for example to configure S3 
without needing to create a core-site.xml file or restart Drill.





[jira] [Resolved] (DRILL-4230) NullReferenceException when SELECTing from empty mongo collection

2016-02-09 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4230.

   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in ed2f1ca8ed3c0ebac7e33494db6749851fc2c970

This was applied separately to the 1.5 release branch, so the commit there has 
identical content and the same commit message, but will have a different hash.

> NullReferenceException when SELECTing from empty mongo collection
> -
>
> Key: DRILL-4230
> URL: https://issues.apache.org/jira/browse/DRILL-4230
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - MongoDB
>Affects Versions: 1.3.0
>Reporter: Brick Shitting Bird Jr.
>Assignee: Jason Altekruse
> Fix For: 1.5.0
>
>






[jira] [Created] (DRILL-4375) Fix the maven release profile, broken by jdbc jar size enforcer added in DRILL-4291

2016-02-08 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4375:
--

 Summary: Fix the maven release profile, broken by jdbc jar size 
enforcer added in DRILL-4291
 Key: DRILL-4375
 URL: https://issues.apache.org/jira/browse/DRILL-4375
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse
Assignee: Jason Altekruse








[jira] [Resolved] (DRILL-4295) Obsolete protobuf generated files under protocol/

2016-02-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4295.

Resolution: Fixed

Fixed in fbb0165def5e23b6b2f6a690d47dc5fbeb2bdbcb

> Obsolete protobuf generated files under protocol/
> -
>
> Key: DRILL-4295
> URL: https://issues.apache.org/jira/browse/DRILL-4295
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Reporter: Laurent Goujon
>Assignee: Laurent Goujon
>Priority: Trivial
> Fix For: 1.6.0
>
>
> The following two files don't have a protobuf definition anymore, and are not 
> generated when running {{mvn process-sources -P proto-compile}} under 
> {{protocol/}}:
> {noformat}
> src/main/java/org/apache/drill/exec/proto/beans/RpcFailure.java
> src/main/java/org/apache/drill/exec/proto/beans/ViewPointer.java
> {noformat}





[jira] [Resolved] (DRILL-4331) TestFlattenPlanning.testFlattenPlanningAvoidUnnecessaryProject fail in Java 8

2016-02-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4331.

Resolution: Fixed

Fixed in 32da4675e8bf1358b863532daadd2769f380600f

> TestFlattenPlanning.testFlattenPlanningAvoidUnnecessaryProject fail in Java 8
> -
>
> Key: DRILL-4331
> URL: https://issues.apache.org/jira/browse/DRILL-4331
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.5.0
>Reporter: Deneche A. Hakim
> Fix For: 1.6.0
>
>
> This test expects the following Project in the query plan:
> {noformat}
> Project(EXPR$0=[$1], rownum=[$0])
> {noformat}
> In Java 8, for some reason the scan operator exposes the columns in reverse 
> order, which causes the project to be different from the expected one:
> {noformat}
> Project(EXPR$0=[$0], rownum=[$1])
> {noformat}
> The plan is still correct, so the test must be fixed.





[jira] [Resolved] (DRILL-4359) EndpointAffinity missing equals method

2016-02-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4359.

   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in 6b1b4d257b89e5579140e75388cd37db5563a6a8

> EndpointAffinity missing equals method
> --
>
> Key: DRILL-4359
> URL: https://issues.apache.org/jira/browse/DRILL-4359
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Laurent Goujon
>Assignee: Laurent Goujon
>Priority: Trivial
> Fix For: 1.6.0
>
>
> EndpointAffinity is a placeholder class, but has no equals method to allow 
> comparison.
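> The kind of change needed is sketched below; the fields (endpoint, affinity)
> are assumptions about the class's state, and this is not the actual patch:
> {code}
> import java.util.Objects;
>
> class EndpointAffinitySketch {
>   private final Object endpoint;   // stand-in for the Drillbit endpoint
>   private final double affinity;
>
>   EndpointAffinitySketch(Object endpoint, double affinity) {
>     this.endpoint = endpoint;
>     this.affinity = affinity;
>   }
>
>   @Override
>   public boolean equals(Object o) {
>     if (this == o) {
>       return true;
>     }
>     if (!(o instanceof EndpointAffinitySketch)) {
>       return false;
>     }
>     EndpointAffinitySketch other = (EndpointAffinitySketch) o;
>     return Objects.equals(endpoint, other.endpoint)
>         && Double.compare(affinity, other.affinity) == 0;
>   }
>
>   @Override
>   public int hashCode() {
>     return Objects.hash(endpoint, affinity);
>   }
> }
> {code}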





[jira] [Resolved] (DRILL-4353) Expired sessions in web server are not cleaning up resources, leading to resource leak

2016-02-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4353.

Resolution: Fixed

Fixed in 282dfd762f1bd6628b293c68b20cdff321bd70a3

This was also merged into the 1.5 release branch; that commit has a different 
hash because there were other changes already merged into master that we 
didn't want to include in the release.

> Expired sessions in web server are not cleaning up resources, leading to 
> resource leak
> --
>
> Key: DRILL-4353
> URL: https://issues.apache.org/jira/browse/DRILL-4353
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.5.0
>Reporter: Venki Korukanti
>Assignee: Venki Korukanti
>Priority: Blocker
> Fix For: 1.5.0
>
>
> Currently we store the session resources (including the DrillClient) in the 
> {{SessionAuthentication}} attribute object, which implements 
> {{HttpSessionBindingListener}}. Whenever a session is invalidated, all 
> attributes are removed, and if an attribute class implements 
> {{HttpSessionBindingListener}}, the listener is informed. 
> {{SessionAuthentication}}'s implementation of {{HttpSessionBindingListener}} 
> logs out the user, which includes cleaning up the resources as well, but 
> {{SessionAuthentication}} relies on the ServletContext stored in a 
> thread-local variable (see 
> [here|https://github.com/eclipse/jetty.project/blob/jetty-9.1.5.v20140505/jetty-security/src/main/java/org/eclipse/jetty/security/authentication/SessionAuthentication.java#L88]).
> In the case of the thread that cleans up expired sessions there is no 
> {{ServletContext}} in the thread-local variable, so the user is not logged 
> out properly and resources leak.
> Fix: Add an {{HttpSessionEventListener}} to clean up the 
> {{SessionAuthentication}} and its resources every time an HttpSession expires 
> or is invalidated.
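> A sketch of the described fix direction, using the standard servlet
> HttpSessionListener callback; the attribute name and resource type below are
> assumptions for the example, not the actual patch:
> {code}
> import javax.servlet.http.HttpSessionEvent;
> import javax.servlet.http.HttpSessionListener;
>
> class SessionCleanupListenerSketch implements HttpSessionListener {
>   @Override
>   public void sessionCreated(HttpSessionEvent se) {
>     // nothing to do on creation
>   }
>
>   @Override
>   public void sessionDestroyed(HttpSessionEvent se) {
>     // Invoked for explicit invalidation and for expiry by the session
>     // scavenger thread, which has no ServletContext in thread-local state.
>     Object resources = se.getSession().getAttribute("drill.user.session.resources");
>     if (resources instanceof AutoCloseable) {
>       try {
>         ((AutoCloseable) resources).close();   // releases the DrillClient, etc.
>       } catch (Exception e) {
>         // log and continue; cleanup must not block session teardown
>       }
>     }
>   }
> }
> {code}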





[jira] [Resolved] (DRILL-4361) Allow for FileSystemPlugin subclasses to override FormatCreator

2016-02-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4361.

   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in 5e57b0e3b44f46aa93bf82f366eb3a3f61990da3

> Allow for FileSystemPlugin subclasses to override FormatCreator
> ---
>
> Key: DRILL-4361
> URL: https://issues.apache.org/jira/browse/DRILL-4361
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Laurent Goujon
>Assignee: Laurent Goujon
>Priority: Minor
> Fix For: 1.6.0
>
>
> FileSystemPlugin subclasses are not able to customize plugins, as the 
> FormatCreator is created in the FileSystemPlugin constructor and immediately 
> used to create the SchemaFactory instance.
> FormatCreator instantiation should be moved to a protected method so that a 
> subclass can choose to implement it differently.
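> A generic illustration of the requested refactoring pattern (stand-in
> classes, not the actual Drill code): move the construction into a protected
> factory method so a subclass can supply its own implementation.
> {code}
> class FormatCreatorSketch { /* stand-in for FormatCreator */ }
>
> class FileSystemPluginSketch {
>   private final FormatCreatorSketch formatCreator;
>
>   FileSystemPluginSketch() {
>     // Note: the subclass hook runs during base-class construction.
>     this.formatCreator = newFormatCreator();
>   }
>
>   protected FormatCreatorSketch newFormatCreator() {
>     return new FormatCreatorSketch();
>   }
> }
>
> class CustomFileSystemPluginSketch extends FileSystemPluginSketch {
>   @Override
>   protected FormatCreatorSketch newFormatCreator() {
>     return new FormatCreatorSketch() { /* customized format handling */ };
>   }
> }
> {code}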





[jira] [Resolved] (DRILL-4225) TestDateFunctions#testToChar fails when the locale is non-English

2016-02-08 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4225.

   Resolution: Fixed
Fix Version/s: 1.6.0

Fixed in 4e9b82562cf0fc46e759b89857ffb85e129a178b

> TestDateFunctions#testToChar fails when the locale is non-English
> -
>
> Key: DRILL-4225
> URL: https://issues.apache.org/jira/browse/DRILL-4225
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.4.0
> Environment: Mac OS X 10.10.5
>Reporter: Akihiko Kusanagi
> Fix For: 1.6.0
>
>
> Set the locale to ja_JP on Mac OS X: 
> {noformat}
> $ defaults read -g AppleLocale
> ja_JP
> {noformat}
> TestDateFunctions#testToChar fails with the following output:
> {noformat}
> Running org.apache.drill.exec.fn.impl.TestDateFunctions#testToChar
> 2008-2-23
> 12 20 30
> 2008 2 23 12:00:00
> ...
> Tests run: 6, Failures: 1, Errors: 0, Skipped: 2, Time elapsed: 14.333 sec 
> <<< FAILURE! - in org.apache.drill.exec.fn.impl.TestDateFunctions
> testToChar(org.apache.drill.exec.fn.impl.TestDateFunctions)  Time elapsed: 
> 2.793 sec  <<< FAILURE!
> org.junit.ComparisonFailure: expected:<2008-[Feb]-23> but was:<2008-[2]-23>
>   at 
> org.apache.drill.exec.fn.impl.TestDateFunctions.testCommon(TestDateFunctions.java:66)
>   at 
> org.apache.drill.exec.fn.impl.TestDateFunctions.testToChar(TestDateFunctions.java:139)
> ...
> Failed tests: 
>   TestDateFunctions.testToChar:139->testCommon:66 expected:<2008-[Feb]-23> 
> but was:<2008-[2]-23>
> {noformat}
> Test queries are like this:
> {noformat}
> to_char((cast('2008-2-23' as date)), 'yyyy-MMM-dd')
> to_char(cast('12:20:30' as time), 'HH mm ss')
> to_char(cast('2008-2-23 12:00:00' as timestamp), 'yyyy MMM dd HH:mm:ss')
> {noformat}
> This failure occurs because org.joda.time.format.DateTimeFormat interprets 
> the pattern 'MMM' differently depending on the locale. This will probably 
> occur on other OS platforms as well.
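> The locale dependence (and the direction of a fix) can be seen directly with
> Joda-Time; this is a standalone demonstration, not the test's actual code:
> {code}
> import java.util.Locale;
>
> import org.joda.time.DateTime;
> import org.joda.time.format.DateTimeFormat;
> import org.joda.time.format.DateTimeFormatter;
>
> class ToCharLocaleSketch {
>   public static void main(String[] args) {
>     DateTime d = new DateTime(2008, 2, 23, 0, 0);
>
>     // Uses the JVM default locale: "2008-Feb-23" under en_US, but a
>     // localized month under ja_JP, which is what makes the test fail.
>     System.out.println(DateTimeFormat.forPattern("yyyy-MMM-dd").print(d));
>
>     // Pinning an explicit locale makes the output deterministic.
>     DateTimeFormatter fixed =
>         DateTimeFormat.forPattern("yyyy-MMM-dd").withLocale(Locale.ENGLISH);
>     System.out.println(fixed.print(d));   // always "2008-Feb-23"
>   }
> }
> {code}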





[jira] [Resolved] (DRILL-4032) Drill unable to parse json files with schema changes

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4032.

   Resolution: Fixed
Fix Version/s: 1.4.0

> Drill unable to parse json files with schema changes
> 
>
> Key: DRILL-4032
> URL: https://issues.apache.org/jira/browse/DRILL-4032
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - JSON
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Steven Phillips
>Priority: Blocker
> Fix For: 1.4.0
>
>
> git.commit.id.abbrev=bb69f22
> {code}
> select d.col2.col3  from reg1 d;
> Error: DATA_READ ERROR: Error parsing JSON - index: 0, length: 4 (expected: 
> range(0, 0))
> File  /drill/testdata/reg1/a.json
> Record  2
> Fragment 0:0
> {code}
> The folder reg1 contains 2 files
> File 1 : a.json
> {code}
> {"col1": "val1","col2": null}
> {"col1": "val1","col2": {"col3":"abc", "col4":"xyz"}}
> {code}
> File 2 : b.json
> {code}
> {"col1": "val1","col2": null}
> {"col1": "val1","col2": null}
> {code}
> Exception from the log file :
> {code}
> [Error Id: a7e3c716-838d-4f8f-9361-3727b98f04cd ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534)
>  ~[drill-common-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader.handleAndRaise(JSONRecordReader.java:165)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.store.easy.json.JSONRecordReader.next(JSONRecordReader.java:205)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:183) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:119)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:113)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:103)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:130)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:156)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:119)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:80)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
> [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:256)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:250)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at java.security.AccessController.doPrivileged(Native Method) 
> [na:1.7.0_71]
> at javax.security.auth.Subject.doAs(Subject.java:415) [na:1.7.0_71]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
>  [hadoop-common-2.7.0-mapr-1506.jar:na]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250)
>  [drill-java-exec-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.3.0-SNAPSHOT.jar:1.3.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_71]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_71]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]
> Caused by: java.lang.IndexOutOfBoundsException: index: 0, length: 4 
> (expected: range(0, 0))
> at 

[jira] [Resolved] (DRILL-4048) Parquet reader corrupts dictionary encoded binary columns

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4048.

   Resolution: Fixed
Fix Version/s: 1.4.0

> Parquet reader corrupts dictionary encoded binary columns
> -
>
> Key: DRILL-4048
> URL: https://issues.apache.org/jira/browse/DRILL-4048
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.3.0
>Reporter: Rahul Challapalli
>Assignee: Jason Altekruse
>Priority: Blocker
> Fix For: 1.4.0
>
> Attachments: lineitem_dic_enc.parquet
>
>
> git.commit.id.abbrev=04c01bd
> The below query returns corrupted data (not even showing up here) for binary 
> columns
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   |  |  | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PE  | T   | 
> egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> The same query from an older build (git.commit.id.abbrev=839f8da)
> {code}
> select * from `lineitem_dic_enc.parquet` limit 1;
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | l_orderkey  | l_partkey  | l_suppkey  | l_linenumber  | l_quantity  | 
> l_extendedprice  | l_discount  | l_tax  | l_returnflag  | l_linestatus  | 
> l_shipdate  | l_commitdate  | l_receiptdate  |   l_shipinstruct   | 
> l_shipmode  |l_comment |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> | 1   | 1552   | 93 | 1 | 17.0| 
> 24710.35 | 0.04| 0.02   | N | O | 
> 1996-03-13  | 1996-02-12| 1996-03-22 | DELIVER IN PERSON  | TRUCK 
>   | egular courts above the  |
> +-+++---+-+--+-++---+---+-+---+++-+--+
> {code}
> Below is the output of the parquet-meta command for this dataset
> {code}
> creator: parquet-mr 
> file schema: root 
> ---
> l_orderkey:  REQUIRED INT32 R:0 D:0
> l_partkey:   REQUIRED INT32 R:0 D:0
> l_suppkey:   REQUIRED INT32 R:0 D:0
> l_linenumber:REQUIRED INT32 R:0 D:0
> l_quantity:  REQUIRED DOUBLE R:0 D:0
> l_extendedprice: REQUIRED DOUBLE R:0 D:0
> l_discount:  REQUIRED DOUBLE R:0 D:0
> l_tax:   REQUIRED DOUBLE R:0 D:0
> l_returnflag:REQUIRED BINARY O:UTF8 R:0 D:0
> l_linestatus:REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipdate:  REQUIRED INT32 O:DATE R:0 D:0
> l_commitdate:REQUIRED INT32 O:DATE R:0 D:0
> l_receiptdate:   REQUIRED INT32 O:DATE R:0 D:0
> l_shipinstruct:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_shipmode:  REQUIRED BINARY O:UTF8 R:0 D:0
> l_comment:   REQUIRED BINARY O:UTF8 R:0 D:0
> row group 1: RC:60175 TS:3049610 
> 

[jira] [Resolved] (DRILL-4243) CTAS with partition by, results in Out Of Memory

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4243.

   Resolution: Fixed
Fix Version/s: 1.5.0

> CTAS with partition by, results in Out Of Memory
> 
>
> Key: DRILL-4243
> URL: https://issues.apache.org/jira/browse/DRILL-4243
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.5.0
> Environment: 4 node cluster
>Reporter: Khurram Faraaz
> Fix For: 1.5.0
>
>
> CTAS with partition by, results in Out Of Memory. It seems to be coming from 
> ExternalSortBatch
> Details of Drill are
> {noformat}
> version   commit_id   commit_message  commit_time build_email 
> build_time
> 1.5.0-SNAPSHOTe4372f224a4b474494388356355a53808092a67a
> DRILL-4242: Updates to storage-mongo03.01.2016 @ 15:31:13 PST   
> Unknown 04.01.2016 @ 01:02:29 PST
>  create table `tpch_single_partition/lineitem` partition by (l_moddate) as 
> select l.*, l_shipdate - extract(day from l_shipdate) + 1 l_moddate from 
> cp.`tpch/lineitem.parquet` l;
> Error: RESOURCE ERROR: One or more nodes ran out of memory while 
> executing the query.
> Fragment 0:0
> [Error Id: 3323fd1c-4b78-42a7-b311-23ee73c7d550 on atsqa4-193.qa.lab:31010] 
> (state=,code=0)
> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory 
> while executing the query.
> Fragment 0:0
> [Error Id: 3323fd1c-4b78-42a7-b311-23ee73c7d550 on atsqa4-193.qa.lab:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1923)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:73)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:101)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:746)
>   at sqlline.SqlLine.runCommands(SqlLine.java:1651)
>   at sqlline.Commands.run(Commands.java:1304)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at 
> sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
>   at sqlline.SqlLine.dispatch(SqlLine.java:742)
>   at sqlline.SqlLine.initArgs(SqlLine.java:553)
>   at sqlline.SqlLine.begin(SqlLine.java:596)
>   at sqlline.SqlLine.start(SqlLine.java:375)
>   at sqlline.SqlLine.main(SqlLine.java:268)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: RESOURCE 
> ERROR: One or more nodes ran out of memory while executing the query.
> Fragment 0:0
> [Error Id: 3323fd1c-4b78-42a7-b311-23ee73c7d550 on atsqa4-193.qa.lab:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:119)
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:113)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:46)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:31)
>   at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:69)
>   at org.apache.drill.exec.rpc.RpcBus$RequestEvent.run(RpcBus.java:400)
>   at 
> org.apache.drill.common.SerializedExecutor$RunnableProcessor.run(SerializedExecutor.java:105)
>   at 
> org.apache.drill.exec.rpc.RpcBus$SameExecutor.execute(RpcBus.java:264)
>   at 
> org.apache.drill.common.SerializedExecutor.execute(SerializedExecutor.java:142)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:298)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:269)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> 

[jira] [Resolved] (DRILL-4163) Support schema changes for MergeJoin operator.

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4163.

   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in cc9175c13270660ffd9ec2ddcbc70780dd72dada

> Support schema changes for MergeJoin operator.
> --
>
> Key: DRILL-4163
> URL: https://issues.apache.org/jira/browse/DRILL-4163
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: amit hadke
>Assignee: Jason Altekruse
> Fix For: 1.5.0
>
>
> Since the external sort operator supports schema changes, allow the use of 
> union types in merge join to support schema changes.
> For now, we assume that merge join always works on record batches from the 
> sort operator. Thus merging schemas and promoting to union vectors is already 
> taken care of by the sort operator.
> Test Cases:
> 1) Only one side changes schema (join on union type and primitive type)
> 2) Both sides change schema on all columns.
> 3) Join between numeric types and string types.
> 4) Missing columns - each batch has different columns. 





[jira] [Resolved] (DRILL-4205) Simple query hit IndexOutOfBoundException

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4205.

   Resolution: Fixed
Fix Version/s: 1.5.0

>  Simple query hit IndexOutOfBoundException
> --
>
> Key: DRILL-4205
> URL: https://issues.apache.org/jira/browse/DRILL-4205
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.4.0
>Reporter: Dechang Gu
>Assignee: Dechang Gu
> Fix For: 1.5.0
>
>
> The following query failed due to IOB:
> 0: jdbc:drill:schema=wf_pigprq100> select * from 
> `store_sales/part-m-00073.parquet`;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: srcIndex: 1048587
> Fragment 0:0
> [Error Id: ad8d2bc0-259f-483c-9024-93865963541e on ucs-node4.perf.lab:31010]
>   (org.apache.drill.common.exceptions.DrillRuntimeException) Error in parquet 
> record reader.
> Message: 
> Hadoop path: /tpcdsPigParq/SF100/store_sales/part-m-00073.parquet
> Total records read: 135280
> Mock records read: 0
> Records to read: 1424
> Row group index: 0
> Records in row group: 3775712
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message pig_schema {
>   optional int64 ss_sold_date_sk;
>   optional int64 ss_sold_time_sk;
>   optional int64 ss_item_sk;
>   optional int64 ss_customer_sk;
>   optional int64 ss_cdemo_sk;
>   optional int64 ss_hdemo_sk;
>   optional int64 ss_addr_sk;
>   optional int64 ss_store_sk;
>   optional int64 ss_promo_sk;
>   optional int64 ss_ticket_number;
>   optional int64 ss_quantity;
>   optional double ss_wholesale_cost;
>   optional double ss_list_price;
>   optional double ss_sales_price;
>   optional double ss_ext_discount_amt;
>   optional double ss_ext_sales_price;
>   optional double ss_ext_wholesale_cost;
>   optional double ss_ext_list_price;
>   optional double ss_ext_tax;
>   optional double ss_coupon_amt;
>   optional double ss_net_paid;
>   optional double ss_net_paid_inc_tax;
>   optional double ss_net_profit;
> }





[jira] [Resolved] (DRILL-4192) Dir0 and Dir1 from drill-1.4 are messed up

2016-02-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4192.

   Resolution: Fixed
Fix Version/s: 1.5.0

> Dir0 and Dir1 from drill-1.4 are messed up
> --
>
> Key: DRILL-4192
> URL: https://issues.apache.org/jira/browse/DRILL-4192
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.4.0
>Reporter: Krystal
>Assignee: Aman Sinha
>Priority: Blocker
> Fix For: 1.5.0
>
>
> I have the following directories:
> /drill/testdata/temp1/abc/dt=2014-12-30/lineitem.parquet
> /drill/testdata/temp1/abc/dt=2014-12-31/lineitem.parquet
> The following queries returned incorrect data.
> select dir0,dir1 from dfs.`/drill/testdata/temp1` limit 2;
> +----------------+-------+
> |      dir0      | dir1  |
> +----------------+-------+
> | dt=2014-12-30  | null  |
> | dt=2014-12-30  | null  |
> +----------------+-------+
> select dir0 from dfs.`/drill/testdata/temp1` limit 2;
> +----------------+
> |      dir0      |
> +----------------+
> | dt=2014-12-31  |
> | dt=2014-12-31  |
> +----------------+





[jira] [Created] (DRILL-4336) Fix weird interaction between maven-release, maven-enforcer and RAT plugins

2016-02-01 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4336:
--

 Summary: Fix weird interaction between maven-release, 
maven-enforcer and RAT plugins
 Key: DRILL-4336
 URL: https://issues.apache.org/jira/browse/DRILL-4336
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse


While trying to make the 1.5.0 release I ran into a bizarre failure from RAT 
complaining about a file it should have been ignoring according to the plugin 
configuration.

Disabling the newly added maven-enforcer plugin "fixed" the issue, but we need 
to keep this in the build to make sure new dependencies don't creep into the 
JDBC driver that is supposed to be as small as possible.

For the sake of the release the jdbc-all jar's size was checked manually.





[jira] [Created] (DRILL-4322) User exception created upon failed DROP TABLE eats the underlying exception

2016-01-28 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4322:
--

 Summary: User exception created upon failed DROP TABLE eats the 
underlying exception
 Key: DRILL-4322
 URL: https://issues.apache.org/jira/browse/DRILL-4322
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse
Assignee: Jason Altekruse


Reported in this thread on the list:

http://mail-archives.apache.org/mod_mbox/drill-user/201601.mbox/%3CCAMpYv7Cd%2BRuj5L5RAOOe4CoVNxjU6HOSuH2m0XTcyzjzuKiadw%40mail.gmail.com%3E
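
The general fix direction is to keep the caught exception as the cause when 
building the user-facing error, so it is not swallowed. A sketch, assuming 
Drill's UserException builder API; the method and message below are 
illustrative, not the actual patch:

{code}
import org.apache.drill.common.exceptions.UserException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

class DropTableErrorSketch {
  private static final Logger logger = LoggerFactory.getLogger(DropTableErrorSketch.class);

  void failDropTable(String tableName, Exception cause) {
    // Passing `cause` into the builder preserves the underlying exception in
    // the error that reaches the user and the logs.
    throw UserException.validationError(cause)
        .message("Failed to drop table [%s]", tableName)
        .build(logger);
  }
}
{code}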





[jira] [Resolved] (DRILL-4322) User exception created upon failed DROP TABLE eats the underlying exception

2016-01-28 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-4322.

   Resolution: Fixed
Fix Version/s: 1.5.0

Fixed in 1b51850f31c02f0a7fa77f0258a83081a9d9e826

> User exception created upon failed DROP TABLE eats the underlying exception
> ---
>
> Key: DRILL-4322
> URL: https://issues.apache.org/jira/browse/DRILL-4322
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jason Altekruse
>Assignee: Jason Altekruse
> Fix For: 1.5.0
>
>
> Reported in this thread on the list:
> http://mail-archives.apache.org/mod_mbox/drill-user/201601.mbox/%3CCAMpYv7Cd%2BRuj5L5RAOOe4CoVNxjU6HOSuH2m0XTcyzjzuKiadw%40mail.gmail.com%3E





[jira] [Created] (DRILL-4259) Add new functional tests to ensure that failures can be detected independent of the testing environment

2016-01-11 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4259:
--

 Summary: Add new functional tests to ensure that failures can be 
detected independent of the testing environment
 Key: DRILL-4259
 URL: https://issues.apache.org/jira/browse/DRILL-4259
 Project: Apache Drill
  Issue Type: Test
Reporter: Jason Altekruse


In DRILL-4243 an out-of-memory issue was fixed after a change to the memory 
allocator made memory limits more strict. While the regression tests had been 
run by the team at Dremio prior to merging the patch, running the tests on a 
cluster with more cores changed the memory limits on the queries and caused 
several tests to fail.

While changes of this magnitude are not going to be common, we should have a 
test suite that reliably fails independent of the environment it is run in 
(assuming that there are sufficient resources for the tests to run).

It would be good to at least try to reproduce this failure on a few different 
setups (cores, nodes in the cluster) by adjusting the available configuration 
options and adding tests with those configurations, so that the tests will 
fail in different environments.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4206) Move all_text_mode and read_numbers_as_double options to the JSON format plugin and out of system/session

2015-12-16 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4206:
--

 Summary: Move all_text_mode and read_numbers_as_double options to 
the JSON format plugin and out of system/session
 Key: DRILL-4206
 URL: https://issues.apache.org/jira/browse/DRILL-4206
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.5.0
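
A rough sketch of what the end state could look like: the two options become properties of the JSON format configuration (deserialized by Jackson from the storage plugin config) rather than system/session options. The class and field names below are assumptions for illustration only, and the example assumes Jackson is on the classpath:

{code}
// Hypothetical sketch of a format-plugin configuration carrying the two JSON
// options as Jackson-mapped fields. Not Drill's actual classes.
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

public class JsonFormatOptionsSketch {

  static class JsonFormatConfigSketch {
    @JsonProperty("allTextMode")
    public boolean allTextMode = false;

    @JsonProperty("readNumbersAsDouble")
    public boolean readNumbersAsDouble = false;
  }

  public static void main(String[] args) throws Exception {
    // The kind of JSON block a user would put in the format plugin configuration.
    String json = "{\"allTextMode\": true, \"readNumbersAsDouble\": true}";
    JsonFormatConfigSketch config =
        new ObjectMapper().readValue(json, JsonFormatConfigSketch.class);
    System.out.println(config.allTextMode + " " + config.readNumbersAsDouble);
  }
}
{code}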






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4179) Update UDF documentation now that classpath scanning is more strict

2015-12-09 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4179:
--

 Summary: Update UDF documentation now that classpath scanning is 
more strict
 Key: DRILL-4179
 URL: https://issues.apache.org/jira/browse/DRILL-4179
 Project: Apache Drill
  Issue Type: Improvement
  Components: Documentation
Reporter: Jason Altekruse


A few issues have come up with users that have UDFs that could be found with 
1.0-1.2, but fail to be loaded with 1.3. The changes made in 1.3 to speed up 
finding all UDFs on the classpath made the setup a little more strict.

Some discussions on the topic:
DRILL-4178

http://search-hadoop.com/m/qRVAXvthcn1xIHUm/+add+your+package+to+drill.classpath.scanning=Re+UDFs+and+1+3



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4110) Avro tests are not verifying their results

2015-11-17 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4110:
--

 Summary: Avro tests are not verifying their results
 Key: DRILL-4110
 URL: https://issues.apache.org/jira/browse/DRILL-4110
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.3.0
Reporter: Jason Altekruse
Priority: Critical


A lot of tests have been written for the Avro format plugin that generate a 
variety of different files with various schema properties. These tests currently 
just verify that the files can be read without throwing any exceptions, but the 
results coming out of Drill are not being verified. Some of these tests were 
fixed as a part of DRILL-4056; the rest still need to be refactored to add 
baseline verification checks.
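
For reference, this is the shape of the change needed in each test: replace the run-it-and-ignore-results call with a baseline check. The fluent testBuilder() API below is assumed to match Drill's existing query-test framework, and the package path, file name, columns, and values are made up for illustration:

{code}
// Hedged sketch: verify query results instead of only checking that the query runs.
// Package path, column names, and data file are assumptions, not the real tests.
import org.apache.drill.BaseTestQuery;
import org.junit.Test;

public class TestAvroBaselineSketch extends BaseTestQuery {

  @Test
  public void simpleAvroQueryVerifiesResults() throws Exception {
    testBuilder()
        .sqlQuery("select a_string, an_int from cp.`avro/simple.avro`")
        .unOrdered()
        .baselineColumns("a_string", "an_int")
        .baselineValues("record_0", 0)
        .baselineValues("record_1", 1)
        .go(); // fails the test if the actual results differ from the baseline
  }
}
{code}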



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4028) Merge Drill parquet modifications back into the mainline project

2015-11-03 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-4028:
--

 Summary: Merge Drill parquet modifications back into the mainline 
project
 Key: DRILL-4028
 URL: https://issues.apache.org/jira/browse/DRILL-4028
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.3.0


Drill has been maintaining a fork of Parquet for over a year. The changes need 
to make it back into the main repository so we don't have to bother merging in 
all of the new changes from the master repository into the fork.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3943) CannotPlanException caused by ExpressionReductionRule

2015-10-16 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3943.

Resolution: Fixed

Fixed in 826144d89391dbadfc7fec84e633359c602bcd5a

> CannotPlanException caused by ExpressionReductionRule
> -
>
> Key: DRILL-3943
> URL: https://issues.apache.org/jira/browse/DRILL-3943
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Reporter: Sean Hsuan-Yi Chu
>Assignee: Jason Altekruse
> Fix For: 1.3.0
>
>
> For a query such as:
> {code}
> SELECT count(DISTINCT employee_id) as col1,
> count((to_number(date_diff(now(), cast(birth_date AS date)),''))) as col2
> FROM cp.`employee.json`
> {code}
> a CannotPlanException will be thrown because ExpressionReductionRule does not 
> properly simplify the call "now()".
> This issue is similar to DRILL-2808, but that one focuses on the error message.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3939) Drill fails to parse valid JSON object

2015-10-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3939.

Resolution: Duplicate

> Drill fails to parse valid JSON object
> --
>
> Key: DRILL-3939
> URL: https://issues.apache.org/jira/browse/DRILL-3939
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.0.0, 1.1.0, 1.2.0
> Environment: Redhat Linux 6.7 Java 1.7 , 1.8
>Reporter: Jaroslaw Sosnicki
>
> The following valid JSON object queried from DRILL using various clients:
> --- t.json start---
> {
>   "l1": {
>     "f1": "text1",
>     "f2": {
>       "command": "list",
>       "StorageArray": [
>         {
>           "array1": "Array1",
>           "Pool": {
>             "myPool": "PoolName"
>           }
>         },
>         {
>           "array2": "Arrays2",
>           "Pool": [
>             {
>               "myPool": "PoolName1"
>             },
>             {
>               "myPool": "PoolName2"
>             }
>           ]
>         }
>       ]
>     }
>   }
> }
> --- t.json end ---
> Generates the following error:
> 
> ERROR [HY000] [MapR][Drill] (1040) Drill failed to execute the query: SELECT 
> * FROM `dfs`.`hdvm`.`./t.json` LIMIT 100
> [30027]Query execution error. Details:[ 
> DATA_READ ERROR: You tried to write a Map type when you are using a 
> ValueWriter of type SingleMapWriter.
> File  /mapr/demo.mapr.com/data/hcs/hdvm/t.json
> Record  1
> Line  16
> Column  39
> Field  Pool
> Line  16
> Column  39
> Field  Pool
> Fragment 0:0
> [Error Id: 13e7a786-1135-410f-a4f0-877eab9222d6 on maprdemo:31010]
> ]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3899) SplitUpComplexExpressions rule should be enhanced to avoid planning unnecessary copies of data

2015-10-06 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3899:
--

 Summary: SplitUpComplexExpressions rule should be enhanced to 
avoid planning unnecessary copies of data
 Key: DRILL-3899
 URL: https://issues.apache.org/jira/browse/DRILL-3899
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse


A small enhancement was made as part of DRILL-3876 to remove an unnecessary 
copy in a simple flatten case. This was easy to implement, but did not cover 
all of the possible cases where the rewrite rule is currently planning 
inefficient operations. This issue is tracking the more complete fix to handle 
all of the more complex cases optimally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3783) Incorrect results : COUNT() over results returned by UNION ALL

2015-09-16 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3783.

Resolution: Not A Problem

> Incorrect results : COUNT() over results returned by UNION ALL 
> 
>
> Key: DRILL-3783
> URL: https://issues.apache.org/jira/browse/DRILL-3783
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
> Environment: 4 node cluster on CentOS
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Critical
> Fix For: 1.2.0
>
>
> Count over the results returned by a union all query returns incorrect results. The 
> below query previously returned an Exception (please see DRILL-2637); that JIRA was 
> marked as fixed, however the query now returns incorrect results. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from (select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`);
> +-+
> | EXPR$0  |
> +-+
> | 11  |
> | 100 |
> | 10  |
> | 2   |
> | 50  |
> | 55  |
> | 67  |
> | 113 |
> | 119 |
> | 89  |
> | 57  |
> | 61  |
> +-+
> 12 rows selected (0.753 seconds)
> {code}
> Results returned by the query on LHS and RHS of Union all operator are
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c1 from 
> `testWindow.csv`;
> +--+
> |  c1  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.197 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select cast(columns[0] as int) c2 from 
> `testWindow.csv`;
> +--+
> |  c2  |
> +--+
> | 100  |
> | 10   |
> | 2|
> | 50   |
> | 55   |
> | 67   |
> | 113  |
> | 119  |
> | 89   |
> | 57   |
> | 61   |
> +--+
> 11 rows selected (0.173 seconds)
> {code}
> Note that enclosing the queries within correct parentheses returns correct 
> results. We do not want to return incorrect results to the user when the 
> parentheses are missing.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select count(c1) from ((select cast(columns[0] 
> as int) c1 from `testWindow.csv`) union all (select cast(columns[0] as int) 
> c2 from `testWindow.csv`));
> +-+
> | EXPR$0  |
> +-+
> | 22  |
> +-+
> 1 row selected (0.234 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3669) fix missing direct dependency

2015-09-03 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3669.

Resolution: Fixed

Fixed in 4b8e85ad6fb40554e6752144f09bdfb474d62d9b

> fix missing direct dependency
> -
>
> Key: DRILL-3669
> URL: https://issues.apache.org/jira/browse/DRILL-3669
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Julien Le Dem
>Assignee: Jason Altekruse
> Attachments: DRILL-3669.1.patch.txt, DRILL-3669.2.patch.txt
>
>
> This prevents generating a compiling project with mvn eclipse:eclipse
> pull request here:
> https://github.com/apache/drill/pull/121/files



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3122) Changing a session option to default value results in status as changed

2015-08-04 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3122.

Resolution: Fixed
  Assignee: Sudheesh Katkam  (was: Jason Altekruse)

 Changing a session option to default value results in status as changed
 ---

 Key: DRILL-3122
 URL: https://issues.apache.org/jira/browse/DRILL-3122
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.0.0
Reporter: Ramana Inukonda Nagaraj
Assignee: Sudheesh Katkam
 Fix For: 1.2.0

 Attachments: DRILL-3122.1.patch.txt


 Alter the session option for hash join to true (which is the default) and the 
 following query shows the option as changed, which could be misleading to users 
 relying on the status field to see whether an option has changed or not, 
 especially in the case of a boolean value. 
 {code}
 0: jdbc:drill:zk=10.10.100.171:5181> select * from sys.options where name 
 like '%hash%';
 ++--+-+--+-+-+---++
 |name|   kind   |  type   |  status  
 |   num_val   | string_val  | bool_val  | float_val  |
 ++--+-+--+-+-+---++
 | exec.max_hash_table_size   | LONG | SYSTEM  | DEFAULT  
 | 1073741824  | null| null  | null   |
 | exec.min_hash_table_size   | LONG | SYSTEM  | DEFAULT  
 | 65536   | null| null  | null   |
 | planner.enable_hash_single_key | BOOLEAN  | SYSTEM  | DEFAULT  
 | null| null| true  | null   |
 | planner.enable_hashagg | BOOLEAN  | SYSTEM  | DEFAULT  
 | null| null| true  | null   |
 | planner.enable_hashjoin| BOOLEAN  | SYSTEM  | DEFAULT  
 | null| null| true  | null   |
 | planner.enable_hashjoin_swap   | BOOLEAN  | SYSTEM  | DEFAULT  
 | null| null| true  | null   |
 | planner.join.hash_join_swap_margin_factor  | DOUBLE   | SYSTEM  | DEFAULT  
 | null| null| null  | 10.0   |
 | planner.memory.hash_agg_table_factor   | DOUBLE   | SYSTEM  | DEFAULT  
 | null| null| null  | 1.1|
 | planner.memory.hash_join_table_factor  | DOUBLE   | SYSTEM  | DEFAULT  
 | null| null| null  | 1.1|
 ++--+-+--+-+-+---++
 9 rows selected (0.191 seconds)
 0: jdbc:drill:zk=10.10.100.171:5181> alter session set 
 `planner.enable_hashjoin`=true;
 +---+---+
 |  ok   |  summary  |
 +---+---+
 | true  | planner.enable_hashjoin updated.  |
 +---+---+
 1 row selected (0.083 seconds)
 0: jdbc:drill:zk=10.10.100.171:5181> select * from sys.options where name 
 like '%hash%';
 ++--+--+--+-+-+---++
 |name|   kind   |   type   |  status  
 |   num_val   | string_val  | bool_val  | float_val  |
 ++--+--+--+-+-+---++
 | exec.max_hash_table_size   | LONG | SYSTEM   | DEFAULT  
 | 1073741824  | null| null  | null   |
 | exec.min_hash_table_size   | LONG | SYSTEM   | DEFAULT  
 | 65536   | null| null  | null   |
 | planner.enable_hash_single_key | BOOLEAN  | SYSTEM   | DEFAULT  
 | null| null| true  | null   |
 | planner.enable_hashagg | BOOLEAN  | SYSTEM   | DEFAULT  
 | null| null| true  | null   |
 | planner.enable_hashjoin| BOOLEAN  | SYSTEM   | DEFAULT  
 | null| null| true  | null   |
 | planner.enable_hashjoin| BOOLEAN  | SESSION  | CHANGED  
 | null| null| true  | null   |
 | planner.enable_hashjoin_swap   | BOOLEAN  | SYSTEM   | DEFAULT  
 | null| null| true  | null   |
 | planner.join.hash_join_swap_margin_factor  | DOUBLE   | SYSTEM   | DEFAULT  
 | null| null| null  | 10.0   |
 | planner.memory.hash_agg_table_factor   | DOUBLE   | SYSTEM   | 

[jira] [Resolved] (DRILL-3341) Move OperatorWrapper list and FragmentWrapper list creation to ProfileWrapper ctor

2015-08-04 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3341.

Resolution: Fixed
  Assignee: Sudheesh Katkam  (was: Jason Altekruse)

 Move OperatorWrapper list and FragmentWrapper list creation to ProfileWrapper 
 ctor
 --

 Key: DRILL-3341
 URL: https://issues.apache.org/jira/browse/DRILL-3341
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Sudheesh Katkam
Assignee: Sudheesh Katkam
Priority: Minor
 Fix For: 1.2.0

 Attachments: DRILL-3341.1.patch.txt, DRILL-3341.2.patch.txt


 + avoid re-computation in some cases
 + consistent comparator names



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3521) [umbrella] Review switch statements throughout codebase to add default cases where there are none

2015-07-20 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3521:
--

 Summary: [umbrella] Review switch statements throughout codebase 
to add default cases where there are none
 Key: DRILL-3521
 URL: https://issues.apache.org/jira/browse/DRILL-3521
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Reporter: Jason Altekruse
Assignee: Jason Altekruse


There are a number of places in the code that are missing default branches on 
case statements. One particular instance I noticed is in 
OptionValue.createOption, which returns null if passed an unexpected type. This 
and a few other places in the code could be made a little nicer to work with if 
we just provided the standard behavior of throwing an exception.

One additional note: in a number of places where we do have defaults, the 
exception thrown is an UnsupportedOperationException with no message. 
DRILL-2680 discusses this problem, so it might be handled over there, but as 
we fill in the switch defaults we should try to avoid introducing the new 
problem of exceptions lacking descriptions.
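
A small generic illustration of the change being asked for (the enum and method here are hypothetical; OptionValue.createOption is the real motivating example, but this is not its code):

{code}
// Generic illustration: fail loudly in the default branch instead of silently
// returning null when an unexpected value shows up.
public class SwitchDefaultExample {

  enum Kind { BOOLEAN, LONG, DOUBLE, STRING }

  static Object parse(Kind kind, String raw) {
    switch (kind) {
      case BOOLEAN:
        return Boolean.valueOf(raw);
      case LONG:
        return Long.valueOf(raw);
      case DOUBLE:
        return Double.valueOf(raw);
      case STRING:
        return raw;
      default:
        // The default branch guards against future enum values; the message
        // identifies exactly which value was unexpected.
        throw new IllegalStateException("Unexpected option kind: " + kind);
    }
  }

  public static void main(String[] args) {
    System.out.println(parse(Kind.LONG, "42"));
  }
}
{code}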



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3483) Clarify CommonConstants' constants.

2015-07-16 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3483.

Resolution: Fixed

Fixed in 9b351c945b5f10d27cf07b9b5c1a435a029614b7

 Clarify CommonConstants' constants.
 ---

 Key: DRILL-3483
 URL: https://issues.apache.org/jira/browse/DRILL-3483
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)
Assignee: Jason Altekruse
 Fix For: 1.2.0


 Document, rename, and otherwise clean up CommonConstants' constants.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3484) Error using functions with no parameters when `drill.exec.functions.cast_empty_string_to_null` is set to true

2015-07-09 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3484:
--

 Summary: Error using functions with no parameters when 
`drill.exec.functions.cast_empty_string_to_null` is set to true
 Key: DRILL-3484
 URL: https://issues.apache.org/jira/browse/DRILL-3484
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.1.0, 1.0.0
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.2.0


There is an issue with function materialization when the function has no 
parameters and this option is set. This causes a low-level IOOB exception to be 
thrown.

0: jdbc:drill:zk=local> select *, random() from sys.drillbits;
Error: SYSTEM ERROR: IndexOutOfBoundsException: index (0) must be less than 
size (0)

Fragment 0:0

[Error Id: 5853c1da-ea8d-41c3-812c-2fdde799803b on 10.250.50.33:31010] 
(state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1094) Using events to parse JSON

2015-07-07 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1094.

   Resolution: Fixed
Fix Version/s: (was: Future)
   1.0.0

The issues described here were fixed a few releases ago; closing this to 
get it out of the list of future bugs.

 Using events to parse JSON
 --

 Key: DRILL-1094
 URL: https://issues.apache.org/jira/browse/DRILL-1094
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - JSON
Affects Versions: Future
Reporter: Sudheesh Katkam
Assignee: Neeraja
 Fix For: 1.0.0


 + Define events to parse JSON.
 + Add project pushdown and flatten.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2745) Query returns IOB Exception when JSON data with empty arrays is input to flatten function

2015-07-06 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2745.

Resolution: Not A Problem

 Query returns IOB Exception when JSON data with empty arrays is input to 
 flatten function
 -

 Key: DRILL-2745
 URL: https://issues.apache.org/jira/browse/DRILL-2745
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 0.9.0
 Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580: 
 Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53 
 EDT 
Reporter: Khurram Faraaz
Assignee: Khurram Faraaz
 Fix For: 1.2.0


 An IOB Exception is returned when a JSON file that has many empty arrays and 
 arrays with different types of data is passed to the flatten function.
 Tested on a 4 node cluster on CentOS.
 {code}
 0: jdbc:drill: select flatten(outkey) from `nestedJArry.json` ;
 Query failed: RemoteRpcException: Failure while running fragment., index: 
 176, length: 4 (expected: range(0, 176)) [ 
 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
 [ 2627cf84-9dfb-4077-8531-9955ecdbdec7 on centos-02.qa.lab:31010 ]
 Error: exception while executing query: Failure while executing query. 
 (state=,code=0)
 0: jdbc:drill: select outkey from `nestedJArry.json`;
 ++
 |   outkey   |
 ++
 | 
 [[100,1000,200,99,1,0,-1,10],[a,b,c,d,e,p,o,f,m,q,d,s,v],[2012-04-01,1998-02-20,2011-08-05,1992-01-01],[10:30:29.123,12:29:21.999],[sdfklgjsdlkjfghlsidhfgopiuesrtoipuertoiurtyoiurotuiydkfjlbn,bfn;waokefpqowertoipuwergklnjdfbpdsiofgoigiuewqrqiugkjehgjksdhbvkjshdfkjsdfbnlkfbkljrghljrelkhbdlkfjbgkdfjbgkndfbnkldfgklbhjdflkghjlnkoiurty984756897345609782-3458745uiyoheirluht7895e6y],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[null],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[test
  string,hello world!,just do it!,houston we have a 
 problem],[1,2,3,4,5,6,7,8,9,0]] |
 ++
 1 row selected (0.088 seconds)
 Stack trace from drillbit.log
 2015-04-09 23:54:41,965 [2ad8eebd-adb6-6f7e-469e-4bb8ca276984:frag:0:0] WARN  
 o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
 fragment
 java.lang.IndexOutOfBoundsException: index: 176, length: 4 (expected: 
 range(0, 176))
 at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:187) 
 ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
 at io.netty.buffer.DrillBuf.chk(DrillBuf.java:209) 
 ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
 at io.netty.buffer.DrillBuf.setInt(DrillBuf.java:513) 
 ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
 at 
 org.apache.drill.exec.vector.UInt4Vector$Mutator.set(UInt4Vector.java:363) 
 ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.vector.RepeatedVarCharVector.splitAndTransferTo(RepeatedVarCharVector.java:173)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.vector.RepeatedVarCharVector$TransferImpl.splitAndTransfer(RepeatedVarCharVector.java:200)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.test.generated.FlattenerGen1107.flattenRecords(FlattenTemplate.java:106)
  ~[na:na]
 at 
 org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.doWork(FlattenRecordBatch.java:156)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
  ~[drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
  

[jira] [Resolved] (DRILL-1754) Flatten nested within another flatten fails

2015-07-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1754.

   Resolution: Duplicate
Fix Version/s: (was: 1.2.0)

 Flatten nested within another flatten fails
 ---

 Key: DRILL-1754
 URL: https://issues.apache.org/jira/browse/DRILL-1754
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Jason Altekruse
Assignee: Jason Altekruse

 A query that tries to flatten a repeated list and then flatten the resulting 
 simple lists fails in execution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1770) Flatten on top a subquery which applies flatten over kvgen results in a ClassCastException

2015-07-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1770.

   Resolution: Fixed
Fix Version/s: (was: 1.2.0)
   0.8.0

This was actually merged a while ago; it just wasn't updated here.

Fixed in 09aa34b68c97a20412e9917d2ab6bf182477beb4

 Flatten on top a subquery which applies flatten over kvgen results in a 
 ClassCastException
 --

 Key: DRILL-1770
 URL: https://issues.apache.org/jira/browse/DRILL-1770
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Minor
 Fix For: 0.8.0

 Attachments: DRILL_1770.patch


 git.commit.id.abbrev=108d29f
 Dataset :
 {code}
 {"map": {"rm": [ {"rptd": [{ "a": "foo"}]} ]}}
 {code}
 Query :
 {code}
 select flatten(sub.fk.`value`) from (select flatten(kvgen(map)) fk from 
 `nested.json`) sub;
 Query failed: Failure while running fragment., 
 org.apache.drill.exec.vector.NullableIntVector cannot be cast to 
 org.apache.drill.exec.vector.RepeatedVector
 {code}
 Let me know if you need more information.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1753) Flatten fails on a repeated map, where the maps being flattened contain repeated lists

2015-07-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1753.

   Resolution: Fixed
Fix Version/s: (was: 1.2.0)
   1.1.0

 Flatten fails on a repeated map, where the maps being flattened contain 
 repeated lists
 --

 Key: DRILL-1753
 URL: https://issues.apache.org/jira/browse/DRILL-1753
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators, Functions - Drill
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.1.0

 Attachments: error.log


 We currently fail to flatten the following data. The issue is the repeated 
 list nested inside of the map, which is not being copied properly during the 
 flatten operation.
 {
   "r_map_3" : [
     { "d" : [ [1021, 1022], [1] ] },
     { "d" : [ [1010] ] }
   ]
 }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2105) Query fails when using flatten on JSON data where some documents have an empty array

2015-07-02 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2105.

   Resolution: Fixed
Fix Version/s: (was: 1.2.0)
   0.8.0

It looks like this might have been reported with the wrong stacktrace, but 
Andries said he hasn't seen this issue so I'm closing it.

 Query fails when using flatten on JSON data where some documents have an 
 empty array
 

 Key: DRILL-2105
 URL: https://issues.apache.org/jira/browse/DRILL-2105
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 0.7.0
 Environment: MFS with JSON
Reporter: Andries Engelbrecht
Assignee: Jason Altekruse
 Fix For: 0.8.0


 A Drill query fails when using flatten on an array where some records contain 
 an empty array, especially with larger data sets where the number of JSON 
 documents is greater than 100k.
 Using twitter data as a sample.
 select flatten (entities.hashtags) from dfs.foo.`file.json`;
 Empty array
   "entities": {
     "trends": [],
     "symbols": [],
     "urls": [
       {
         "expanded_url": "http://on.nfl.com/1BkThQF",
         "indices": [118, 140],
         "display_url": "on.nfl.com/1BkThQF",
         "url": "http://t.co/Unr5KFy6hG"
       }
     ],
     "hashtags": [],
     "user_mentions": [
       {
         "id": 19362299,
         "name": "NFL Network",
         "indices": [3, 14],
         "screen_name": "nflnetwork",
         "id_str": 19362299
       }
     ]
   },
 Array with content
   "entities": {
     "trends": [],
     "symbols": [],
     "urls": [],
     "hashtags": [
       {
         "text": "djpreps",
         "indices": [47, 55]
       },
       {
         "text": "MSPreps",
         "indices": [56, 64]
       }
     ],
     "user_mentions": []
   },
 Log output
 2015-01-27 02:26:13,478 [2b3908b9-cf08-3fd5-3bd8-ebb6bb5b70f1:foreman] INFO  
 o.a.d.e.store.mock.MockStorageEngine - Failure while attempting to check for 
 Parquet metadata file.
 java.io.IOException: Open failed for file: /data/twitter/nfl/2015, error: 
 Invalid argument (22)
 at com.mapr.fs.MapRClientImpl.open(MapRClientImpl.java:191) 
 ~[maprfs-4.0.1.28318-mapr.jar:4.0.1.28318-mapr]
 at com.mapr.fs.MapRFileSystem.open(MapRFileSystem.java:776) 
 ~[maprfs-4.0.1.28318-mapr.jar:4.0.1.28318-mapr]
 at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:800) 
 ~[hadoop-common-2.4.1-mapr-1408.jar:na]
 at 
 org.apache.drill.exec.store.dfs.shim.fallback.FallbackFileSystem.open(FallbackFileSystem.java:94)
  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.dfs.BasicFormatMatcher$MagicStringMatcher.matches(BasicFormatMatcher.java:138)
  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.dfs.BasicFormatMatcher.isReadable(BasicFormatMatcher.java:107)
  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isDirReadable(ParquetFormatPlugin.java:232)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.parquet.ParquetFormatPlugin$ParquetFormatMatcher.isReadable(ParquetFormatPlugin.java:212)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory.create(WorkspaceSchemaFactory.java:141)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory.create(WorkspaceSchemaFactory.java:58)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:273)
  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
 at 
 net.hydromatic.optiq.jdbc.SimpleOptiqSchema.getTable(SimpleOptiqSchema.java:75)
  [optiq-core-0.9-drill-r12.jar:na]
 at 
 net.hydromatic.optiq.prepare.OptiqCatalogReader.getTableFrom(OptiqCatalogReader.java:87)
  [optiq-core-0.9-drill-r12.jar:na]
 at 
 net.hydromatic.optiq.prepare.OptiqCatalogReader.getTable(OptiqCatalogReader.java:70)
  [optiq-core-0.9-drill-r12.jar:na]
 

[jira] [Resolved] (DRILL-3370) FLATTEN error with a where clause

2015-06-29 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3370.

Resolution: Fixed

Fixed in a915085e8a8b4255ff659086d047cc5dd874a5bf

 FLATTEN error with a where clause
 -

 Key: DRILL-3370
 URL: https://issues.apache.org/jira/browse/DRILL-3370
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Affects Versions: 1.0.0
Reporter: Joanlynn LIN
Assignee: Jason Altekruse
 Fix For: 1.1.0

 Attachments: DRILL-3370.patch, jsonarray.150.json


 I've got a JSON file which contains 150 JSON strings all like this:
 {"arr": [94]}
 {"arr": [39]}
 {"arr": [180]}
 I was trying to Flatten() the arrays and filter the values using such an SQL 
 query:
 select flatten(arr) as a from dfs.`/data/test/jsonarray.150.json` where a 
  100;
 However, it returned no result. Then I modified my expression like this:
   select a from (select flatten(arr) as a from 
 dfs.`/data/test/jsonarray.150.json`) where a  100;
 It then failed:
 Error: SYSTEM ERROR: 
 org.apache.drill.exec.exception.SchemaChangeException: Failure while trying 
 to materialize incoming schema.  Errors:
 Error in expression at index -1.  Error: Missing function implementation: 
 [flatten(BIGINT-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
 Fragment 0:0
 [Error Id: 1d71bf0e-48da-43f8-8b36-6a513120d7e0 on slave2:31010] 
 (state=,code=0)
 After a lot of attempts, I finally got it to work:
 select a from (select flatten(arr) as a from 
 dfs.`/data/test/jsonarray.150.json` limit 1000) where a  100;
 See, I just added a limit 1000 in this query and I am wondering if this 
 is a bug or what in Drill?
 Looking forward to your attention and help. Many thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1616) Add support for count() on maps and arrays

2015-06-23 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1616.

Resolution: Duplicate

 Add support for count() on maps and arrays
 --

 Key: DRILL-1616
 URL: https://issues.apache.org/jira/browse/DRILL-1616
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - JSON
Reporter: Abhishek Girish
Assignee: Jason Altekruse
Priority: Minor

  Count(field) throws errors on fields which are objects or arrays, and these 
  errors are not clean; they do not indicate an error in usage. Also, count on 
  objects/arrays should be supported. 
  select * from `abc.json`;
 ++++++
 |  field_1   |  field_2   |  field_3   |  field_4   |  field_5   |
 ++++++
 | [1]  | null   | {inner_3:[]} | {inner_1:[],inner_3:{}} | [] 
 |
 | [5]  | 2  | {inner_1:2,inner_3:[]} | 
 {inner_1:[1,2,3],inner_2:3,inner_3:{inner_object_field_1:2}}
  | [{inner_list:[1,null,6],inner_ |
 | [5,10,15] | A wild string appears! | 
 {inner_1:5,inner_2:3,inner_3:[{},{inner_object_field_1:10}]} | 
 {inner_1:[4,5,6],inner_2:3,inner_3:{}} | [{ |
 ++++++
 3 rows selected (0.081 seconds)
  select count(field_1) from `abc.json`;
 Query failed: Failure while running fragment., Schema is currently null.  You 
 must call buildSchema(SelectionVectorMode) before this container can return a 
 schema. [ b6f021f9-213e-475e-83f4-a6facf6fd76d on abhi7.qa.lab:31010 ]
 Error: exception while executing query: Failure while executing query. 
 (state=,code=0)
 Error is seen on fields 1,3,4,5. 
 The issue is not seen when array index is specified. 
  select count(field_1[0]) from `abc.json`;
 ++
 |   EXPR$0   |
 ++
 | 3  |
 ++
 1 row selected (0.152 seconds)
 Or when the element in the object is specified:
  select count(t.field_3.inner_3) from `textmode.json` as t;
 ++
 |   EXPR$0   |
 ++
 | 3  |
 ++
 1 row selected (0.155 seconds)
 LOG:
 2014-10-30 13:28:20,286 [a90cc246-e60b-452b-ba96-7f79709f5ffa:frag:0:0] ERROR 
 o.a.d.e.w.f.AbstractStatusReporter - Error 
 bc438332-0828-4a86-8063-9dc8c5a703d9: Failure while running fragment.
 java.lang.NullPointerException: Schema is currently null.  You must call 
 buildSchema(SelectionVectorMode) before this container can return a schema.
 at 
 com.google.common.base.Preconditions.checkNotNull(Preconditions.java:208) 
 ~[guava-14.0.1.jar:na]
 at 
 org.apache.drill.exec.record.VectorContainer.getSchema(VectorContainer.java:273)
  
 ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.getSchema(AbstractRecordBatch.java:116)
  
 ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.getSchema(IteratorValidatorBatchIterator.java:75)
  
 ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema(ScreenCreator.java:100)
  
 ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:103)
  
 ~[drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
 at 
 org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:249)
  
 [drill-java-exec-0.7.0-incubating-SNAPSHOT-rebuffed.jar:0.7.0-incubating-SNAPSHOT]
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  [na:1.7.0_65]
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  [na:1.7.0_65]
 at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3323) Flatten planning rule creates unneeded copy of the list being flattened, causes execution/allocation issues with large lists

2015-06-19 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3323:
--

 Summary: Flatten planning rule creates unneeded copy of the list 
being flattened, causes execution/allocation issues with large lists
 Key: DRILL-3323
 URL: https://issues.apache.org/jira/browse/DRILL-3323
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.1.0


The planning rule for flatten was written not only to handle the flatten 
operator, but also to address some shortcomings in expression evaluation 
involving complex types. The rule currently plans inefficiently to try to cover 
some of these more advanced cases, but there is not thorough test coverage to 
even demonstrate the benefits of doing so. We should disable the particular 
behavior of copying complex data an extra time when it is not needed, because 
it is causing flatten queries to fail with allocation issues. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3263) Read smallint and tinyint data from hive as integer until these types are well supported throughout Drill

2015-06-18 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-3263.

Resolution: Fixed

Fixed in 437706f750b0ec50b60582ea2c47e7017e2718e3

 Read smallint and tinyint data from hive as integer until these types are 
 well supported throughout Drill
 -

 Key: DRILL-3263
 URL: https://issues.apache.org/jira/browse/DRILL-3263
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types, Storage - Hive
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.1.0






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3209) [Umbrella] Plan reads of Hive tables as native Drill reads when a native reader for the underlying table format exists

2015-05-28 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3209:
--

 Summary: [Umbrella] Plan reads of Hive tables as native Drill 
reads when a native reader for the underlying table format exists
 Key: DRILL-3209
 URL: https://issues.apache.org/jira/browse/DRILL-3209
 Project: Apache Drill
  Issue Type: Improvement
  Components: Query Planning & Optimization, Storage - Hive
Reporter: Jason Altekruse
Assignee: Jason Altekruse






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3116) Headers do not resize in enhanced sqlline that correctly resizes columns to nicely format data

2015-05-15 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3116:
--

 Summary: Headers do not resize in enhanced sqlline that correctly 
resizes columns to nicely format data
 Key: DRILL-3116
 URL: https://issues.apache.org/jira/browse/DRILL-3116
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - CLI
Reporter: Jason Altekruse
Assignee: Jason Altekruse






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3092) Memory leak when an allocation fails near the creation of a RecordBatchData object

2015-05-14 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-3092:
--

 Summary: Memory leak when an allocation fails near the creation of 
a RecordBatchData object
 Key: DRILL-3092
 URL: https://issues.apache.org/jira/browse/DRILL-3092
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse
Assignee: Jason Altekruse


A number of locations in the code need try/finally blocks around the code that 
interacts with the buffers stored in a RecordBatchData object. Runtime 
exceptions (like running out of memory) in these code blocks can cause buffers 
to leak.
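
A minimal, hypothetical sketch of the try/finally pattern being described; the BufferHolder type below stands in for buffer-owning objects like RecordBatchData and is not Drill's API:

{code}
// Hypothetical sketch: guard buffer ownership with try/finally so a failure
// part-way through construction does not leak the buffers acquired so far.
import java.util.ArrayList;
import java.util.List;

public class LeakGuardExample {

  // Stand-in for a buffer-owning object such as RecordBatchData.
  static final class BufferHolder implements AutoCloseable {
    private final List<long[]> buffers = new ArrayList<>();

    void allocate(int size) {
      if (size > 1_000_000) {
        throw new RuntimeException("simulated allocation failure");
      }
      buffers.add(new long[size]);
    }

    @Override
    public void close() {
      buffers.clear(); // release everything that was acquired so far
    }
  }

  static BufferHolder buildBatch(int[] sizes) {
    BufferHolder holder = new BufferHolder();
    boolean success = false;
    try {
      for (int size : sizes) {
        holder.allocate(size); // may throw part-way through the loop
      }
      success = true;
      return holder;
    } finally {
      if (!success) {
        holder.close(); // without this, buffers allocated before the failure leak
      }
    }
  }

  public static void main(String[] args) {
    try {
      buildBatch(new int[] {10, 20, 2_000_000});
    } catch (RuntimeException e) {
      System.out.println("allocation failed, but no buffers were leaked");
    }
  }
}
{code}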



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2960) Default hive storage plugin missing from fresh drill install

2015-05-09 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2960.

Resolution: Fixed

Fixed in d4f9bf2e994969c863b2b90b58e90139d242b106

 Default hive storage plugin missing from fresh drill install
 

 Key: DRILL-2960
 URL: https://issues.apache.org/jira/browse/DRILL-2960
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive
Affects Versions: 0.9.0
Reporter: Krystal
Assignee: Jason Altekruse
 Fix For: 1.0.0

 Attachments: 2960.patch


 Installed drill on a fresh node.  The default storage plugin for hive is 
 missing from the webUI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1460) JsonReader fails reading files with decimal numbers and integers in the same field

2015-05-07 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1460.

Resolution: Fixed

 JsonReader fails reading files with decimal numbers and integers in the same 
 field
 --

 Key: DRILL-1460
 URL: https://issues.apache.org/jira/browse/DRILL-1460
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Affects Versions: 0.6.0, 0.7.0
Reporter: Bhallamudi Venkata Siva Kamesh
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.0.0

 Attachments: DRILL-1460.1.patch.txt, DRILL-1460.2.patch.txt


 Used the following dataset: 
 http://thecodebarbarian.wordpress.com/2014/03/28/plugging-usda-nutrition-data-into-mongodb
 Executed the following query
 {noformat}select t.nutrients from dfs.usda.`usda.json` t limit 1;{noformat}
 and it failed with the following exception
 {noformat}
 2014-09-27 17:48:39,421 [b9dfbb9b-29a9-425d-801c-2e418533525f:frag:0:0] ERROR 
 o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error 
 0568d90a-d7df-4a5d-87e9-8b9f718dffa4: Screen received stop request sent.
 java.lang.IllegalArgumentException: You tried to write a BigInt type when you 
 are using a ValueWriter of type NullableFloat8WriterImpl.
   at 
 org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.fail(AbstractFieldWriter.java:513)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.write(AbstractFieldWriter.java:145)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.impl.NullableFloat8WriterImpl.write(NullableFloat8WriterImpl.java:88)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:257)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:310)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:204)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.write(JsonReader.java:134) 
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReaderWithState.write(JsonReaderWithState.java:65)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.store.easy.json.JSONRecordReader2.next(JSONRecordReader2.java:111)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
 {noformat}
 {noformat}select t.nutrients[0].units from dfs.usda.`usda.json` t limit 
 1;{noformat}
 and it failed with the following exception
 {noformat}
 2014-09-27 17:50:04,394 [9ee8a529-17fd-492f-9cba-2d1f5842eae1:frag:0:0] ERROR 
 o.a.d.e.p.i.ScreenCreator$ScreenRoot - Error 
 c4c6bffd-b62b-4878-af1e-58db64453307: Screen received stop request sent.
 java.lang.IllegalArgumentException: You tried to write a BigInt type when you 
 are using a ValueWriter of type NullableFloat8WriterImpl.
   at 
 org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.fail(AbstractFieldWriter.java:513)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.impl.AbstractFieldWriter.write(AbstractFieldWriter.java:145)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.impl.NullableFloat8WriterImpl.write(NullableFloat8WriterImpl.java:88)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:257)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:310)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.writeData(JsonReader.java:204)
  
 ~[drill-java-exec-0.6.0-incubating-SNAPSHOT-rebuffed.jar:0.6.0-incubating-SNAPSHOT]
   at 
 org.apache.drill.exec.vector.complex.fn.JsonReader.write(JsonReader.java:134) 
 

[jira] [Resolved] (DRILL-2772) Display status of query when viewing the query's profile page

2015-05-07 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2772.

Resolution: Fixed

Fixed in fd337efbcaabb15ec0c5f3336848f4347e42cf27

 Display status of query when viewing the query's profile page
 -

 Key: DRILL-2772
 URL: https://issues.apache.org/jira/browse/DRILL-2772
 Project: Apache Drill
  Issue Type: Improvement
  Components: Client - HTTP
Affects Versions: 0.8.0
 Environment: RHEL 6.4
Reporter: Kunal Khatua
Assignee: Jason Altekruse
 Fix For: 1.2.0

 Attachments: DRILL-2772.1.patch.txt


 When viewing the profile of a query that has run/executed a while ago, it 
 would be helpful to see whether the query was marked as completed, cancelled 
 or failed. 
 The summary on the http://hostname:8047/profiles page shows the status but 
 none of the profile pages show this information. Since the summary is limited 
 to the last 100 queries, having the status would help.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2569) Minor fragmentId in Profile UI gets truncated to the last 2 digits

2015-05-07 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2569.

   Resolution: Fixed
Fix Version/s: (was: 1.2.0)
   1.0.0

 Minor fragmentId in Profile UI gets truncated to the last 2 digits
 --

 Key: DRILL-2569
 URL: https://issues.apache.org/jira/browse/DRILL-2569
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - HTTP
Affects Versions: 0.9.0
Reporter: Krystal
Assignee: Jason Altekruse
 Fix For: 1.0.0

 Attachments: DRILL-2569.1.patch.txt


 git.commit.id.abbrev=8493713
  When the profile json contains a minorFragmentId > 99, the UI only displays the 
  last 2 digits. For example, if minorFragmentId=100, it is being displayed as 
  00. Here is a snippet of such data from the profile UI:
  04-xx-03 - PARQUET_ROW_GROUP_SCAN
  Minor Fragment  Setup  Process  Wait   Max Batches  Max Records  Peak Mem
  04-98-03        0.000  3.807    1.795  0            0            15MB
  04-99-03        0.000  3.247    3.111  0            0            24MB
  04-00-03        0.000  3.163    2.545  0            0            20MB
  04-01-03        0.000  3.272    2.278  0            0            15MB
  04-02-03        0.000  3.496    2.004  0            0            15MB



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2508) Add new column to sys.options table that exposes whether or not the current system value is the default

2015-05-07 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2508.

Resolution: Fixed

Fixed in d12bee05a8f6e974c70d5d2a94176b176d7dba5b

 Add new column to sys.options table that exposes whether or not the current 
 system value is the default
 ---

 Key: DRILL-2508
 URL: https://issues.apache.org/jira/browse/DRILL-2508
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Other
Reporter: Victoria Markman
Assignee: Jason Altekruse
 Fix For: 1.0.0

 Attachments: DRILL-2508.1.patch.txt, DRILL-2508.2.patch.txt


  Need to be able to see the system parameters that I changed.
  There is already an enhancement to reset them to default values: DRILL-1065.
  I don't necessarily want to do that; I just want to see only the things that I 
  changed: default value vs. my change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1545) Json files can only be read when they have a .json extension

2015-05-07 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1545.

Resolution: Fixed

 Json files can only be read when they have a .json extension
 

 Key: DRILL-1545
 URL: https://issues.apache.org/jira/browse/DRILL-1545
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - JSON
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.0.0

 Attachments: DRILL-1545.2.patch.txt, DRILL-1545.3.patch.txt


  It seems that Drill can only discover JSON data if the file extension is 
  .json.
  We have tried to add the file extension .log as type json in the Storage 
  Plugin (and validated the JSON), but without success. 
  It would be great if somebody could share an example config or has an idea.
 Storage Plugin Configuration.
  {
    "type": "file",
    "enabled": true,
    "connection": "maprfs:///",
    "workspaces": {
      "root": {
        "location": "/",
        "writable": false,
        "storageformat": null
      },
      "tmp": {
        "location": "/tmp",
        "writable": true,
        "storageformat": "csv"
      }
    },
    "formats": {
      "log": {
        "type": "json",
        "extensions": ["log"]
      },
      "psv": {
        "type": "text",
        "extensions": ["tbl"],
        "delimiter": "|"
      },
      "csv": {
        "type": "text",
        "extensions": ["csv"],
        "delimiter": ","
      },
      "tsv": {
        "type": "text",
        "extensions": ["tsv"],
        "delimiter": "\t"
      },
      "parquet": {
        "type": "parquet"
      },
      "json": {
        "type": "json"
      }
    }
  }



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2228) Projecting '*' returns all nulls when we have flatten in a filter and order by

2015-05-04 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2228.

Resolution: Later

Flatten has been disabled in the order by clause by DRILL-2181. We will look at 
how best to re-enable this past 1.0

 Projecting '*' returns all nulls when we have flatten in a filter and order by
 --

 Key: DRILL-2228
 URL: https://issues.apache.org/jira/browse/DRILL-2228
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.0.0


 git.commit.id.abbrev=3d863b5
  The below query currently returns all nulls:
 {code}
  0: jdbc:drill:schema=dfs_eea> select * from `data.json` where 2 in (select 
  flatten(lst_lst[1]) from `data.json`) order by flatten(lst_lst[1]);
 ++
 | *  |
 ++
 | null   |
 | null   |
 | null   |
 | null   |
 | null   |
 | null   |
 | null   |
 | null   |
 | null   |
 | null   |
 ++
 {code}
  There seems to be another issue here since the number of records returned also 
  does not look right. I will raise a separate JIRA for that.
  The issue goes away if we do an order by without the flatten. The below query 
  works:
 {code}
 select * from `data.json` where 2 in (select flatten(lst_lst[1]) from 
 `data.json`) order by uid;
 {code}
 Attached the data files



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2232) Flatten functionality not well defined when we use flatten in an order by without projecting it

2015-05-04 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2232.

Resolution: Later

Flatten has been disabled in the order by clause by DRILL-2181. We will look at how 
best to re-enable this past 1.0.

 Flatten functionality not well defined when we use flatten in an order by 
 without projecting it
 ---

 Key: DRILL-2232
 URL: https://issues.apache.org/jira/browse/DRILL-2232
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.0.0


 git.commit.id.abbrev=3d863b5
 Data Set :
 {code}
  {
    "id" : 1,
    "lst" : [1,2,3,4]
  }
 {code}
  The below query returns 4 rows instead of 1. The expected behavior in this 
  case is not documented properly.
 {code}
 select id from `data.json` where 2 in (select flatten(lst) from `data.json`) 
 order by flatten(lst);
 ++
 | id |
 ++
 | 1  |
 | 1  |
 | 1  |
 | 1  |
 ++
 {code}
  The below query projects a flatten. 
 {code}
  0: jdbc:drill:schema=dfs_eea> select id, flatten(lst) from `temp.json` where 
  2 in (select flatten(lst) from `temp.json`) order by flatten(lst);
 +++
 | id |   EXPR$1   |
 +++
 | 1  | 1  |
 | 1  | 2  |
 | 1  | 3  |
 | 1  | 4  |
 +++
 {code}
  We can agree on one of the 3 possibilities when flatten is not projected:
  1. Irrespective of whether flatten is in the select list or not, we would 
  still return more records based on flatten in the order by
  2. Flatten in the order by clause does not change the number of records we return
  3. Using flatten in an order by (or probably group by) is not supported
  Whatever we agree on, we should document it more clearly. Let me know your 
  thoughts.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2208) Error message must be updated when query contains operations on a flattened column

2015-05-04 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2208.

   Resolution: Duplicate
Fix Version/s: (was: 1.0.0)

 Error message must be updated when query contains operations on a flattened 
 column
 --

 Key: DRILL-2208
 URL: https://issues.apache.org/jira/browse/DRILL-2208
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning  Optimization
Affects Versions: 0.8.0
Reporter: Abhishek Girish
Assignee: Jason Altekruse
Priority: Minor
 Attachments: drillbit_flatten.log


  Currently I observe that if there is a flatten/kvgen operation applied on a 
  column, no further operations can be performed on the said column unless it 
  is wrapped inside a nested query. 
  Consider a simple flatten/kvgen operation on a complex JSON file:
  select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f limit 1;
 ++
 | p  |
 ++
 | {key:languages,value:{text:Mandarin Chinese 12.44%, Spanish 4.85%, 
 English 4.83%, Arabic 3.25%, Hindi 2.68%, Bengali 2.66%, Portuguese 2.62%, 
 Russian 2.12%, Japanese 1.8%, Standard German 1.33%, Javanese 1.25% (2009 
 est.),note_1:percents are for \first language\ speakers only; the six 
 UN languages - Arabic, Chinese (Mandarin), English, French, Russian, and 
 Spanish (Castilian) - are the mother tongue or second language of about half 
 of the world's population, and are the official languages in more than half 
 the states in the world; some 150 to 200 languages have more than a million 
 speakers,note_2:all told, there are an estimated 7,100 languages spoken 
 in the world; aproximately 80% of these languages are spoken by less than 
 100,000 people; about 50 languages are spoken by only 1 person; communities 
 that are isolated from each other in mountainous regions often develop 
 multiple languages; Papua New Guinea, for example, boasts about 836 separate 
 languages,note_3:approximately 2,300 languages are spoken in Asia, 2,150, 
 in Africa, 1,311 in the Pacific, 1,060 in the Americas, and 280 in Europe}} |
 | {key:religions,value:{text:Christian 33.39% (of which Roman 
 Catholic 16.85%, Protestant 6.15%, Orthodox 3.96%, Anglican 1.26%), Muslim 
 22.74%, Hindu 13.8%, Buddhist 6.77%, Sikh 0.35%, Jewish 0.22%, Baha'i 0.11%, 
 other religions 10.95%, non-religious 9.66%, atheists 2.01% (2010 est.)}} |
 | {key:population,value:{text:7,095,217,980 (July 2013 
 est.),top_ten_most_populous_countries_in_millions:China 1,349.59; India 
 1,220.80; United States 316.67; Indonesia 251.16; Brazil 201.01; Pakistan 
 193.24; Nigeria 174.51; Bangladesh 163.65; Russia 142.50; Japan 127.25}} |
 | {key:age_structure,value:{0_14_years:26% (male 953,496,513/female 
 890,372,474),15_24_years:16.8% (male 614,574,389/female 
 579,810,490),25_54_years:40.6% (male 1,454,831,900/female 
 1,426,721,773),55_64_years:8.4% (male 291,435,881/female 
 305,185,398),65_years_and_over:8.2% (male 257,035,416/female 321,753,746) 
 (2013 est.)}} |
 | {key:dependency_ratios,value:{total_dependency_ratio:52 
 %,youth_dependency_ratio:39.9 %,elderly_dependency_ratio:12.1 
 %,potential_support_ratio:8.3 (2013)}} |
 ++
 *Adding a WHERE clause with conditions on this column fails:*
  select flatten(kvgen(f.`people`)) as p from `factbook/world.json` f where 
  f.p.`key` = 'languages';
 Query failed: RemoteRpcException: Failure while running fragment., languages 
 [ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
 [ 686bcd40-c23b-448c-93d8-b98a3b092657 on abhi5.qa.lab:31010 ]
 Error: exception while executing query: Failure while executing query. 
 (state=,code=0)
 Logs indicate a NumberFormatException in the above case.
 *And the query fails to parse in the below case:*
  select flatten(kvgen(f.`people`)).`value` as p from `factbook/world.json` f 
  limit 5;
 Query failed: ParseException: Encountered . at line 1, column 34.
 Was expecting one of:
 FROM ...
 , ...
 AS ...
  
  
 OVER ...
 Error: exception while executing query: Failure while executing query. 
 (state=,code=0)
 Rewriting using an inner query succeeds:
 select g.p.`value`.`note_3` from (select flatten(kvgen(f.`people`)) as p from 
 `factbook/world.json` f) g where g.p.`key`='languages';
 ++
 |   EXPR$0   |
 ++
 | approximately 2,300 languages are spoken in Asia, 2,150, in Africa, 1,311 
 in the Pacific, 1,060 in the Americas, and 280 in Europe |
 ++
 *In both failure cases the error message needs to be updated to indicate 
 that the operation is not supported. The current error message and logs are 
 not clear to an end user.*



--
This message was sent by Atlassian JIRA

[jira] [Resolved] (DRILL-2264) Incorrect data when we use aggregate functions with flatten

2015-05-04 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2264.

Resolution: Later

Flatten has been disabled in the group by clause by DRILL-2181. We will look at 
how best to re-enable this past 1.0

 Incorrect data when we use aggregate functions with flatten
 ---

 Key: DRILL-2264
 URL: https://issues.apache.org/jira/browse/DRILL-2264
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.0.0


 git.commit.id.abbrev=6676f2d
 Data Set :
 {code}
 {
   uid:1,
   lst_lst : [[1,2],[3,4]]
 }
 {
   uid:2,
   lst_lst : [[1,2],[3,4]]
 }
 {code}
 The below query returns incorrect results:
 {code}
 select uid,MAX( flatten(lst_lst[1]) + flatten(lst_lst[0])) from `temp.json` 
 group by uid, flatten(lst_lst[1]), flatten(lst_lst[0]);
 +++
 |uid |   EXPR$1   |
 +++
 | 1  | 6  |
 | 1  | 6  |
 | 1  | 6  |
 | 1  | 6  |
 | 2  | 6  |
 | 2  | 6  |
 | 2  | 6  |
 | 2  | 6  |
 +++
 {code}
 However, if we use a subquery, Drill returns the right data:
 {code}
 select uid, MAX(l1+l2) from (select uid,flatten(lst_lst[1]) l1, 
 flatten(lst_lst[0]) l2 from `temp.json`) sub group by uid, l1, l2;
 +++
 |uid |   EXPR$1   |
 +++
 | 1  | 4  |
 | 1  | 5  |
 | 1  | 5  |
 | 1  | 6  |
 | 2  | 4  |
 | 2  | 5  |
 | 2  | 5  |
 | 2  | 6  |
 +++
 {code}
 Also, using a single flatten yields the proper results:
 {code}
 select uid,MAX(flatten(lst_lst[0])) from `temp.json` group by uid, 
 flatten(lst_lst[0]);
 +++
 |uid |   EXPR$1   |
 +++
 | 1  | 1  |
 | 1  | 2  |
 | 2  | 1  |
 | 2  | 2  |
 +++
 {code}
 Marked it as critical since we return incorrect data. Let me know if you 
 have any other questions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2938) Refactor

2015-05-01 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2938:
--

 Summary: Refactor 
 Key: DRILL-2938
 URL: https://issues.apache.org/jira/browse/DRILL-2938
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-1502) Can't connect to mongo when requiring auth

2015-04-30 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-1502.

Resolution: Fixed

Fixed in f5b0f4928d9c8c47c145a179c52ba3933d85c0b4

 Can't connect to mongo when requiring auth
 --

 Key: DRILL-1502
 URL: https://issues.apache.org/jira/browse/DRILL-1502
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - MongoDB
Affects Versions: 0.6.0
Reporter: Robert Malko
Assignee: Jason Altekruse
Priority: Minor
 Fix For: 1.0.0


 It doesn't appear that the latest 0.6.0 version allows you to connect to a 
 MongoDB database requiring auth. The usual MongoDB connection string of 
 mongodb://user:pass@host:port/ is not honored.
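 For reference, a minimal sketch of the mongo storage plugin configuration 
 involved, assuming the standard plugin fields; the host, port, and credentials 
 below are placeholders:
 {code}
 {
   "type": "mongo",
   "connection": "mongodb://user:pass@host:27017/",
   "enabled": true
 }
 {code}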



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2908) Support reading the Parquet int 96 type

2015-04-29 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2908:
--

 Summary: Support reading the Parquet int 96 type
 Key: DRILL-2908
 URL: https://issues.apache.org/jira/browse/DRILL-2908
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Parquet
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 1.0.0


While Drill does not currently have an int96 type, it is supported by the 
Parquet format and we should be able to read files that contain columns of this 
type. For now we will read the data into a varbinary, and users will have to use 
existing convert_from functions or write their own to interpret the data 
actually stored. One example is the Impala timestamp format, which is 
encoded in an int96 column.
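A minimal sketch of how such a column could be decoded once it is exposed as 
varbinary; the table path, column name, and the TIMESTAMP_IMPALA encoding name 
are assumptions for illustration rather than part of this issue:
{code}
-- Read the int96 column as varbinary, then decode it explicitly.
SELECT convert_from(t.`create_ts`, 'TIMESTAMP_IMPALA') AS create_ts
FROM dfs.`/data/impala_output.parquet` t
LIMIT 10;
{code}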



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2913) Directory explorer UDFs causing warnings from failed janino compilation

2015-04-29 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2913:
--

 Summary: Directory explorer UDFs causing warnings from failed 
janino compilation
 Key: DRILL-2913
 URL: https://issues.apache.org/jira/browse/DRILL-2913
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Reporter: Jason Altekruse
Assignee: Jason Altekruse


The functions added in DRILL-2173 never need to be compiled using Janino 
because they are never used during the regular Java code-generation-based 
evaluation; they are only useful if they can be folded at planning time to 
allow pruning partitions dynamically. As they are registered in Drill they 
currently cause warnings, because they use generics, which Janino does not 
support. They need to be modified to remove the generics or forced to use the 
JDK compiler upon registration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2754) Allocation bug in splitAndTransfer method causing some flatten queries to fail

2015-04-21 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2754.

   Resolution: Fixed
Fix Version/s: (was: 0.9.0)
   1.0.0

 Allocation bug in splitAndTransfer method causing some flatten queries to fail
 --

 Key: DRILL-2754
 URL: https://issues.apache.org/jira/browse/DRILL-2754
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types, Execution - Relational Operators
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 1.0.0

 Attachments: DRILL-2754.patch


 Data for reproduction:
 {code}
 {
  config: [
  [],
  [ a string ]
  ]
 }
 {code}
 {code}
 select flatten(config) as flat from cp.`/store/json/null_list.json`
 {code}
 This was carved out of a larger use case that was failing, so in the course 
 of coming up with a minimal reproduction I fixed the bug. I will be posting a 
 patch shortly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2842) Parquet files with large file metadata sometimes fail to read in the FooterGather

2015-04-21 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2842:
--

 Summary: Parquet files with large file metadata sometimes fail to 
read in the FooterGather
 Key: DRILL-2842
 URL: https://issues.apache.org/jira/browse/DRILL-2842
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2616) strings loaded incorrectly from parquet files

2015-04-16 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2616.

Resolution: Duplicate

 strings loaded incorrectly from parquet files
 -

 Key: DRILL-2616
 URL: https://issues.apache.org/jira/browse/DRILL-2616
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 0.7.0
Reporter: Jack Crawford
Assignee: Jason Altekruse
Priority: Critical
  Labels: parquet

 When loading string columns from parquet data sources, some rows have their 
 string values replaced with the value from other rows.
 Example parquet for which the problem occurs:
 https://drive.google.com/file/d/0B2JGBdceNMxdeFlJcW1FUElOdXc/view?usp=sharing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2162) Multiple flattens on a list within a list results in violating the incoming batch size limit

2015-04-15 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2162.

Resolution: Fixed

 Multiple flattens on a list within a list results in violating the incoming 
 batch size limit
 

 Key: DRILL-2162
 URL: https://issues.apache.org/jira/browse/DRILL-2162
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
 Fix For: 0.9.0

 Attachments: data.json, drill-2162.patch


 git.commit.id.abbrev=3e33880
 I attached the data set with 2 records.
 The below query succeeds on top of the attached data set. However, when I 
 copied over the same data set 5 times, the same query failed:
 {code}
 select uid, flatten(d.lst_lst[1]) lst1, flatten(d.lst_lst[0]) lst0, 
 flatten(d.lst_lst) lst from `data.json` d;
 Query failed: RemoteRpcException: Failure while running fragment., Incoming 
 batch of org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch has 
 size 102375, which is beyond the limit of 65536 [ 
 ef16dd95-40e2-4b66-ba30-8650ddb99812 on qa-node190.qa.lab:31010 ]
 [ ef16dd95-40e2-4b66-ba30-8650ddb99812 on qa-node190.qa.lab:31010 ]
 {code}
 Error from the logs:
 {code}
 java.lang.IllegalStateException: Incoming batch of 
 org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch has size 
 102375, which is beyond the limit of 65536
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:129)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.flatten.FlattenRecordBatch.innerNext(FlattenRecordBatch.java:122)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
  

[jira] [Created] (DRILL-2712) Issue with PruneScanRule and filter expressions that evaluate to false

2015-04-07 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2712:
--

 Summary: Issue with PruneScanRule and filter expressions that 
evaluate to false
 Key: DRILL-2712
 URL: https://issues.apache.org/jira/browse/DRILL-2712
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning  Optimization
Reporter: Jason Altekruse
Assignee: Aman Sinha
Priority: Critical


While testing the recently committed changes to allow partition querying in 
UDFs (DRILL-2173), I ran into a query that was not able to plan. Oddly, it was 
not throwing a typical error we would see out of Calcite; it seemed to be just 
spinning indefinitely.

I was able to create a simple reproduction that removed the new UDF use; it 
seems to be related to using an always-false filter along with a directory 
filter. I fixed one issue while I was creating the repro (I found the error 
message in the logs; with the patch attached here that issue goes away, but I 
see a different exception after letting it run for a long time). These issues 
might be completely unrelated; I just do not currently have a separate 
reproduction for the issue I fixed that can actually complete successfully, for 
the sake of writing a test for it.

Disabling constant folding is not required; the hang happened both with it and 
without it, so disabling it simplified debugging for the time being. The fix for 
this issue should probably be tested both with constant folding turned off and 
with its default behavior of being turned on.

The test is:
{code}
  @Test
  public void testFailingPrune() throws Exception {
    test("alter session set `planner.enable_constant_folding` = false");
    test("explain plan for select * from (" + String.format("select dir0, dir1, "
        + "o_custkey, o_orderdate from dfs_test.`%s/multilevel/parquet` where dir0=1994 "
        + "and dir1='Q1'", TEST_RES_PATH)
        + ") t where 1=0;");
  }
{code}

The issue I saw in the logs after it ran for a while:

{code}
org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
during fragment initialization: Internal error: Error while applying rule 
DrillMergeFilterRule, args 
[rel#1669:FilterRel.NONE.ANY([]).[](child=rel#23:Subset#2.NONE.ANY([]).[],condition=AND(=($0,
 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 1994), =($1, 'Q1'), =($0, 
1994), =($1, 

[jira] [Resolved] (DRILL-2226) Create test utilities for checking plans for patterns

2015-04-06 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2226?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2226.

   Resolution: Fixed
Fix Version/s: (was: 1.0.0)
   0.8.0

Fixed in ed397862eb9584572aa0fcb684dfc9554b00cf60

 Create test utilities for checking plans for patterns
 -

 Key: DRILL-2226
 URL: https://issues.apache.org/jira/browse/DRILL-2226
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build  Test
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 0.8.0

 Attachments: DRILL-2226.patch


 Regex matching for Calcite text-format plans; includes expected and excluded 
 pattern matching.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2143) Remove RecordBatch from setup method of DrillFunc interface

2015-03-17 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2143.

Resolution: Fixed

Resolved in bff7b9ef5a9f345908aca160a97b98f6ab187708 and 
1c5decc17cf38cbf4a4119d7ca19653cb19e1b53

 Remove RecordBatch from setup method of DrillFunc interface
 ---

 Key: DRILL-2143
 URL: https://issues.apache.org/jira/browse/DRILL-2143
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Reporter: Jason Altekruse
Assignee: Jason Altekruse
 Fix For: 0.8.0

 Attachments: DRILL-2143-part1-feb-27.patch, 
 DRILL-2143-part1-feb-6.patch, DRILL-2143-part1-mar-3.patch, 
 DRILL-2143-part2-15-mar-15.patch, DRILL-2143-part2-feb-27.patch, 
 DRILL-2143-part2-feb-6.patch, DRILL-2143-part2-mar-3.patch, 
 DRILL-2143-remove-record-batch-from-udfs.patch


 Drill UDFs are currently exposed to too much system state by receiving a 
 reference to a RecordBatch in their setup method. This is not necessary, as 
 all of the schema-change-triggered operator functionality is handled outside 
 of UDFs (the UDFs themselves are actually required to define a specific type 
 they take as input, except in the case of complex types (maps and lists)). 
 The only remaining artifact left from this interface is the date/time 
 functions that ask for the query start time or current timezone. This can be 
 provided to functions using a new injectable type, just as DrillBufs are 
 provided to functions currently. For more info read here: 
 http://mail-archives.apache.org/mod_mbox/drill-dev/201501.mbox/%3ccampyv7ac_-9u4irz+5fxoenzbojctovjronn0qri4bqzf53...@mail.gmail.com%3E
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2406) Fix expression interpreter to allow executing expressions at planning time

2015-03-17 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2406.

Resolution: Fixed

Resolved in 0aa8b19d624d173da51de36aa164f3435d3366a4 and 
3f93454f014196a4da198ce012b605b70081fde0

 Fix expression interpreter to allow executing expressions at planning time
 --

 Key: DRILL-2406
 URL: https://issues.apache.org/jira/browse/DRILL-2406
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 0.8.0

 Attachments: DRILL-2406-part1-15-mar-15.patch, 
 DRILL-2406-part1-planning-time-expression-evaulutation.patch, 
 DRILL-2406-part2-15-mar-15.patch, 
 DRILL-2406-part2-planning-time-expression-evaulutation.diff, 
 DRILL-2406-part2-v2-planning-time-expression-evaulutation.patch, 
 DRILL-2406-part2-v3-planning-time-expression-evaulutation.diff


 The expression interpreter currently available in Drill cannot be used at 
 planning time, as it does not have a means to connect to the direct memory 
 allocator stored at the DrillbitContext level. To implement new rules based 
 on evaluating expressions on constants or small datasets, such as partition 
 information, this limitation must be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2448) Remove outdated code to ignore type resolution with varchar vs varbinary now that implicit casting subsumes it

2015-03-12 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2448:
--

 Summary: Remove outdated code to ignore type resolution with 
varchar vs varbinary now that implicit casting subsumes it
 Key: DRILL-2448
 URL: https://issues.apache.org/jira/browse/DRILL-2448
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types
Affects Versions: 0.7.0
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 0.8.0


Function resolution included a small condition to allow varchar and varbinary 
functions to be resolved for either incoming type. While it is valid to 
implicitly cast between these two, this early workaround creates a technically 
invalid expression tree that happens to work with the current code generation 
system. This, however, does create an issue for the interpreted expression 
evaluator. Removing the code simply causes an implicit cast to be added during 
materialization; this works for both generated-code expression evaluation as 
well as for the interpreter.
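A minimal sketch of the kind of expression affected, assuming a hypothetical 
table with a varbinary column named bin_col (the path and names are 
illustrative only); the varchar literal on the right-hand side is now matched 
through an implicit cast added at materialization instead of the old 
resolution workaround:
{code}
SELECT count(*)
FROM dfs.`/data/example.parquet` t
WHERE t.`bin_col` = 'some_value';
{code}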



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2406) Fix expression interpreter to allow executing expressions at planning time

2015-03-09 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2406:
--

 Summary: Fix expression interpreter to allow executing expressions 
at planning time
 Key: DRILL-2406
 URL: https://issues.apache.org/jira/browse/DRILL-2406
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical
 Fix For: 0.8.0


The expression interpreter currently available in Drill cannot be used at 
planning time, as it does not have a means to connect to the direct memory 
allocator stored at the DrillbitContext level. To implement new rules based on 
evaluating expressions on constants or small datasets, such as partition 
information, this limitation must be addressed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2391) NPE during cleanup in parquet record writer when query fails during execution on CTAS

2015-03-05 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2391.

Resolution: Duplicate

 NPE during cleanup in parquet record writer when query fails during execution 
 on CTAS
 -

 Key: DRILL-2391
 URL: https://issues.apache.org/jira/browse/DRILL-2391
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 0.8.0
Reporter: Victoria Markman
Assignee: Steven Phillips
 Attachments: query.sql, t5.csv


 The query below fails during execution due to a user error:
 {code}
 0: jdbc:drill:schema=dfs select
 . . . . . . . . . . . .  case when columns[0] = '' then cast(null as 
 varchar(255)) else cast(columns[0] as varchar(255)) end,
 . . . . . . . . . . . .  case when columns[1] = '' then cast(null as 
 integer) else cast(columns[1] as integer) end,
 . . . . . . . . . . . .  case when columns[2] = '' then cast(null as 
 bigint) else cast(columns[2] as bigint) end,
 . . . . . . . . . . . .  case when columns[3] = '' then cast(null as 
 float) else cast(columns[3] as float) end,
 . . . . . . . . . . . .  case when columns[4] = '' then cast(null as 
 double) else cast(columns[4] as double) end,
 . . . . . . . . . . . .  case when columns[5] = '' then cast(null as 
 date) else cast(columns[6] as date) end,
 . . . . . . . . . . . .  case when columns[6] = '' then cast(null as 
 time) else cast(columns[7] as time) end,
 . . . . . . . . . . . .  case when columns[7] = '' then cast(null as 
 timestamp) else cast(columns[8] as timestamp) end,
 . . . . . . . . . . . .  case when columns[8] = '' then cast(null as 
 boolean) else cast(columns[9] as boolean) end,
 . . . . . . . . . . . .  case when columns[9] = '' then cast(null as 
 decimal(8,2)) else cast(columns[9] as decimal(8,2)) end,
 . . . . . . . . . . . .  case when columns[10] = '' then cast(null 
 as decimal(18,4)) else cast(columns[10] as decimal(18,4)) end,
 . . . . . . . . . . . .  case when columns[11] = '' then cast(null 
 as decimal(28,4)) else cast(columns[11] as decimal(28,4)) end,
 . . . . . . . . . . . .  case when columns[12] = '' then cast(null 
 as decimal(38,6)) else cast(columns[12] as decimal(38,6)) end
 . . . . . . . . . . . .  from `t5.csv`;
 Query failed: RemoteRpcException: Failure while running fragment., Value 0 
 for monthOfYear must be in the range [1,12] [ 
 5a56453c-304d-430a-b4b2-fbc48c9c2766 on atsqa4-133.qa.lab:31010 ]
 [ 5a56453c-304d-430a-b4b2-fbc48c9c2766 on atsqa4-133.qa.lab:31010 ]
 Error: exception while executing query: Failure while executing query. 
 (state=,code=0)
 {code}
 If I run the same query in a CTAS, I get an NPE during cleanup in the Parquet writer.
 {code}
 2015-03-05 22:31:05,212 [2b0726d7-4127-2a83-8c83-2376b767d800:frag:0:0] ERROR 
 o.a.d.e.w.f.AbstractStatusReporter - Error 
 50633e23-7e6f-48b8-82ec-a395c5c596e4: Failure while running fragment.
 java.lang.NullPointerException: null
 at 
 org.apache.drill.exec.store.parquet.ParquetRecordWriter.cleanup(ParquetRecordWriter.java:318)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.WriterRecordBatch.cleanup(WriterRecordBatch.java:187)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.cleanup(IteratorValidatorBatchIterator.java:148)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.record.AbstractSingleRecordBatch.cleanup(AbstractSingleRecordBatch.java:121)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.cleanup(IteratorValidatorBatchIterator.java:148)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.internalStop(ScreenCreator.java:178)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:101)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) 
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:121)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 at 
 org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303)
  

[jira] [Created] (DRILL-2395) Improve interpreted expression evaluation to only call the setup method once per batch

2015-03-05 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2395:
--

 Summary: Improve interpreted expression evaluation to only call 
the setup method once per batch
 Key: DRILL-2395
 URL: https://issues.apache.org/jira/browse/DRILL-2395
 Project: Apache Drill
  Issue Type: Improvement
  Components: Functions - Drill
Reporter: Jason Altekruse
Assignee: Daniel Barclay (Drill)
Priority: Minor


This enhancement request came out of the review for a patch for DRILL-2060. 
Copied below is the discussion from there:

Jinfeng:
Do you have a plan to move setup() call into places such that setup() will be 
called once for each VectorAccessible input?

In the code compile + evaluation model, doSetup() will be called per batch, 
instead of per row.

Jason:
I have started working on a fix for this. It's a little complicated with setting 
constant inputs before the setup method is called. I'm trying to figure out the 
best way to share code with the rest of the input passing used in the 
EvalVisitor. Would you be okay with this being opened as an enhancement request 
to be merged a little later? Considering the current use of the interpreter, 
this won't have an impact on any actual queries today.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2360) Add appropriate constant flag on UDF inputs that are used in the setup method

2015-03-02 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2360:
--

 Summary: Add appropriate constant flag on UDF inputs that are used 
in the setup method
 Key: DRILL-2360
 URL: https://issues.apache.org/jira/browse/DRILL-2360
 Project: Apache Drill
  Issue Type: Bug
  Components: Functions - Drill
Reporter: Jason Altekruse
Assignee: Daniel Barclay (Drill)


A number of UDFs do not include the appropriate flag to declare their inputs as 
constants when they should be. Any input used in the setup method should include 
a constant flag in the @Param annotation for it (this allows us to detect and 
throw a helpful error message if they are used incorrectly). A correct example 
can be found in StringFunctions.RegexpReplace; an incorrect example can be 
found in GFloat8ToChar (available after running the freemarker generation that 
is a part of the default mvn install build).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-2031) IndexOutOfBoundException when reading a wide parquet table with boolean columns

2015-02-27 Thread Jason Altekruse (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Altekruse resolved DRILL-2031.

Resolution: Fixed

Fixed in 1d2ed349699a326165c721257937905e3043418c

 IndexOutOfBoundException when reading a wide parquet table with boolean 
 columns
 ---

 Key: DRILL-2031
 URL: https://issues.apache.org/jira/browse/DRILL-2031
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 0.7.0
Reporter: Aman Sinha
Assignee: Jason Altekruse
 Fix For: 0.8.0

 Attachments: DRILL-2031-Parquet-bit-reader-fix-v2.patch, 
 DRILL-2031-Parquet-bit-reader-fix.patch, wide1.sql


 I created a wide table with 128 Lineitem columns plus 6 additional boolean 
 columns for a total of 134 columns via a CTAS script (see attached SQL).  The 
 source data is from TPCH scale factor 1 (smaller scale factor may not 
 reproduce the problem). The creation of the table was Ok.  Reading from the 
 table gives an IOBE.  See stack below.  It seems to occur for the boolean 
 columns.  
 {code}
 0: jdbc:drill:zk=local select * from wide1 where 1=0;
 java.lang.IndexOutOfBoundsException: srcIndex: 97792
   
 io.netty.buffer.PooledUnsafeDirectByteBuf.setBytes(PooledUnsafeDirectByteBuf.java:255)
  ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
   io.netty.buffer.WrappedByteBuf.setBytes(WrappedByteBuf.java:378) 
 ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
   
 io.netty.buffer.UnsafeDirectLittleEndian.setBytes(UnsafeDirectLittleEndian.java:25)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
   io.netty.buffer.DrillBuf.setBytes(DrillBuf.java:645) 
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:4.0.24.Final]
   io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:850) 
 ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
   
 org.apache.drill.exec.store.parquet.columnreaders.BitReader.readField(BitReader.java:54)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   
 org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.readValues(ColumnReader.java:120)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   
 org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.processPageData(ColumnReader.java:169)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   
 org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.determineSize(ColumnReader.java:146)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   
 org.apache.drill.exec.store.parquet.columnreaders.ColumnReader.processPages(ColumnReader.java:107)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   
 org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.readAllFixedFields(ParquetRecordReader.java:367)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   
 org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:413)
  ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
   org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:158) 
 ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2226) Create test utilities for checking plans for patterns

2015-02-12 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2226:
--

 Summary: Create test utilities for checking plans for patterns
 Key: DRILL-2226
 URL: https://issues.apache.org/jira/browse/DRILL-2226
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2218) Constant folding rule not being used in plan where the constant expression is in the select list

2015-02-11 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2218:
--

 Summary: Constant folding rule not being used in plan where the 
constant expression is in the select list
 Key: DRILL-2218
 URL: https://issues.apache.org/jira/browse/DRILL-2218
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Jason Altekruse
Priority: Minor


This test method and rule are not currently in the master branch, but they do 
appear in the patch posted for constant expression folding during planning, 
DRILL-2060. Once it is merged, the test 
TestConstantFolding.testConstExprFolding_InSelect(), which is currently ignored, 
will be failing. The issue is that even though the constant folding rule for 
project is firing, and I have traced it to see that a replacement project with 
a literal is created, it is not being selected in the final plan. This seems 
rather odd, as there is a comment in the last line of the onMatch() method of 
the rule that says the following. This does not appear to be having the desired 
effect; we may need to file a bug in Calcite.

{code}
// New plan is absolutely better than old plan.
call.getPlanner().setImportance(project, 0.0);
{code}

Here is the query from the test; I expect the sum to be folded during planning 
with the newly enabled project constant folding rule.

{code}
select columns[0], 3+5 from cp.`test_input.csv`
{code}





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2219) Concurrent modification exception in TopLevelAllocator if a child allocator is added during loop in close()

2015-02-11 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2219:
--

 Summary: Concurrent modification exception in TopLevelAllocator if 
a child allocator is added during loop in close()
 Key: DRILL-2219
 URL: https://issues.apache.org/jira/browse/DRILL-2219
 Project: Apache Drill
  Issue Type: Bug
Reporter: Jason Altekruse
Assignee: Jason Altekruse
Priority: Critical






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-2173) Enable querying partition information without reading all data

2015-02-05 Thread Jason Altekruse (JIRA)
Jason Altekruse created DRILL-2173:
--

 Summary: Enable querying partition information without reading all 
data
 Key: DRILL-2173
 URL: https://issues.apache.org/jira/browse/DRILL-2173
 Project: Apache Drill
  Issue Type: New Feature
  Components: Query Planning  Optimization
Affects Versions: 0.7.0
Reporter: Jason Altekruse
Assignee: Jason Altekruse


When reading a series of files in nested directories, Drill currently adds 
columns representing the directory structure that was traversed to reach the 
file currently being read. These columns are stored as varchar under the names 
dir0, dir1, ... As these are just regular columns, Drill allows arbitrary 
queries against this data, in terms of aggregates, filters, sorts, etc. To allow 
optimizing reads, basic partition pruning has already been added to prune in 
the case of an expression like dir0 = 2015 or a simple IN list, which is 
converted during planning to a series of ORs of equals expressions. If users 
want to query the directory information dynamically, and not include specific 
directory names in the query, this will prompt a full table scan and filter 
operation on the dir columns. This enhancement is to allow more complex queries 
to be run against directory metadata while only scanning the matching 
directories.
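A minimal sketch contrasting the two cases, assuming a hypothetical nested 
directory layout under dfs.`/logs` (the path is illustrative only); the first 
query is handled by the existing static pruning, while the second queries the 
directory information dynamically and today forces a full scan, which is the 
case this enhancement targets:
{code}
-- Static filter: existing partition pruning already handles this form.
SELECT count(*) FROM dfs.`/logs` WHERE dir0 = 2015;

-- Dynamic filter on directory metadata: currently scans all of the data.
SELECT count(*) FROM dfs.`/logs` t
WHERE t.dir0 = (SELECT max(dir0) FROM dfs.`/logs`);
{code}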



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

