[jira] [Resolved] (DRILL-3955) Possible bug in creation of Drill columns for HBase column families

2015-12-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-3955.
---
Resolution: Fixed

Resolved as part of DRILL-2288 patch.

> Possible bug in creation of Drill columns for HBase column families
> ---
>
> Key: DRILL-3955
> URL: https://issues.apache.org/jira/browse/DRILL-3955
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> If all of the rows read by a given {{HBaseRecordReader}} have no HBase 
> columns in a given HBase column family, {{HBaseRecordReader}} doesn't create 
> a Drill column for that HBase column family.
> Later, in a {{ProjectRecordBatch}}'s {{setupNewSchema}}, because no Drill 
> column exists for that HBase column family, that {{setupNewSchema}} creates a 
> dummy Drill column using the usual {{NullableIntVector}} type.  In 
> particular, it is not a map vector as {{HBaseRecordReader}} creates when it 
> sees an HBase column family.
> Should {{HBaseRecordReader}} and/or something around setting up for reading 
> HBase (including setting up that {{ProjectRecordBatch}}) make sure that all 
> HBase column families are represented with map vectors so that 
> {{setupNewSchema}} doesn't create a dummy field of type {{NullableIntVector}}?
> The problem is that, currently, when an HBase table is read in two separate 
> fragments, one fragment (seeing rows with columns in the column family) can 
> get a map vector for the column family while the other (seeing only rows with 
> no columns in the column family) can get the {{NullableIntVector}}.  
> Downstream code that receives the two batches ends up with an unresolved 
> conflict, yielding IndexOutOfBoundsExceptions as in DRILL-3954.
> It's not clear whether there is only one bug--that downstream code doesn't 
> resolve {{NullableIntVector}} dummy fields right (DRILL-TBD)--or two--that 
> the HBase reading code should set up a Drill column for every HBase column 
> family (regardless of whether it has any columns in the rows that were read) 
> and that downstream code doesn't resolve {{NullableIntVector}} dummy fields 
> (resolution is applicable to sources other than just HBase).
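[Editorial note] The fix direction the description asks about can be sketched in plain Java. This is illustrative only: {{ColumnType}}, {{FragmentSchema}}, and {{ReaderSetupSketch}} are hypothetical stand-ins, not Drill classes. The idea is that the reader pre-creates a map-typed column for every requested HBase column family before reading any row, so a fragment that sees no cells in a family still reports a map, never a {{NullableIntVector}} dummy injected downstream.

```java
import java.util.*;

// Stand-in for Drill's vector types (hypothetical, for illustration).
enum ColumnType { MAP, NULLABLE_INT }

class FragmentSchema {
    final Map<String, ColumnType> columns = new LinkedHashMap<>();
}

public class ReaderSetupSketch {
    // Eagerly register every requested column family as a MAP column,
    // independent of whether any scanned row has cells in that family.
    static FragmentSchema setup(List<String> requestedFamilies) {
        FragmentSchema s = new FragmentSchema();
        for (String family : requestedFamilies) {
            s.columns.put(family, ColumnType.MAP);
        }
        return s;
    }

    public static void main(String[] args) {
        FragmentSchema s = setup(Arrays.asList("f0", "f1"));
        // Even for a fragment that reads zero matching rows, both
        // families are already typed as maps:
        System.out.println(s.columns); // {f0=MAP, f1=MAP}
    }
}
```

With this setup, two fragments scanning the same table agree on the column family's type, so downstream batch merging sees no conflict.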



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (DRILL-3659) UnionAllRecordBatch infers wrongly from next() IterOutcome values

2015-12-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-3659.
---
Resolution: Fixed
Fix Version/s: (was: 1.5.0)

Resolved as part of DRILL-2288 patch.

> UnionAllRecordBatch infers wrongly from next() IterOutcome values
> -
>
> Key: DRILL-3659
> URL: https://issues.apache.org/jira/browse/DRILL-3659
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> When UnionAllRecordBatch uses IterOutcome values returned from the next() 
> method of upstream batches, it seems to be using those values wrongly (making 
> incorrect inferences about what they mean).
> In particular, some switch statements seem to check for NONE vs. 
> OK_NEW_SCHEMA in order to determine whether there are any rows (instead of 
> explicitly checking the number of rows).  However, OK_NEW_SCHEMA can be 
> returned even when there are zero rows.
> The apparent latent bug in the union code blocks the fix for DRILL-2288 
> (having ScanBatch return OK_NEW_SCHEMA for a zero-row case in which it was 
> wrongly (per the IterOutcome protocol) returning NONE without first returning 
> OK_NEW_SCHEMA).
>  
> For details of IterOutcome values, see the Javadoc documentation of 
> RecordBatch.IterOutcome (after DRILL-3641 is merged; until then, see 
> https://github.com/apache/drill/pull/113).
> For an environment/code state that exposes the UnionAllRecordBatch problems, 
> see https://github.com/dsbos/incubator-drill/tree/bugs/WORK_2288_etc, which 
> includes:
> - a test that exposes the DRILL-2288 problem;
> - an enhanced IteratorValidatorBatchIterator, which now detects IterOutcome 
> value sequence violations; and
> - a fixed (though not-yet-cleaned) version of ScanBatch that fixes the 
> DRILL-2288 problem and thereby exposes the UnionAllRecordBatch problem 
> (several test methods in each of TestUnionAll and TestUnionDistinct fail).
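[Editorial note] The faulty inference described above can be shown with a minimal Java sketch. The {{IterOutcome}} enum below is abbreviated and {{UpstreamBatch}} is a hypothetical stand-in, not Drill's {{RecordBatch}} API; the point is only that {{OK_NEW_SCHEMA}} signals a schema (re)definition, and the batch carrying it may legally hold zero rows.

```java
// Abbreviated stand-in for Drill's RecordBatch.IterOutcome.
enum IterOutcome { NONE, OK, OK_NEW_SCHEMA }

// Hypothetical minimal model of an upstream batch.
class UpstreamBatch {
    final IterOutcome outcome;
    final int recordCount;
    UpstreamBatch(IterOutcome o, int n) { outcome = o; recordCount = n; }
}

public class IterOutcomeCheck {
    // Buggy pattern: treats OK_NEW_SCHEMA as "input has rows".
    static boolean hasRowsBuggy(UpstreamBatch b) {
        return b.outcome == IterOutcome.OK_NEW_SCHEMA;
    }

    // Correct pattern: check the record count explicitly.
    static boolean hasRowsCorrect(UpstreamBatch b) {
        return b.recordCount > 0;
    }

    public static void main(String[] args) {
        // A legal zero-row batch that still carries a new schema:
        UpstreamBatch empty = new UpstreamBatch(IterOutcome.OK_NEW_SCHEMA, 0);
        System.out.println(hasRowsBuggy(empty));   // true  -> wrong inference
        System.out.println(hasRowsCorrect(empty)); // false -> correct
    }
}
```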





[jira] [Resolved] (DRILL-4010) In HBase reader, create child vectors for referenced HBase columns to avoid spurious schema changes

2015-12-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-4010.
---
Resolution: Fixed

Resolved as part of DRILL-2288 patch.

> In HBase reader, create child vectors for referenced HBase columns to avoid 
> spurious schema changes
> ---
>
> Key: DRILL-4010
> URL: https://issues.apache.org/jira/browse/DRILL-4010
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - HBase
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> {{HBaseRecordReader}} needs to create child vectors for all 
> referenced/requested columns.
> Currently, if a fragment reads only HBase rows that don't have a particular 
> referenced column (within a given column family), downstream code adds a 
> dummy column of type {{NullableIntVector}} (as a child in the {{MapVector}} 
> for the containing HBase column family).
> If any other fragment reads an HBase row that _does_ contain the referenced 
> column, that fragment's reader will create a child 
> {{NullableVarBinaryVector}} for the referenced column.
> When the data from those two fragments comes together, Drill detects a schema 
> change, even though logically there isn't really any schema change.
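[Editorial note] The spurious detection can be reduced to a schema comparison, sketched below with plain Java maps standing in for batch schemas (the type names are strings for illustration; Drill's actual schema objects differ). Fragment A materialized the referenced column as {{NullableVarBinaryVector}}; fragment B got the injected {{NullableIntVector}} dummy; comparing the two flags a change although both read the same table.

```java
import java.util.*;

public class SchemaCompareSketch {
    public static void main(String[] args) {
        // Fragment A saw the referenced column "f.c" in at least one row:
        Map<String, String> fragA = new HashMap<>();
        fragA.put("f.c", "NullableVarBinary");

        // Fragment B saw no such cell, so downstream code injected a dummy:
        Map<String, String> fragB = new HashMap<>();
        fragB.put("f.c", "NullableInt");

        // A schema comparison sees differing child types and reports a
        // schema change, although logically no schema change occurred.
        System.out.println(!fragA.equals(fragB)); // true -> spurious change
    }
}
```

Pre-creating a {{NullableVarBinaryVector}} child for every referenced column, as the summary proposes, would make both maps equal and suppress the false positive.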





[jira] [Resolved] (DRILL-3641) Document RecordBatch.IterOutcome (enumerators and possible sequences)

2015-12-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-3641.
---
Resolution: Fixed

Resolved as part of DRILL-2288 patch.

> Document RecordBatch.IterOutcome (enumerators and possible sequences)
> -
>
> Key: DRILL-3641
> URL: https://issues.apache.org/jira/browse/DRILL-3641
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>






[jira] [Assigned] (DRILL-3641) Document RecordBatch.IterOutcome (enumerators and possible sequences)

2015-12-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-3641:
-

Assignee: Daniel Barclay (Drill)  (was: Steven Phillips)

> Document RecordBatch.IterOutcome (enumerators and possible sequences)
> -
>
> Key: DRILL-3641
> URL: https://issues.apache.org/jira/browse/DRILL-3641
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>






[jira] [Created] (DRILL-4045) FLATTEN case in testNestedFlatten yields no rows (test didn't detect)

2015-11-05 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-4045:
-

 Summary: FLATTEN case in testNestedFlatten yields no rows (test 
didn't detect)
 Key: DRILL-4045
 URL: https://issues.apache.org/jira/browse/DRILL-4045
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Reporter: Daniel Barclay (Drill)


The case of using {{FLATTEN}} on nested lists appearing in 
{{TestComplexTypeReader.testNestedFlatten()}} yields no rows.

Part of the problem is that in the code generated by {{FlattenRecordBatch}}, 
the methods are empty.

(That test method doesn't check the results, so, prior to the DRILL-2288 work, 
the problem was not detected. 

However, with DRILL-2288 fixes, the flatten problem causes an 
{{IllegalArgumentException}} (logically, an assertion exception) in 
RecordBatchLoader, so the test is being disabled (with @Ignore) as part of 
DRILL-2288.)







[jira] [Assigned] (DRILL-4010) In HBase reader, create child vectors for referenced HBase columns to avoid spurious schema changes

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-4010:
-

Assignee: Daniel Barclay (Drill)

> In HBase reader, create child vectors for referenced HBase columns to avoid 
> spurious schema changes
> ---
>
> Key: DRILL-4010
> URL: https://issues.apache.org/jira/browse/DRILL-4010
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types, Storage - HBase
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> {{HBaseRecordReader}} needs to create child vectors for all 
> referenced/requested columns.
> Currently, if a fragment reads only HBase rows that don't have a particular 
> referenced column (within a given column family), downstream code adds a 
> dummy column of type {{NullableIntVector}} (as a child in the {{MapVector}} 
> for the containing HBase column family).
> If any other fragment reads an HBase row that _does_ contain the referenced 
> column, that fragment's reader will create a child 
> {{NullableVarBinaryVector}} for the referenced column.
> When the data from those two fragments comes together, Drill detects a schema 
> change, even though logically there isn't really any schema change.





[jira] [Comment Edited] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983410#comment-14983410
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2288 at 11/4/15 5:42 AM:


Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get triggered to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know correctly what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome}} values (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
("{{||}}") correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[Note:  DRILL-2288 does not address other problems with {{NullableIntVector}} 
dummy columns from {{JsonRecordReader}}.]

7.  HBase tests used only one table region, ignoring known problems with 
multi-region HBase tables (so latent {{HBaseRecordReader}} problems were left 
undetected and unresolved.)   \[Note: DRILL-2288 addresses only one test table 
(increasing the number of regions on the other test tables exposed at least one 
other problem; others remain).]

8.  {{HBaseRecordReader}} didn't create a {{MapVector}} for every HBase column 
family and every requested HBase column (so {{NullableIntVector}} dummy columns 
got created, causing spurious schema changes and downstream exceptions).

9.  Some {{RecordBatch}} classes didn't reset their record counts to zero 
({{OrderedPartitionRecordBatch.recordCount}}, 
{{ProjectRecordBatch.recordCount}}, and/or {{TopNBatch.recordCount}}), so 
downstream code tried to access elements of (correctly) empty vectors, yielding 
{{IndexOutOfBoundsException}} (with ~"... {{range (0, 0)}}").

10.  {{RecordBatchLoader}}'s record count was not reset to zero by 
{{UnorderedReceiverBatch}} (so, again, downstream code tried to access elements 
of (correctly) empty vectors, yielding {{IndexOutOfBoundsException}} with ~"... 
{{range (0, 0)}}").

11.  {{MapVector.load(...)}} left some existing vectors empty, not matching the 
returned length and the length of sibling vectors (so 
{{MapVector.getObject(int)}} got {{IndexOutOfBoundsException}} with ~"... 
{{range (0, 0)}}").  \[Note: DRILL-2288 does not address the root problem.]

12. {{BaseTestQuery.printResult(...)}} skipped deallocation calls in the case 
of a zero-record record batch (so when it read a zero-row record batch, it 
caused a memory leak reported at Drillbit shutdown time).

13. {{TestHBaseProjectPushDown.testRowKeyAndColumnPushDown()}} used delimited 
identifiers of a form (with a period) that Drill can't handle (so the test 
failed when the test ran with multiple fragments).
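[Editorial note] Item 5's short-circuit-OR hazard can be sketched in a few lines of illustrative Java. The {{SchemaChangeCallBack}} below is a stand-in, not the actual Drill class; what matters is that a method which both reports and resets state must not sit on the right side of "{{||}}", since it is skipped whenever the left operand is already true.

```java
// Hypothetical stand-in: getSchemaChange() reports AND clears the state.
class SchemaChangeCallBack {
    private boolean changed = true;  // pretend a nested schema change was recorded
    boolean getSchemaChange() {
        boolean result = changed;
        changed = false;
        return result;
    }
    boolean pending() { return changed; }
}

public class ShortCircuitOrDemo {
    public static void main(String[] args) {
        SchemaChangeCallBack cb = new SchemaChangeCallBack();
        boolean localChange = true;

        // Buggy: when localChange is true, the right operand is skipped, so
        // the callback's state is never reset; later calls see a stale
        // "schema changed" signal (spurious OK_NEW_SCHEMA downstream).
        boolean newSchema = localChange || cb.getSchemaChange();
        System.out.println(cb.pending());   // true: state not consumed

        // Fix: evaluate the side-effecting call unconditionally, then combine.
        boolean callbackChange = cb.getSchemaChange();
        newSchema = localChange || callbackChange;
        System.out.println(cb.pending());   // false: state consumed
    }
}
```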






[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Description: 
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This causes 
{{MapVector.getObject(int)}} to fail, saying 
"{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 
0))}}" (one of the errors seen in fixing DRILL-2288).

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.
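[Editorial note] The condition and the remedy the description hints at (null-padding children absent from the later batch) can be modeled with plain Java lists standing in for value vectors; {{MapVector}}'s real API is different, so this is a sketch only.

```java
import java.util.*;

public class MapLoadSketch {
    // child field name -> values; null elements model SQL NULLs
    static Map<String, List<String>> children = new LinkedHashMap<>();

    static void load(Map<String, List<String>> message, int recordCount) {
        // All vectors are reset to zero length at the start of a load...
        for (List<String> child : children.values()) child.clear();
        // ...but only fields appearing in the incoming message are refilled.
        for (Map.Entry<String, List<String>> e : message.entrySet()) {
            children.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                    .addAll(e.getValue());
        }
        // Remedy: pad children the message omitted out to the batch length,
        // instead of leaving them at length zero.
        for (List<String> child : children.values()) {
            while (child.size() < recordCount) child.add(null);
        }
    }

    public static void main(String[] args) {
        load(Map.of("c1", List.of("a"), "c2", List.of("x")), 1); // earlier batch
        load(Map.of("c1", List.of("b")), 1);            // later batch omits c2
        // Without the padding loop, c2 would have length 0 while c1 has
        // length 1, and reading record 0 of c2 would go out of bounds.
        System.out.println(children.get("c2")); // [null]
    }
}
```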




> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  This causes 
> {{MapVector.getObject(int)}} to fail, saying 
> "{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: 
> range(0, 0))}}" (one of the errors seen in fixing DRILL-2288).
> The condition seems to be that a child field (e.g., an HBase column in a 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch-\-other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.





[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Description: 
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This causes 
{{MapVector.getObject(int)}} to fail, saying 
"{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 
0)}}" (one of the errors seen in fixing DRILL-2288).

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



  was:
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This caused 
IndexOutOfBoundsException errors saying (roughly) "

  (This caused some of the {{IndexOutOfBoundException}} errors seen in fixing 
DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.




> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  This causes 
> {{MapVector.getObject(int)}} to fail, saying 
> "{{java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: 
> range(0, 0)}}" (one of the errors seen in fixing DRILL-2288).
> The condition seems to be that a child field (e.g., an HBase column in a 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch-\-other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.





[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-03 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Description: 
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  This caused 
IndexOutOfBoundsException errors saying (roughly) "

  (This caused some of the {{IndexOutOfBoundException}} errors seen in fixing 
DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



  was:
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  (This caused some of the 
{{IndexOutOfBoundException}} errors seen in fixing DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.




> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  This caused 
> IndexOutOfBoundsException errors saying (roughly) "
>   (This caused some of the {{IndexOutOfBoundException}} errors seen in fixing 
> DRILL-2288.)
> The condition seems to be that a child field (e.g., an HBase column in a 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch-\-other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.





[jira] [Created] (DRILL-4010) In HBase reader, create child vectors for referenced HBase columns to avoid spurious schema changes

2015-11-02 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-4010:
-

 Summary: In HBase reader, create child vectors for referenced 
HBase columns to avoid spurious schema changes
 Key: DRILL-4010
 URL: https://issues.apache.org/jira/browse/DRILL-4010
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types, Storage - HBase
Reporter: Daniel Barclay (Drill)


{{HBaseRecordReader}} needs to create child vectors for all 
referenced/requested columns.

Currently, if a fragment reads only HBase rows that don't have a particular 
referenced column (within a given column family), downstream code adds a dummy 
column of type {{NullableIntVector}} (as a child in the {{MapVector}} for the 
containing HBase column family).

If any other fragment reads an HBase row that _does_ contain the referenced 
column, that fragment's reader will create a child {{NullableVarBinaryVector}} 
for the referenced column.

When the data from those two fragments comes together, Drill detects a schema 
change, even though logically there isn't really any schema change.






[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-02 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Description: 
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  (This caused some of the 
{{IndexOutOfBoundException}} errors seen in fixing DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



  was:
In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  (This caused some of the 
{{IndexOutOfBoundException}} errors seen in fixing DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in a HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch-\-other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-) mark and workaround in {{MapVector.getObject(int)}}.




> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  (This caused some of the 
> {{IndexOutOfBoundsException}} errors seen in fixing DRILL-2288.)
> The condition seems to be that a child field (e.g., an HBase column in an 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch-\-other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-4001) mark and workaround in {{MapVector.getObject(int)}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4003) Tests expecting Drill OversizedAllocationException yield OutOfMemoryError

2015-11-01 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-4003:
-

 Summary: Tests expecting Drill OversizedAllocationException yield 
OutOfMemoryError
 Key: DRILL-4003
 URL: https://issues.apache.org/jira/browse/DRILL-4003
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Data Types, Tools, Build & Test
Reporter: Daniel Barclay (Drill)


Tests that expect Drill's {{OversizedAllocationException}} (for example, 
{{TestValueVector.testFixedVectorReallocation()}}) sometimes fail with an 
{{OutOfMemoryError}} instead.

(Do the tests check whether there's enough memory available for the test before 
proceeding?)
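One way to answer that question is a headroom pre-check before the allocation attempt, so a memory shortage surfaces as a skipped test instead of an {{OutOfMemoryError}}. The sketch below is illustrative only (the class and method names are invented, not Drill's actual test code):

```java
// Illustrative pre-check (NOT Drill's actual test code): report whether the
// JVM appears to have enough heap headroom for an allocation-limit test.
// A test could skip itself (e.g., via JUnit's assumeTrue) when this is false.
class AllocationTestGuard {
    /** Returns true if at least requiredBytes of heap headroom appear available. */
    static boolean hasHeadroom(long requiredBytes) {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();   // currently allocated and in use
        long available = rt.maxMemory() - used;           // rough upper bound on headroom
        return available >= requiredBytes;
    }
}
```

Note this is only an estimate (another thread can consume memory after the check), but it turns most spurious {{OutOfMemoryError}} failures into explicit skips.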





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-01 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983410#comment-14983410
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2288 at 11/1/15 10:10 PM:
-

Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get triggered to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome}} values (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
("{{||}}") correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[Note:  DRILL-2288 does not address other problems with {{NullableIntVector}} 
dummy columns from {{JsonRecordReader}}.]

7.  HBase tests used only one table region, ignoring known problems with 
multi-region HBase tables (so latent {{HBaseRecordReader}} problems were left 
undetected and unresolved).  \[Note: DRILL-2288 addresses only one test table 
(increasing the number of regions on the other test tables exposed at least one 
other problem; others remain).]

8.  {{HBaseRecordReader}} didn't create a {{MapVector}} for every column family 
(so {{NullableIntVector}} dummy columns got created, causing spurious schema 
changes and downstream exceptions).

9.  Some {{RecordBatch}} classes didn't reset their record counts to zero 
({{OrderedPartitionRecordBatch.recordCount}}, 
{{ProjectRecordBatch.recordCount}}, and/or {{TopNBatch.recordCount}}) (so 
downstream code tried to access elements of (correctly) empty vectors, yielding 
{{IndexOutOfBoundsException}} (with ~"... {{range (0, 0)}}")).

10.  {{RecordBatchLoader}}'s record count was not reset to zero by 
{{UnorderedReceiverBatch}} (so, again, downstream code tried to access elements 
of (correctly) empty vectors, yielding {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}")).

11.  {{MapVector.load(...)}} left some existing vectors empty, not matching the 
returned length and the length of sibling vectors (so 
{{MapVector.getObject(int)}} got {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}")).  \[Note: DRILL-2288 does not address the root problem.]

12. {{BaseTestQuery.printResult(...)}} skipped deallocation calls in the case 
of a zero-record record batch (so when it read a zero-row record batch, it 
caused a memory leak reported at Drillbit shutdown time).

13. {{TestHBaseProjectPushDown.testRowKeyAndColumnPushDown()}} used delimited 
identifiers of a form (with a period) that Drill can't handle (so the test 
failed when the test ran with multiple fragments).
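The short-circuit-OR pitfall in item 5 can be shown in isolation. This is an illustrative Java sketch, not Drill's actual code; the class and method names here are invented stand-ins for {{ScanBatch.Mutator.isNewSchema()}} and {{SchemaChangeCallBack.getSchemaChange()}}:

```java
// Toy model of item 5's bug: with ||, the right-hand call is skipped once
// the left side is true, so a returns-and-resets call never runs and stale
// state leaks into the next check.
class SchemaState {
    boolean pendingChange = true;

    // Returns the pending flag and resets it, like getSchemaChange().
    boolean getAndResetChange() {
        boolean was = pendingChange;
        pendingChange = false;
        return was;
    }
}

class ShortCircuitDemo {
    // Buggy form: getAndResetChange() is skipped when alreadyNew is true,
    // leaving pendingChange set and causing a spurious report next time.
    static boolean buggyIsNewSchema(boolean alreadyNew, SchemaState s) {
        return alreadyNew || s.getAndResetChange();
    }

    // Fixed form: evaluate the resetting call unconditionally, then combine.
    static boolean fixedIsNewSchema(boolean alreadyNew, SchemaState s) {
        boolean nestedChange = s.getAndResetChange();
        return alreadyNew || nestedChange;
    }
}
```

Both forms return the same boolean; only the side effect on the nested state differs, which is exactly why the bug produced spurious {{OK_NEW_SCHEMA}} notifications later rather than an immediate failure.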





was (Author: dsbos):

Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get trigger to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know correctly what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome values}} (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
{"{{||}}"} correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[

[jira] [Comment Edited] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-11-01 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983410#comment-14983410
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2288 at 11/1/15 9:59 PM:



Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get triggered to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome}} values (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
("{{||}}") correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[Note:  DRILL-2288 does not address other problems with {{NullableIntVector}} 
dummy columns from {{JsonRecordReader}}.]

7.  HBase tests used only one table region, ignoring known problems with 
multi-region HBase tables (so latent {{HBaseRecordReader}} problems were left 
undetected and unresolved).  \[Note: DRILL-2288 addresses only one test table 
(increasing the number of regions on the other test tables exposes at least one 
other problem).]

8.  {{HBaseRecordReader}} didn't create a {{MapVector}} for every column family 
(so {{NullableIntVector}} dummy columns got created, causing spurious schema 
changes and downstream exceptions).

9.  Some {{RecordBatch}} classes didn't reset their record counts to zero 
({{OrderedPartitionRecordBatch.recordCount}}, 
{{ProjectRecordBatch.recordCount}}, and/or {{TopNBatch.recordCount}}) (so 
downstream code tried to access elements of (correctly) empty vectors, yielding 
{{IndexOutOfBoundsException}} (with ~"... {{range (0, 0)}}")).

10.  {{RecordBatchLoader}}'s record count was not reset to zero by 
{{UnorderedReceiverBatch}} (so, again, downstream code tried to access elements 
of (correctly) empty vectors, yielding {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}")).

11.  {{MapVector.load(...)}} left some existing vectors empty, not matching the 
returned length and the length of sibling vectors (so 
{{MapVector.getObject(int)}} got {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}")).  \[Note: DRILL-2288 does not address the root problem.]

12. {{BaseTestQuery.printResult(...)}} skipped deallocation calls in the case 
of a zero-record record batch (so when it read a zero-row record batch, it 
caused a memory leak reported at Drillbit shutdown time).

13. {{TestHBaseProjectPushDown.testRowKeyAndColumnPushDown()}} used delimited 
identifiers of a form (with a period) that Drill can't handle (so the test 
failed when the test ran with multiple fragments).





was (Author: dsbos):


Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get trigger to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know correctly what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome values}} (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
{"{{||}}"} correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[Note:  DRILL-22

[jira] [Updated] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-11-01 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-4001:
--
Component/s: Execution - Data Types

> Empty vectors from previous batch left by 
> MapVector.load(...)/RecordBatchLoader.load(...)
> -
>
> Key: DRILL-4001
> URL: https://issues.apache.org/jira/browse/DRILL-4001
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Reporter: Daniel Barclay (Drill)
>
> In certain cases, {{MapVector.load(...)}} (called by 
> {{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
> length of zero instead of having a length matching the length of sibling 
> vectors and the number of records in the batch.  (This caused some of the 
> {{IndexOutOfBoundsException}} errors seen in fixing DRILL-2288.)
> The condition seems to be that a child field (e.g., an HBase column in an 
> HBase column family) appears in an earlier batch and does not appear in a 
> later batch.  
> (The HBase column's child vector gets created (in the MapVector for the HBase 
> column family) during loading of the earlier batch.  During loading of the 
> later batch, all vectors get reset to zero length, and then only vectors for 
> fields _appearing in the batch message being loaded_ get loaded and set to 
> the length of the batch-\-other vectors created from earlier 
> messages/{{load}} calls are left with a length of zero (instead of, say, 
> being filled with nulls to the length of their siblings and the current 
> record batch).)
> See the TODO(DRILL-) mark and workaround in {{MapVector.getObject(int)}}.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4002) Result check doesn't execute in TestNewMathFunctions.runTest(...)

2015-10-31 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-4002:
-

 Summary: Result check doesn't execute in 
TestNewMathFunctions.runTest(...) 
 Key: DRILL-4002
 URL: https://issues.apache.org/jira/browse/DRILL-4002
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Reporter: Daniel Barclay (Drill)


In {{TestNewMathFunctions}}, method {{runTest}}'s check of the result does not 
execute.

Method {{runTest(...)}} skips the first record batch--which currently contains 
the results to be checked.

The loop that is right after that and that checks any subsequent batches never 
executes.

Additionally, the test has no self-check assertions (e.g., that a second batch 
existed) to detect that its assumptions are no longer valid.
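The missing self-check can be sketched as follows. This is illustrative Java, not the actual {{TestNewMathFunctions.runTest(...)}} code; the names are invented stand-ins for a result-verification loop over record batches:

```java
import java.util.*;

// Sketch of a self-checking verification loop: count how many batches were
// actually verified and fail loudly if the loop never ran, instead of the
// test silently passing with zero checks performed.
class BatchCheckDemo {
    static int verifyBatches(List<List<Object>> batches) {
        int checked = 0;
        for (List<Object> batch : batches) {
            if (batch.isEmpty()) {
                continue;          // nothing to verify in an empty batch
            }
            checked++;             // real result assertions would go here
        }
        if (checked == 0) {
            throw new AssertionError(
                "no non-empty batch was verified; test assumptions no longer hold");
        }
        return checked;
    }
}
```

With a guard like this, skipping the first (and only) results-bearing batch would fail immediately rather than letting the check silently never execute.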



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-4001) Empty vectors from previous batch left by MapVector.load(...)/RecordBatchLoader.load(...)

2015-10-31 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-4001:
-

 Summary: Empty vectors from previous batch left by 
MapVector.load(...)/RecordBatchLoader.load(...)
 Key: DRILL-4001
 URL: https://issues.apache.org/jira/browse/DRILL-4001
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


In certain cases, {{MapVector.load(...)}} (called by 
{{RecordBatchLoader.load(...)}}) returns with some map child vectors having a 
length of zero instead of having a length matching the length of sibling 
vectors and the number of records in the batch.  (This caused some of the 
{{IndexOutOfBoundsException}} errors seen in fixing DRILL-2288.)

The condition seems to be that a child field (e.g., an HBase column in an HBase 
column family) appears in an earlier batch and does not appear in a later 
batch.  

(The HBase column's child vector gets created (in the MapVector for the HBase 
column family) during loading of the earlier batch.  During loading of the 
later batch, all vectors get reset to zero length, and then only vectors for 
fields _appearing in the batch message being loaded_ get loaded and set to the 
length of the batch\--other vectors created from earlier messages/{{load}} 
calls are left with a length of zero (instead of, say, being filled with nulls 
to the length of their siblings and the current record batch).)

See the TODO(DRILL-) mark and workaround in {{MapVector.getObject(int)}}.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3954) HBase tests use only 1 region, don't detect bug(s) in dummy-column NullableIntVector creation/resolution

2015-10-31 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14984247#comment-14984247
 ] 

Daniel Barclay (Drill) commented on DRILL-3954:
---

One problem revealed by increasing the numbers of regions beyond one is that 
one of the HBase tests violates the condition whose check yields the error 
message saying that column names "must be singular names" (which actually 
should say something like "simple names" (a standard term for the intended 
concept)).

(Additionally, the fact that changing the number of regions, and thereby 
(presumably) the number of fragments, changes whether the error appears 
indicates that Drill's check for that column-name requirement is hacked\--
checking at the point of serialization or fragment splitting or whatever code 
imposes that requirement\--rather than checking earlier, when column names 
should be checked for all relevant requirements (so that, for example, the 
number of regions a table has doesn't affect whether the column name is 
checked).)

> HBase tests use only 1 region, don't detect bug(s) in dummy-column 
> NullableIntVector creation/resolution
> 
>
> Key: DRILL-3954
> URL: https://issues.apache.org/jira/browse/DRILL-3954
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Reporter: Daniel Barclay (Drill)
>
> Currently, the HBase tests (e.g., {{TestHBaseFilterPushDown}}) use only one 
> region.
> That causes them to miss detecting a bug in creating and/or resolving dummy 
> fields ({{NullableIntVectors}} for referenced but non-existent fields) 
> somewhere between reading from HBase and wherever dummy fields are supposed 
> to get resolved (or maybe two separate bugs).
> Reproduction:
> In HBaseTestsSuite, change the line:
> {noformat}
> UTIL.startMiniHBaseCluster(1, 1);
> {noformat}
> to:
> {noformat}
> UTIL.startMiniHBaseCluster(1, 3);
> {noformat}
> and change the line:
> {noformat}
> TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 1);
> {noformat}
> to:
> {noformat}
> TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 3);
> {noformat}
> .
> Run unit test class {{TestHBaseFilterPushDown}}.
> Depending on which region gets processed first (it's non-deterministic), test 
> methods {{testFilterPushDownOrRowKeyEqualRangePred}} and 
> {{testFilterPushDownMultiColumns}} get exceptions like this:
> {noformat}
> java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 
> 0))
>   at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189)
>   at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211)
>   at io.netty.buffer.DrillBuf.getByte(DrillBuf.java:746)
>   at 
> org.apache.drill.exec.vector.UInt1Vector$Accessor.get(UInt1Vector.java:364)
>   at 
> org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isSet(NullableVarBinaryVector.java:391)
>   at 
> org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isNull(NullableVarBinaryVector.java:387)
>   at 
> org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:411)
>   at 
> org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:1)
>   at 
> org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:313)
>   at 
> org.apache.drill.exec.util.VectorUtil.showVectorAccessibleContent(VectorUtil.java:166)
>   at org.apache.drill.BaseTestQuery.printResult(BaseTestQuery.java:487)
>   at 
> org.apache.drill.hbase.BaseHBaseTest.printResultAndVerifyRowCount(BaseHBaseTest.java:95)
>   at 
> org.apache.drill.hbase.BaseHBaseTest.runHBaseSQLVerifyCount(BaseHBaseTest.java:91)
>   at 
> org.apache.drill.hbase.TestHBaseFilterPushDown.testFilterPushDownMultiColumns(TestHBaseFilterPushDown.java:592)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at java.lang.reflect.Method.invoke(Method.java:606)
> {noformat}
> See DRILL-3955.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3955) Possible bug in creation of Drill columns for HBase column families

2015-10-31 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-3955:
-

Assignee: Daniel Barclay (Drill)

> Possible bug in creation of Drill columns for HBase column families
> ---
>
> Key: DRILL-3955
> URL: https://issues.apache.org/jira/browse/DRILL-3955
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> If all of the rows read by a given {{HBaseRecordReader}} have no HBase 
> columns in a given HBase column family, {{HBaseRecordReader}} doesn't create 
> a Drill column for that HBase column family.
> Later, in a {{ProjectRecordBatch}}'s {{setupNewSchema}}, because no Drill 
> column exists for that HBase column family, that {{setupNewSchema}} creates a 
> dummy Drill column using the usual {{NullableIntVector}} type.  In 
> particular, it is not a map vector as {{HBaseRecordReader}} creates when it 
> sees an HBase column family.
> Should {{HBaseRecordReader}} and/or something around setting up for reading 
> HBase (including setting up that {{ProjectRecordBatch}}) make sure that all 
> HBase column families are represented with map vectors so that 
> {{setupNewSchema}} doesn't create a dummy field of type {{NullableIntVector}}?
> The problem is that, currently, when an HBase table is read in two separate 
> fragments, one fragment (seeing rows with columns in the column family) can 
> get a map vector for the column family while the other (seeing only rows with 
> no columns in the column family) can get the {{NullableIntVector}}.  
> Downstream code that receives the two batches ends up with an unresolved 
> conflict, yielding IndexOutOfBoundsExceptions as in DRILL-3954.
> It's not clear whether there is only one bug\--that downstream code doesn't 
> resolve {{NullableIntVector}} dummy fields right (DRILL-TBD)\--or two\--that 
> the HBase reading code should set up a Drill column for every HBase column 
> family (regardless of whether it has any columns in the rows that were read) 
> and that downstream code doesn't resolve {{NullableIntVector}} dummy fields 
> (resolution is applicable to sources other than just HBase).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2288) ScanBatch violates IterOutcome protocol for zero-row sources [was: missing JDBC metadata (schema) for 0-row results...]

2015-10-31 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2288:
--
Summary: ScanBatch violates IterOutcome protocol for zero-row sources [was: 
missing JDBC metadata (schema) for 0-row results...]  (was: missing JDBC 
metadata (schema) for 0-row results--ScanBatch violating IterOutcome protocol)

> ScanBatch violates IterOutcome protocol for zero-row sources [was: missing 
> JDBC metadata (schema) for 0-row results...]
> ---
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3998) Check skipping of .clear and .release in BaseTestQuery.printResult(...) (bug?)

2015-10-30 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3998:
-

 Summary: Check skipping of .clear and .release in 
BaseTestQuery.printResult(...) (bug?)
 Key: DRILL-3998
 URL: https://issues.apache.org/jira/browse/DRILL-3998
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Reporter: Daniel Barclay (Drill)


In {{BaseTestQuery.printResult(...)}}, if a loaded record batch has no records, 
the code skips calling not only the printout method but also 
{{RecordBatchLoader.clear()}} and {{QueryDataBatch.release()}} methods.  Is 
that correct?

(At some point in debugging DRILL-2288, that skipping of {{clear}} and 
{{release}} seemed to cause reporting of a memory leak.)
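The cleanup pattern at issue can be sketched with a try/finally. The types below are invented stand-ins for {{RecordBatchLoader}}/{{QueryDataBatch}}, not Drill's real classes; the point is only that releasing in finally covers the zero-record path too:

```java
// Toy model of DRILL-3998: release per-batch resources in finally so that a
// zero-record batch (where printing is skipped) is still cleaned up rather
// than leaking and being reported at shutdown time.
class ToyBatch {
    boolean released = false;
    int recordCount;

    ToyBatch(int recordCount) { this.recordCount = recordCount; }

    void release() { released = true; }   // stand-in for clear()/release()
}

class PrintResultDemo {
    static void printResult(ToyBatch batch) {
        try {
            if (batch.recordCount > 0) {
                // printout of the batch's rows would go here; the buggy
                // version returned early for recordCount == 0, skipping
                // the cleanup below and leaking the batch's buffers
            }
        } finally {
            batch.release();   // runs for empty and non-empty batches alike
        }
    }
}
```

Whether the skipped calls were intentional is exactly the question this report asks; the sketch just shows the shape of the fix if they were not.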





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2288) missing JDBC metadata (schema) for 0-row results--ScanBatch violating IterOutcome protocol

2015-10-30 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983410#comment-14983410
 ] 

Daniel Barclay (Drill) commented on DRILL-2288:
---



Chain of bugs and problems encountered and (partially) addressed:

1.  {{ScanBatch.next()}} returned {{NONE}} without ever returning 
{{OK_NEW_SCHEMA}} for a source having zero rows (so downstream operators didn't 
get its schema, even for static-schema sources, or even get triggered to update 
their own schema).

2.  {{RecordBatch.IterOutcome}}, especially the allowed sequence of values, was 
not documented clearly (so developers didn't know what to expect or 
provide).

3.  {{IteratorValidatorBatchIterator}} didn't validate the sequence of 
{{IterOutcome}} values (so developers weren't notified about incorrect results).

4.  {{UnionAllRecordBatch}} did not interpret {{NONE}} and {{OK_NEW_SCHEMA}} 
correctly (so it reported spurious/incorrect schema-change and/or 
empty-/non-empty input exceptions).

5.  {{ScanBatch.Mutator.isNewSchema()}} didn't handle a short-circuit OR 
("{{||}}") correctly in calling {{SchemaChangeCallBack.getSchemaChange()}} (so 
it didn't reset nested schema-change state, and so caused spurious 
{{OK_NEW_SCHEMA}} notifications and downstream exceptions).

6.  {{JsonRecordReader.ensureAtLeastOneField()}} didn't check whether any field 
already existed in the batch (so in that case it forcibly changed the type to 
{{NullableIntVector}}, causing schema changes and downstream exceptions). 
\[Note:  DRILL-2288 does not address other problems with {{NullableIntVector}} 
dummy columns from {{JsonRecordReader}}.]

7.  HBase tests used only one table region, ignoring known problems with 
multi-region HBase tables (so latent {{HBaseRecordReader}} problems were left 
undetected and unresolved).  \[Note: DRILL-2288 addresses only one test table 
(increasing the number of regions on the other test tables exposes at least one 
other problem).]

8.  {{HBaseRecordReader}} didn't create a {{MapVector}} for every column family 
(so {{NullableIntVector}} dummy columns got created, causing spurious schema 
changes and downstream exceptions).

9.  Some {{RecordBatch}} classes didn't reset their record counts to zero 
({{OrderedPartitionRecordBatch.recordCount}}, 
{{ProjectRecordBatch.recordCount}}, and/or {{TopNBatch.recordCount}}) (so 
downstream code tried to access elements of (correctly) empty vectors, yielding 
{{IndexOutOfBoundsException}} (with ~"... {{range (0, 0)}}")).

10.  {{RecordBatchLoader}}'s record count was not reset to zero by 
{{UnorderedReceiverBatch}} (so, again, downstream code tried to access elements 
of (correctly) empty vectors, yielding {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}")).

11.  {{MapVector.load(...)}} left some existing vectors empty, not matching the 
returned length and the length of sibling vectors (so 
{{MapVector.getObject(int)}} got {{IndexOutOfBoundsException}} (with ~"... 
{{range (0, 0)}}")).  \[Note: DRILL-2288 does not address the root problem.]


> missing JDBC metadata (schema) for 0-row results--ScanBatch violating 
> IterOutcome protocol
> --
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-2288) missing JDBC metadata (schema) for 0-row results--ScanBatch violating IterOutcome protocol

2015-10-27 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2288:
--
Summary: missing JDBC metadata (schema) for 0-row results--ScanBatch 
violating IterOutcome protocol  (was: ResultSetMetaData not set for zero-row 
results (DatabaseMetaData.getColumns(...)) [ScanBatch problem])

> missing JDBC metadata (schema) for 0-row results--ScanBatch violating 
> IterOutcome protocol
> --
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (DRILL-3730) Change JDBC driver's DrillConnectionConfig to interface

2015-10-26 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726115#comment-14726115
 ] 

Daniel Barclay (Drill) edited comment on DRILL-3730 at 10/26/15 5:37 PM:
-

> I would rather we did not change things that are not broken.

[edit:  I probably should have started off by mentioning that, at the time, I 
submitted this JIRA report only because a reviewer pointed out a pre-existing 
(I think) TODO comment that didn't have a DRILL- annotation and wanted me to 
file a report and add that annotation before he approved the commit.]


I'm trying to fix something that arguably _is_ broken:

{{TracingProxyDriver}} can trace calls across the rest of Drill's JDBC 
interface because it can create proxy classes wrapping Drill's implementation 
classes, since they are defined in terms of interfaces (e.g., JDBC's interface 
{{java.sql.Connection}} and Drill's derived interface 
{{org.apache.drill.jdbc.DrillConnection}}).  

However, because {{DrillConnectionConfig}} is a class, the tracing proxy can't 
create a proxy class wrapping it, and so can't trace calls to methods defined 
on {{DrillConnectionConfig}} as it can for methods defined on other Drill JDBC 
driver objects.

There doesn't seem to be any valid reason for that limitation and 
inconsistency.  (At deeper levels, we'd run into classes possibly with greater 
justification for staying/being classes, but this {{DrillConnectionConfig}} is 
in {{org.apache.drill.jdbc}}, not deeper in the system.)

Additionally, there's the simple inconsistency that most of the types (classes 
or interfaces) in the rest of the Drill JDBC interface, other than those that 
have to be classes (e.g., {{SQLException}} subclasses and the driver class), 
are interfaces, but {{DrillConnectionConfig}} is not.
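
To illustrate why the interface matters: {{java.lang.reflect.Proxy}} can generate a dynamic proxy only for interface types, which is what lets a tracing wrapper intercept calls. A minimal, self-contained sketch (the {{Greeter}} interface and {{trace}} helper are hypothetical examples, not Drill code):

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class TracingProxyDemo {
    // Hypothetical stand-in for a Drill JDBC interface such as DrillConnection.
    interface Greeter { String greet(String name); }

    static class GreeterImpl implements Greeter {
        public String greet(String name) { return "hello " + name; }
    }

    // Wraps any interface-typed object in a proxy that logs each call.
    // This works only because 'iface' is an interface; a concrete class
    // (like DrillConnectionConfig) cannot be proxied this way.
    static <T> T trace(Class<T> iface, T target, StringBuilder log) {
        InvocationHandler handler = (proxy, method, args) -> {
            log.append("call: ").append(method.getName()).append('\n');
            return method.invoke(target, args);
        };
        return iface.cast(Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, handler));
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Greeter g = trace(Greeter.class, new GreeterImpl(), log);
        System.out.println(g.greet("drill"));  // call goes through the proxy
        System.out.print(log);
    }
}
```

Converting {{DrillConnectionConfig}} to an interface would let {{TracingProxyDriver}} apply the same wrapping it already uses for {{DrillConnection}} and the other interface-typed driver objects.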

> I really don't see a good reason for changing all the classes to interfaces. 

Please note that I'm proposing changing only one class in this JIRA report.

> One of the effects similar refactoring has had is that history is getting 
> lost. 

Perhaps there's a way to still record whatever part of history might actually 
be important without sacrificing functionality (re proxying in this case), 
usability (re this case's being in a published/external interface), and 
maintainability / easy testability?

What types/forms or cases of history loss are you thinking of?  




was (Author: dsbos):
> I would rather we did not change things that are not broken.

I'm trying to fix something that arguably _is_ broken:

{{TracingProxyDriver}} can trace calls across the rest of Drill's JDBC 
interface because it can create proxy classes wrapping Drill's implementation 
classes, since they are defined in terms of interfaces (e.g., JDBC's interface 
{{java.sql.Connection}} and Drill's derived interface 
{{org.apache.drill.jdbc.DrillConnection}}).  

However, because {{DrillConnectionConfig}} is a class, the tracing proxy can't 
create a proxy class wrapping it, and so can't trace calls to methods defined 
on {{DrillConnectionConfig}} as it can for methods defined on other Drill JDBC 
driver objects.

There doesn't seem to be any valid reason for that limitation and 
inconsistency.  (At deeper levels, we'd run into classes possibly with greater 
justification for staying/being classes, but this {{DrillConnectionConfig}} is 
in {{org.apache.drill.jdbc}}, not deeper in the system.)

Additionally, there's the simple inconsistency that most of the types (classes 
or interfaces) in the rest of the Drill JDBC interface, other than those that 
have to be classes (e.g., {{SQLException}} subclasses and the driver class), 
are interfaces, but {{DrillConnectionConfig}} is not.

> I really don't see a good reason for changing all the classes to interfaces. 

Please note that I'm proposing changing only one class in this JIRA report.

> One of the effects similar refactoring has had is that history is getting 
> lost. 

Perhaps there's a way to still record whatever part of history might actually 
be important without sacrificing functionality (re proxying in this case), 
usability (re this case's being in a published/external interface), and 
maintainability / easy testability?

What types/forms or cases of history loss are you thinking of?  



> Change JDBC driver's DrillConnectionConfig to interface
> ---
>
> Key: DRILL-3730
> URL: https://issues.apache.org/jira/browse/DRILL-3730
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
> Fix For: Future
>
>
> Change {{org.apache.drill.jdbc.DrillConnectionConfig}} (in Drill's published 
> interface for the JDBC driver) from being a class to being an interface.
> Move the implementation (including the inheritance from 
> {{net.hydromat

[jira] [Updated] (DRILL-3955) Possible bug in creation of Drill columns for HBase column families

2015-10-21 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3955:
--
Description: 
If all of the rows read by a given {{HBaseRecordReader}} have no HBase columns 
in a given HBase column family, {{HBaseRecordReader}} doesn't create a Drill 
column for that HBase column family.

Later, in a {{ProjectRecordBatch}}'s {{setupNewSchema}}, because no Drill 
column exists for that HBase column family, that {{setupNewSchema}} creates a 
dummy Drill column using the usual {{NullableIntVector}} type.  In particular, 
it is not a map vector as {{HBaseRecordReader}} creates when it sees an HBase 
column family.

Should {{HBaseRecordReader}} and/or something around setting up for reading 
HBase (including setting up that {{ProjectRecordBatch}}) make sure that all 
HBase column families are represented with map vectors so that 
{{setupNewSchema}} doesn't create a dummy field of type {{NullableIntVector}}?


The problem is that, currently, when an HBase table is read in two separate 
fragments, one fragment (seeing rows with columns in the column family) can get 
a map vector for the column family while the other (seeing only rows with no 
columns in the column family) can get the {{NullableIntVector}}.  Downstream 
code that receives the two batches ends up with an unresolved conflict, 
yielding IndexOutOfBoundsExceptions as in DRILL-3954.

It's not clear whether there is only one bug\--that downstream code doesn't 
resolve {{NullableIntVector}} dummy fields right (DRILL-TBD)\--or two\--that the 
HBase reading code should set up a Drill column for every HBase column family 
(regardless of whether it has any columns in the rows that were read) and that 
downstream code doesn't resolve {{NullableIntVector}} dummy fields (resolution 
is applicable to sources other than just HBase).
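
The schema conflict can be sketched in miniature without any Drill classes. The stand-in types below are illustrative only (they are not Drill's real vector classes); they just show how two fragments reading the same table can disagree on a column family's type:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaConflictDemo {
    // Simplified stand-in for a fragment's output schema:
    // column name -> vector type name ("MapVector" or "NullableIntVector").
    static Map<String, String> readFragment(boolean familyHasColumns) {
        Map<String, String> schema = new LinkedHashMap<>();
        // A fragment that saw columns in family "f" builds a map vector;
        // one that saw none ends up with the dummy NullableIntVector.
        schema.put("f", familyHasColumns ? "MapVector" : "NullableIntVector");
        return schema;
    }

    // Downstream code receiving both batches: compatible only if every
    // column resolved to the same vector type in both fragments.
    static boolean compatible(Map<String, String> a, Map<String, String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Map<String, String> f1 = readFragment(true);   // rows had columns in "f"
        Map<String, String> f2 = readFragment(false);  // rows had none
        System.out.println("compatible: " + compatible(f1, f2));  // conflict
    }
}
```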






  was:
If all of the rows read by a given {{HBaseRecordReader}} have no HBase columns 
in a given HBase column family, {{HBaseRecordReader}} doesn't create a Drill 
column for that HBase column family.

Later, in a {{ProjectRecordBatch}}'s {{setupNewSchema}}, because no Drill 
column exists for that HBase column family, that {{setupNewSchema}} creates a 
dummy Drill column using the usual {{NullableIntVector}} type.  In particular, 
it is not a map vector as {{HBaseRecordReader}} creates when it sees an HBase 
column family.

Should {{HBaseRecordReader}} and/or something around setting up for reading 
HBase (including setting up that {{ProjectRecordBatch}}) make sure that all 
HBase column families are represented with map vectors so that 
{{setupNewSchema}} doesn't create a dummy field of type {{NullableIntVector}}?


The problem is that, currently, when an HBase table is read in two separate 
fragments, one fragment (seeing rows with columns in the column family) can get 
a map vector for the column family while the other (seeing only rows with no 
columns in the column family) can get the {{NullableIntVector}}.  Downstream 
code that receives the two batches ends up with an unresolved conflict, 
yielding IndexOutOfBoundsExceptions as in DRILL-3954.

It's not clear whether there is only one bug--that downstream code doesn't 
resolve {{NullableIntVector}} dummy fields right (DRILL-TBD)--or two--that the 
HBase reading code should set up a Drill column for every HBase column family 
(regardless of whether it has any columns in the rows that were read) and that 
downstream code doesn't resolve {{NullableIntVector}} dummy fields (resolution 
is applicable to sources other than just HBase).







> Possible bug in creation of Drill columns for HBase column families
> ---
>
> Key: DRILL-3955
> URL: https://issues.apache.org/jira/browse/DRILL-3955
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> If all of the rows read by a given {{HBaseRecordReader}} have no HBase 
> columns in a given HBase column family, {{HBaseRecordReader}} doesn't create 
> a Drill column for that HBase column family.
> Later, in a {{ProjectRecordBatch}}'s {{setupNewSchema}}, because no Drill 
> column exists for that HBase column family, that {{setupNewSchema}} creates a 
> dummy Drill column using the usual {{NullableIntVector}} type.  In 
> particular, it is not a map vector as {{HBaseRecordReader}} creates when it 
> sees an HBase column family.
> Should {{HBaseRecordReader}} and/or something around setting up for reading 
> HBase (including setting up that {{ProjectRecordBatch}}) make sure that all 
> HBase column families are represented with map vectors so that 
> {{setupNewSchema}} doesn't create a dummy field of type {{NullableIntVector}}?
> The problem is that, currently, when an HBase table is read in two separate 
> fragments, one fragment (seeing rows with 

[jira] [Commented] (DRILL-3949) new storage plugin config is not saved on OS X

2015-10-21 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14967846#comment-14967846
 ] 

Daniel Barclay (Drill) commented on DRILL-3949:
---

It seems that Drill shouldn't be storing that configuration information in 
{{/tmp}} by default in the first place.  (We're talking about configuration 
information, not temporary working files.)

How about storing that configuration information somewhere under the user's 
home directory, one typical place for such information?

> new storage plugin config is not saved on OS X
> --
>
> Key: DRILL-3949
> URL: https://issues.apache.org/jira/browse/DRILL-3949
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.2.0
> Environment: OS X 10.9.5
>Reporter: Kristine Hahn
>
> Drill 1.2.0 running in embedded mode removes /tmp/drill/sys.storage_plugins 
> on my OSX.  I created a new storage plugin, and it disappeared when I 
> rebooted. 





[jira] [Updated] (DRILL-3954) HBase tests use only 1 region, don't detect bug(s) in dummy-column NullableIntVector creation/resolution

2015-10-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3954:
--
Description: 
Currently, the HBase tests (e.g., {{TestHBaseFilterPushDown}}) use only one 
region.

That causes them to miss detecting a bug in creating and/or resolving dummy 
fields ({{NullableIntVectors}} for referenced but non-existent fields) 
somewhere between reading from HBase and wherever dummy fields are supposed to 
be resolved (or maybe two separate bugs).

Reproduction:

In HBaseTestsSuite, change the line:
{noformat}
UTIL.startMiniHBaseCluster(1, 1);
{noformat}
to:
{noformat}
UTIL.startMiniHBaseCluster(1, 3);
{noformat}
and change the line:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 1);
{noformat}
to:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 3);
{noformat}
.

Run unit test class {{TestHBaseFilterPushDown}}.

Depending on which region gets processed first (it's non-deterministic), test 
methods {{testFilterPushDownOrRowKeyEqualRangePred}} and 
{{testFilterPushDownMultiColumns}} get exceptions like this:

{noformat}
java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189)
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211)
at io.netty.buffer.DrillBuf.getByte(DrillBuf.java:746)
at 
org.apache.drill.exec.vector.UInt1Vector$Accessor.get(UInt1Vector.java:364)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isSet(NullableVarBinaryVector.java:391)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isNull(NullableVarBinaryVector.java:387)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:411)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:1)
at 
org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:313)
at 
org.apache.drill.exec.util.VectorUtil.showVectorAccessibleContent(VectorUtil.java:166)
at org.apache.drill.BaseTestQuery.printResult(BaseTestQuery.java:487)
at 
org.apache.drill.hbase.BaseHBaseTest.printResultAndVerifyRowCount(BaseHBaseTest.java:95)
at 
org.apache.drill.hbase.BaseHBaseTest.runHBaseSQLVerifyCount(BaseHBaseTest.java:91)
at 
org.apache.drill.hbase.TestHBaseFilterPushDown.testFilterPushDownMultiColumns(TestHBaseFilterPushDown.java:592)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.lang.reflect.Method.invoke(Method.java:606)
{noformat}

See DRILL-3955.

  was:
Currently, the HBase tests (e.g., {{TestHBaseFilterPushDown}}) use only one 
region.

That causes them to miss detecting a bug in creating and/or resolving dummy 
fields ({{NullableIntVectors}} for referenced but non-existent fields) 
somewhere between reading from HBase and wherever dummy fields are supposed to 
be resolved (or maybe two separate bugs).

Reproduction:

In HBaseTestsSuite, change the line:
{noformat}
UTIL.startMiniHBaseCluster(1, 1);
{noformat}
to:
{noformat}
UTIL.startMiniHBaseCluster(1, 3);
{noformat}
and change the line:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 1);
{noformat}
to:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 3);
{noformat}
.

Run unit test class {{TestHBaseFilterPushDown}}.

Depending on which region gets processed first (it's non-deterministic), test 
methods {{testFilterPushDownOrRowKeyEqualRangePred}} and 
{{testFilterPushDownMultiColumns}} get exceptions like this:

{noformat}
java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189)
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211)
at io.netty.buffer.DrillBuf.getByte(DrillBuf.java:746)
at 
org.apache.drill.exec.vector.UInt1Vector$Accessor.get(UInt1Vector.java:364)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isSet(NullableVarBinaryVector.java:391)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isNull(NullableVarBinaryVector.java:387)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:411)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:1)
at 
org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:313)
at 
org.apache.drill.exec.util.VectorUtil.showVectorAccessibleContent(VectorUtil.java:166)
at org.apache.drill.BaseTestQuery.printResult(

[jira] [Created] (DRILL-3955) Possible bug in creation of Drill columns for HBase column families

2015-10-19 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3955:
-

 Summary: Possible bug in creation of Drill columns for HBase 
column families
 Key: DRILL-3955
 URL: https://issues.apache.org/jira/browse/DRILL-3955
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


If all of the rows read by a given {{HBaseRecordReader}} have no HBase columns 
in a given HBase column family, {{HBaseRecordReader}} doesn't create a Drill 
column for that HBase column family.

Later, in a {{ProjectRecordBatch}}'s {{setupNewSchema}}, because no Drill 
column exists for that HBase column family, that {{setupNewSchema}} creates a 
dummy Drill column using the usual {{NullableIntVector}} type.  In particular, 
it is not a map vector as {{HBaseRecordReader}} creates when it sees an HBase 
column family.

Should {{HBaseRecordReader}} and/or something around setting up for reading 
HBase (including setting up that {{ProjectRecordBatch}}) make sure that all 
HBase column families are represented with map vectors so that 
{{setupNewSchema}} doesn't create a dummy field of type {{NullableIntVector}}?


The problem is that, currently, when an HBase table is read in two separate 
fragments, one fragment (seeing rows with columns in the column family) can get 
a map vector for the column family while the other (seeing only rows with no 
columns in the column family) can get the {{NullableIntVector}}.  Downstream 
code that receives the two batches ends up with an unresolved conflict, 
yielding IndexOutOfBoundsExceptions as in DRILL-3954.

It's not clear whether there is only one bug--that downstream code doesn't 
resolve {{NullableIntVector}} dummy fields right (DRILL-TBD)--or two--that the 
HBase reading code should set up a Drill column for every HBase column family 
(regardless of whether it has any columns in the rows that were read) and that 
downstream code doesn't resolve {{NullableIntVector}} dummy fields (resolution 
is applicable to sources other than just HBase).










[jira] [Updated] (DRILL-3954) HBase tests use only 1 region, don't detect bug(s) in dummy-column NullableIntVector creation/resolution

2015-10-19 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3954:
--
Description: 
Currently, the HBase tests (e.g., {{TestHBaseFilterPushDown}}) use only one 
region.

That causes them to miss detecting a bug in creating and/or resolving dummy 
fields ({{NullableIntVectors}} for referenced but non-existent fields) 
somewhere between reading from HBase and wherever dummy fields are supposed to 
be resolved (or maybe two separate bugs).

Reproduction:

In HBaseTestsSuite, change the line:
{noformat}
UTIL.startMiniHBaseCluster(1, 1);
{noformat}
to:
{noformat}
UTIL.startMiniHBaseCluster(1, 3);
{noformat}
and change the line:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 1);
{noformat}
to:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 3);
{noformat}
.

Run unit test class {{TestHBaseFilterPushDown}}.

Depending on which region gets processed first (it's non-deterministic), test 
methods {{testFilterPushDownOrRowKeyEqualRangePred}} and 
{{testFilterPushDownMultiColumns}} get exceptions like this:

{noformat}
java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189)
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211)
at io.netty.buffer.DrillBuf.getByte(DrillBuf.java:746)
at 
org.apache.drill.exec.vector.UInt1Vector$Accessor.get(UInt1Vector.java:364)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isSet(NullableVarBinaryVector.java:391)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isNull(NullableVarBinaryVector.java:387)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:411)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:1)
at 
org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:313)
at 
org.apache.drill.exec.util.VectorUtil.showVectorAccessibleContent(VectorUtil.java:166)
at org.apache.drill.BaseTestQuery.printResult(BaseTestQuery.java:487)
at 
org.apache.drill.hbase.BaseHBaseTest.printResultAndVerifyRowCount(BaseHBaseTest.java:95)
at 
org.apache.drill.hbase.BaseHBaseTest.runHBaseSQLVerifyCount(BaseHBaseTest.java:91)
at 
org.apache.drill.hbase.TestHBaseFilterPushDown.testFilterPushDownMultiColumns(TestHBaseFilterPushDown.java:592)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.lang.reflect.Method.invoke(Method.java:606)
{noformat}

See DRILL-TBD.

  was:
Currently, the HBase tests (e.g., {{TestHBaseFilterPushDown}}) use only one 
region.

That causes them to miss detecting a bug in creating and/or resolving dummy 
fields ({{NullableIntVectors}} for referenced but non-existent fields) 
somewhere between reading from HBase and {{ProjectRecordBatch.setupNewSchema}} 
(or maybe two separate bugs).

Reproduction:

In HBaseTestsSuite, change the line:
{noformat}
UTIL.startMiniHBaseCluster(1, 1);
{noformat}
to:
{noformat}
UTIL.startMiniHBaseCluster(1, 3);
{noformat}
and change the line:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 1);
{noformat}
to:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 3);
{noformat}
.

Run unit test class {{TestHBaseFilterPushDown}}.

Depending on which region gets processed first (it's non-deterministic), test 
methods {{testFilterPushDownOrRowKeyEqualRangePred}} and 
{{testFilterPushDownMultiColumns}} get exceptions like this:

{noformat}
java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189)
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211)
at io.netty.buffer.DrillBuf.getByte(DrillBuf.java:746)
at 
org.apache.drill.exec.vector.UInt1Vector$Accessor.get(UInt1Vector.java:364)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isSet(NullableVarBinaryVector.java:391)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isNull(NullableVarBinaryVector.java:387)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:411)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:1)
at 
org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:313)
at 
org.apache.drill.exec.util.VectorUtil.showVectorAccessibleContent(VectorUtil.java:166)
at org.apache.drill.BaseTestQuery.printResult(BaseTestQuery.java

[jira] [Created] (DRILL-3954) HBase tests use only 1 region, don't detect bug(s) in dummy-column NullableIntVector creation/resolution

2015-10-19 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3954:
-

 Summary: HBase tests use only 1 region, don't detect bug(s) in 
dummy-column NullableIntVector creation/resolution
 Key: DRILL-3954
 URL: https://issues.apache.org/jira/browse/DRILL-3954
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - HBase
Reporter: Daniel Barclay (Drill)


Currently, the HBase tests (e.g., {{TestHBaseFilterPushDown}}) use only one 
region.

That causes them to miss detecting a bug in creating and/or resolving dummy 
fields ({{NullableIntVectors}} for referenced but non-existent fields) 
somewhere between reading from HBase and {{ProjectRecordBatch.setupNewSchema}} 
(or maybe two separate bugs).

Reproduction:

In HBaseTestsSuite, change the line:
{noformat}
UTIL.startMiniHBaseCluster(1, 1);
{noformat}
to:
{noformat}
UTIL.startMiniHBaseCluster(1, 3);
{noformat}
and change the line:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 1);
{noformat}
to:
{noformat}
TestTableGenerator.generateHBaseDataset1(admin, TEST_TABLE_1, 3);
{noformat}
.

Run unit test class {{TestHBaseFilterPushDown}}.

Depending on which region gets processed first (it's non-deterministic), test 
methods {{testFilterPushDownOrRowKeyEqualRangePred}} and 
{{testFilterPushDownMultiColumns}} get exceptions like this:

{noformat}
java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:189)
at io.netty.buffer.DrillBuf.chk(DrillBuf.java:211)
at io.netty.buffer.DrillBuf.getByte(DrillBuf.java:746)
at 
org.apache.drill.exec.vector.UInt1Vector$Accessor.get(UInt1Vector.java:364)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isSet(NullableVarBinaryVector.java:391)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.isNull(NullableVarBinaryVector.java:387)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:411)
at 
org.apache.drill.exec.vector.NullableVarBinaryVector$Accessor.getObject(NullableVarBinaryVector.java:1)
at 
org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:313)
at 
org.apache.drill.exec.util.VectorUtil.showVectorAccessibleContent(VectorUtil.java:166)
at org.apache.drill.BaseTestQuery.printResult(BaseTestQuery.java:487)
at 
org.apache.drill.hbase.BaseHBaseTest.printResultAndVerifyRowCount(BaseHBaseTest.java:95)
at 
org.apache.drill.hbase.BaseHBaseTest.runHBaseSQLVerifyCount(BaseHBaseTest.java:91)
at 
org.apache.drill.hbase.TestHBaseFilterPushDown.testFilterPushDownMultiColumns(TestHBaseFilterPushDown.java:592)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.lang.reflect.Method.invoke(Method.java:606)
{noformat}

See DRILL-TBD.





[jira] [Updated] (DRILL-3355) Implement ResultSetMetadata's getPrecision, getScale, getColumnDisplaySize (need RPC-level data)

2015-10-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3355:
--
Summary: Implement ResultSetMetadata's getPrecision, getScale, 
getColumnDisplaySize (need RPC-level data)  (was: Implement 
ResultSetMetadata.get{Precision,Scale,ColumnDisplaySize} (need RPC-level data))

> Implement ResultSetMetadata's getPrecision, getScale, getColumnDisplaySize 
> (need RPC-level data)
> 
>
> Key: DRILL-3355
> URL: https://issues.apache.org/jira/browse/DRILL-3355
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
> Fix For: Future
>
>
> JDBC ResultSetMetadata methods getPrecision(...), getScale(...), and 
> getColumnDisplaySize() are not implemented, currently because required data 
> is not available in the RPC-level data.
> The unavailable data includes:
> - string type lengths (N in VARCHAR(N), BINARY(N))
> - interval qualifier information (which units, leading digit precision, 
> fractional seconds precision)
> - datetime type fractional seconds precision
> (Whether an interval is a YEAR/MONTH interval or is a DAY/HOUR/MINUTE/SECOND 
> interval is available.)





[jira] [Commented] (DRILL-3805) Empty JSON on LHS UNION non empty JSON on RHS must return results

2015-10-08 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949163#comment-14949163
 ] 

Daniel Barclay (Drill) commented on DRILL-3805:
---

Sounds related to DRILL-2288.

> Empty JSON on LHS UNION non empty JSON on RHS must return results
> -
>
> Key: DRILL-3805
> URL: https://issues.apache.org/jira/browse/DRILL-3805
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Minor
> Fix For: 1.3.0
>
>
> When the input on LHS of UNION operator is empty and there is non empty input 
> on RHS of Union, we need to return the data from the RHS. Currently we return 
> SchemaChangeException.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select key1 from `empty.json` UNION select key1 
> from `fewRows.json`;
> Error: SYSTEM ERROR: SchemaChangeException: The left input of Union-All 
> should not come from an empty data source
> Fragment 0:0
> [Error Id: f0fcff87-f470-46a8-9733-316b7da1a87f on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}





[jira] [Commented] (DRILL-3151) ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)

2015-10-08 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949041#comment-14949041
 ] 

Daniel Barclay (Drill) commented on DRILL-3151:
---

Resolved except for remainder now reported in DRILL-3355.


> ResultSetMetaData not as specified by JDBC (null/dummy value, not ""/etc.)
> --
>
> Key: DRILL-3151
> URL: https://issues.apache.org/jira/browse/DRILL-3151
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DRILL-3151.3.patch.txt
>
>
> In Drill's JDBC driver, some ResultSetMetaData methods don't return what JDBC 
> specifies they should return.
> Some cases:
> {{getTableName(int)}}:
> - (JDBC says: {{table name or "" if not applicable}})
> - Drill returns {{null}} (instead of empty string or table name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getSchemaName(int)}}:
> - (JDBC says: {{schema name or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or schema name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{getCatalogName(int)}}:
> - (JDBC says: {{the name of the catalog for the table in which the given 
> column appears or "" if not applicable}})
> - Drill returns "{{\-\-UNKNOWN--}}" (instead of empty string or catalog name)
> - (Drill indicates "not applicable" even when from named table, e.g., for  
> "{{SELECT * FROM INFORMATION_SCHEMA.CATALOGS}}".)
> {{isSearchable(int)}}:
> - (JDBC says:  {{Indicates whether the designated column can be used in a 
> where clause.}})
> - Drill returns {{false}}.
> {{getColumnClassName(int}}:
> - (JDBC says: {{the fully-qualified name of the class in the Java programming 
> language that would be used by the method ResultSet.getObject to retrieve the 
> value in the specified column. This is the class name used for custom 
> mapping.}})
> - Drill returns "{{none}}" (instead of the correct class name).
> More cases:
> {{getColumnDisplaySize}}
> - (JDBC says (quite ambiguously): {{the normal maximum number of characters 
> allowed as the width of the designated column}})
> - Drill always returns {{10}}!





[jira] [Created] (DRILL-3903) Querying empty directory yields internal index-out-of-bounds error

2015-10-06 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3903:
-

 Summary: Querying empty directory yields internal 
index-out-of-bounds error
 Key: DRILL-3903
 URL: https://issues.apache.org/jira/browse/DRILL-3903
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Reporter: Daniel Barclay (Drill)
Assignee: Jacques Nadeau


Trying to use an empty directory as a table results in an internal 
IndexOutOfBounds error:

{noformat}
0: jdbc:drill:zk=localhost:2181> SELECT *   FROM 
`dfs`.`root`.`/tmp/empty_directory`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: 66ff61ed-ea41-4af9-87c5-f91480ef1b21 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=localhost:2181> 
{noformat}








[jira] [Created] (DRILL-3902) Bad error message: core cause not included in text; maybe wrong kind

2015-10-06 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3902:
-

 Summary: Bad error message:  core cause not included in text; 
maybe wrong kind
 Key: DRILL-3902
 URL: https://issues.apache.org/jira/browse/DRILL-3902
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


When an attempt to use an empty directory as a table causes Drill to fail by 
hitting an IndexOutOfBoundsException, the final error message includes the text 
from the IndexOutOfBoundsException's getMessage()--but fails to mention 
IndexOutOfBoundsException itself (or equivalent information):

{noformat}
0: jdbc:drill:zk=localhost:2181> SELECT *   FROM 
`dfs`.`root`.`/tmp/empty_directory`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: 66ff61ed-ea41-4af9-87c5-f91480ef1b21 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=localhost:2181> 
{noformat}

Also, since this isn't a coherent/intentional validation error but an internal 
error, shouldn't this be a SYSTEM ERROR message?

(Does the SYSTEM ERROR case include the exception class name in the message?)

Daniel







--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-2528) Drill-JDBC-All Jar uses outdated classes

2015-10-05 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944102#comment-14944102
 ] 

Daniel Barclay (Drill) commented on DRILL-2528:
---

Also, we've stopped using ProGuard for the JDBC-all Jar file.

> Drill-JDBC-All Jar uses outdated classes
> 
>
> Key: DRILL-2528
> URL: https://issues.apache.org/jira/browse/DRILL-2528
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 0.8.0
> Environment: RHEL 6.4
>Reporter: Kunal Khatua
>Assignee: DrillCommitter
> Fix For: 1.0.0
>
> Attachments: DRILL-2582.1.patch.txt
>
>
> Since the DrillResultSet.getQueryId() method was unavailable when using the 
> Drill-JDBC-All jar, I originally thought there were multiple copies of the 
> DrillResultSet class within the code base, but I see only one in GitHub. 
> However, when decompiling the two JDBC jar files, the drill-jdbc-all...jar 
> shows missing elements within the DrillResultSet class.
> The build creation process is using outdated source code (or dependencies), 
> as the following is missing in the DrillResultSet class:
> {noformat}
> import org.apache.drill.exec.proto.helper.QueryIdHelper;
>
>   public String getQueryId() {
>     if (this.queryId != null) {
>       return QueryIdHelper.getQueryId(this.queryId);
>     }
>     return null;
>   }
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3885) Column alias "`f.c`" rejected if number of regions is > 1 in HBase unit tests

2015-10-01 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3885:
-

 Summary: Column alias "`f.c`" rejected if number of regions is > 1 
in HBase unit tests
 Key: DRILL-3885
 URL: https://issues.apache.org/jira/browse/DRILL-3885
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


Drill rejects the column alias {{`f.c`}}, because of its period character, in 
this query:

{noformat}
SELECT
  row_key, convert_from(tableName.f.c, 'UTF8') `f.c`
FROM
  hbase.`TestTable3` tableName
WHERE
  row_key LIKE '08%0' OR row_key LIKE '%70'
{noformat}

in unit test {{TestHBaseFilterPushDown.testFilterPushDownRowKeyLike}} if the 
number of regions used in {{HBaseTestsSuite}} is set to something greater than 
one.

One problem seems to be that the validation check is inconsistent, happening 
only if the data structure containing that alias gets serialized and 
deserialized.

The rejection of that alias seems like a problem (at least from the SQL level), 
although nearby code suggests it might be intentional: names, expressions, or 
something similar may not be encoded robustly enough to handle name segments 
containing periods. 

The exception stack trace is:
{noformat}
org.apache.drill.exec.rpc.RpcException: 
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
UnsupportedOperationException: Field references must be singular names.

Fragment 1:1

[Error Id: 34475f52-6f22-43be-9011-c31a84469781 on dev-linux2:31010]
at 
org.apache.drill.exec.rpc.RpcException.mapException(RpcException.java:60)
at 
org.apache.drill.exec.client.DrillClient$ListHoldingResultsListener.getResults(DrillClient.java:386)
at 
org.apache.drill.exec.client.DrillClient.runQuery(DrillClient.java:291)
at 
org.apache.drill.BaseTestQuery.testRunAndReturn(BaseTestQuery.java:292)
at 
org.apache.drill.BaseTestQuery.testSqlWithResults(BaseTestQuery.java:279)
at 
org.apache.drill.hbase.BaseHBaseTest.runHBaseSQLlWithResults(BaseHBaseTest.java:86)
at 
org.apache.drill.hbase.BaseHBaseTest.runHBaseSQLVerifyCount(BaseHBaseTest.java:90)
at 
org.apache.drill.hbase.TestHBaseFilterPushDown.testFilterPushDownRowKeyLike(TestHBaseFilterPushDown.java:466)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:606)
at java.lang.reflect.Method.invoke(Method.java:606)
Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
ERROR: UnsupportedOperationException: Field references must be singular names.

Fragment 1:1

[Error Id: 34475f52-6f22-43be-9011-c31a84469781 on dev-linux2:31010]
at 
org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
at 
org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
at 
org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:1)
at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
at 
org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:1)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
at 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
at 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
at 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
at 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at 
io.netty.channel.nio.NioEventLoop.proc

[jira] [Updated] (DRILL-1805) Colon in file simple name in directory causes view not found

2015-09-30 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-1805:
--
Summary: Colon in file simple name in directory causes view not found  
(was: view not found if view file directory contains child with colon in simple 
name)

> Colon in file simple name in directory causes view not found
> 
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Priority: Minor
> Fix For: Future
>
>
> Scanning the file system for view files fails (resulting in "Table 'vv' not 
> found" errors) if the directory being scanned for view files contains a file 
> whose simple name (last pathname segment) contains a colon.
> For example, the unit test method {{testDRILL_811View}} in Drill's 
> {{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
> fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".
> The cause is that Hadoop filesystem glob-pattern-matching code 
> ({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
> {{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
> pathname strings and relative URI-style {{Path}} strings.
> The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} 
> to get the raw final segment of the pathname and then passes that as the 
> second argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} 
> syntax) without encoding the raw segment into a relative URI/{{Path}} string 
> by prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, 
> String, String)}} does internally).  
> It seems that {{glob()}} should first use Path(String, String, String) to 
> handle that encoding and then call {{Path.Path(Path, Path)}}.
> Action items:
> 1) Report Hadoop bug to Hadoop.
> 2) Review Drill's handling and propagation of the error.  
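The colon ambiguity described above can be reproduced with plain 
{{java.net.URI}} resolution (a minimal sketch of the URI-syntax rule involved, 
not Hadoop's actual {{Path}} code): a relative reference whose first segment 
contains a colon parses as an absolute URI with a scheme, unless the segment is 
encoded by prepending "{{./}}".

{noformat}
import java.net.URI;

public class ColonSegmentDemo {
  public static void main(String[] args) {
    URI base = URI.create("file:/tmp/");

    // The raw segment mis-parses: "aptitude-root.1528" satisfies URI scheme
    // syntax, so the reference is taken as an absolute (opaque) URI and
    // resolution just returns it unchanged.
    URI bad = base.resolve("aptitude-root.1528:JIsVaZ");
    System.out.println(bad + " opaque=" + bad.isOpaque());

    // Prepending "./" forces a relative-path interpretation; resolution then
    // removes the dot segment, yielding the intended child under /tmp/.
    URI good = base.resolve("./aptitude-root.1528:JIsVaZ");
    System.out.println(good);  // file:/tmp/aptitude-root.1528:JIsVaZ
  }
}
{noformat}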



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1805) Colon in file simple name in directory causes view not found

2015-09-30 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939099#comment-14939099
 ] 

Daniel Barclay (Drill) commented on DRILL-1805:
---

The problem is not limited to view files. 

> Colon in file simple name in directory causes view not found
> 
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Priority: Minor
> Fix For: Future
>
>
> Scanning the file system for view files fails (resulting in "Table 'vv' not 
> found" errors) if the directory being scanned for view files contains a file 
> whose simple name (last pathname segment) contains a colon.
> For example, the unit test method {{testDRILL_811View}} in Drill's 
> {{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
> fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".
> The cause is that Hadoop filesystem glob-pattern-matching code 
> ({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
> {{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
> pathname strings and relative URI-style {{Path}} strings.
> The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} 
> to get the raw final segment of the pathname and then passes that as the 
> second argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} 
> syntax) without encoding the raw segment into a relative URI/{{Path}} string 
> by prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, 
> String, String)}} does internally).  
> It seems that {{glob()}} should first use Path(String, String, String) to 
> handle that encoding and then call {{Path.Path(Path, Path)}}.
> Action items:
> 1) Report Hadoop bug to Hadoop.
> 2) Review Drill's handling and propagation of the error.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1805) view not found if view file directory contains child with colon in simple name

2015-09-30 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-1805:
--
Description: 
Scanning the file system for view files fails (resulting in "Table 'vv' not 
found" errors) if the directory being scanned for view files contains a file 
whose simple name (last pathname segment) contains a colon.

For example, the unit test method {{testDRILL_811View}} in Drill's 
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".

The cause is that Hadoop filesystem glob-pattern-matching code 
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
pathname strings and relative URI-style {{Path}} strings.

The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} to 
get the raw final segment of the pathname and then passes that as the second 
argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} syntax) 
without encoding the raw segment into a relative URI/{{Path}} string by 
prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, String, 
String)}} does internally).  

It seems that {{glob()}} should first use Path(String, String, String) to 
handle that encoding and then call {{Path.Path(Path, Path)}}.

Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.  


  was:
Scanning the file system for view files fails (resulting in "Table 'vv' not 
found" errors) if the directory being scanned for view files contains a file 
whose simple name (last pathname segment) contains a colon.

For example, the unit test method {{testDRILL_811View}} in Drill's 
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".

The cause is that Hadoop filesystem glob-pattern-matching code 
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
pathname strings and relative URI-style {{Path}} strings.  [Note:  Changes to 
isolate tests from each other might su

The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} to 
get the raw final segment of the pathname and passes that as the second 
argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} syntax) 
without encoding the raw segment into a relative URI/{{Path}} string by 
prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, String, 
String)}} does internally).

Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.  



> view not found if view file directory contains child with colon in simple name
> --
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Priority: Minor
> Fix For: Future
>
>
> Scanning the file system for view files fails (resulting in "Table 'vv' not 
> found" errors) if the directory being scanned for view files contains a file 
> whose simple name (last pathname segment) contains a colon.
> For example, the unit test method {{testDRILL_811View}} in Drill's 
> {{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
> fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".
> The cause is that Hadoop filesystem glob-pattern-matching code 
> ({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
> {{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
> pathname strings and relative URI-style {{Path}} strings.
> The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} 
> to get the raw final segment of the pathname and then passes that as the 
> second argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} 
> syntax) without encoding the raw segment into a relative URI/{{Path}} string 
> by prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, 
> String, String)}} does internally).  
> It seems that {{glob()}} should first use Path(String, String, String) to 
> handle that encoding and then call {{Path.Path(Path, Path)}}.
> Action items:
> 1) Report Hadoop bug to Hadoop.
> 2) Review Drill's handling and propagation of the error.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1805) view not found if view file directory contains child with colon in simple name

2015-09-30 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-1805:
--
Description: 
Scanning the file system for view files fails (resulting in "Table 'vv' not 
found" errors) if the directory being scanned for view files contains a file 
whose simple name (last pathname segment) contains a colon.

For example, the unit test method {{testDRILL_811View}} in Drill's 
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".

The cause is that Hadoop filesystem glob-pattern-matching code 
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
pathname strings and relative URI-style {{Path}} strings.  [Note:  Changes to 
isolate tests from each other might su

The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} to 
get the raw final segment of the pathname and passes that as the second 
argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} syntax) 
without encoding the raw segment into a relative URI/{{Path}} string by 
prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, String, 
String)}} does internally).

Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.  


  was:
Scanning the file system for view files fails (resulting in "Table 'vv' not 
found" errors) if the directory being scanned for view files contains a file 
whose simple name (last pathname segment) contains a colon.

For example, the unit test method {{testDRILL_811View}} in Drill's 
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".

The cause is that Hadoop filesystem glob-pattern-matching code 
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
pathname strings and relative URI-style {{Path}} strings.  [Note:  Changes to 
isolate tests from each other might su

The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} to 
get the raw final segment of the pathname and passes that as the second 
argument to {{Path.Path(Path, String)}} (which (undocumentedly) takes 
URI/{{Path}} syntax) without encoding the raw segment into a relative 
URI/{{Path}} string by prepending "{{./}}" because of the colon (e.g., as 
{{Path.Path(String, String, String)}} does internally).

Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.  



> view not found if view file directory contains child with colon in simple name
> --
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Priority: Minor
> Fix For: Future
>
>
> Scanning the file system for view files fails (resulting in "Table 'vv' not 
> found" errors) if the directory being scanned for view files contains a file 
> whose simple name (last pathname segment) contains a colon.
> For example, the unit test method {{testDRILL_811View}} in Drill's 
> {{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
> fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".
> The cause is that Hadoop filesystem glob-pattern-matching code 
> ({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
> {{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
> pathname strings and relative URI-style {{Path}} strings.  [Note:  Changes to 
> isolate tests from each other might su
> The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} 
> to get the raw final segment of the pathname and passes that as the second 
> argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} syntax) 
> without encoding the raw segment into a relative URI/{{Path}} string by 
> prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, 
> String, String)}} does internally).
> Action items:
> 1) Report Hadoop bug to Hadoop.
> 2) Review Drill's handling and propagation of the error.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1805) view not found if view file directory contains child with colon in simple name

2015-09-30 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-1805:
--
Description: 
Scanning the file system for view files fails (resulting in "Table 'vv' not 
found" errors) if the directory being scanned for view files contains a file 
whose simple name (last pathname segment) contains a colon.

For example, the unit test method {{testDRILL_811View}} in Drill's 
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".

The cause is that Hadoop filesystem glob-pattern-matching code 
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
pathname strings and relative URI-style {{Path}} strings.  [Note:  Changes to 
isolate tests from each other might su

The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} to 
get the raw final segment of the pathname and passes that as the second 
argument to {{Path.Path(Path, String)}} (which (undocumentedly) takes 
URI/{{Path}} syntax) without encoding the raw segment into a relative 
URI/{{Path}} string by prepending "{{./}}" because of the colon (e.g., as 
{{Path.Path(String, String, String)}} does internally).

Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.  


  was:
Scanning the file system for view files fails (resulting in "Table 'vv' not 
found" errors) if the directory being scanned for view files contains a file 
whose simple name (last pathname segment) contains a colon.

For example, the unit test method testDRILL_811View in Drill's 
./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java fails 
if /tmp contains a file named like "aptitude-root.1528:JIsVaZ".

The cause is that Hadoop filesystem glob-pattern-matching code 
(org.apache.hadoop.fs.Globber's glob() and org.apache.hadoop.fs.Path's 
Path(Path,String)) mixes up relative file pathname strings and relative 
URI-style Path strings.

Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.  



> view not found if view file directory contains child with colon in simple name
> --
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Priority: Minor
> Fix For: Future
>
>
> Scanning the file system for view files fails (resulting in "Table 'vv' not 
> found" errors) if the directory being scanned for view files contains a file 
> whose simple name (last pathname segment) contains a colon.
> For example, the unit test method {{testDRILL_811View}} in Drill's 
> {{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}} 
> fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".
> The cause is that Hadoop filesystem glob-pattern-matching code 
> ({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling 
> {{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file 
> pathname strings and relative URI-style {{Path}} strings.  [Note:  Changes to 
> isolate tests from each other might su
> The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} 
> to get the raw final segment of the pathname and passes that as the second 
> argument to {{Path.Path(Path, String)}} (which (undocumentedly) takes 
> URI/{{Path}} syntax) without encoding the raw segment into a relative 
> URI/{{Path}} string by prepending "{{./}}" because of the colon (e.g., as 
> {{Path.Path(String, String, String)}} does internally).
> Action items:
> 1) Report Hadoop bug to Hadoop.
> 2) Review Drill's handling and propagation of the error.  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3861) Apparent uncontrolled format string error in table name error reporting

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936446#comment-14936446
 ] 

Daniel Barclay (Drill) commented on DRILL-3861:
---

A scan of the code didn't reveal the source of the above problem, but did 
reveal several cases of basically uncontrolled format strings and superfluous 
formatting ({{format}}/{{printf}} calls with no substitutions):

In {{ExpressionInterpreterTest.showValueVectorContent}}:
{noformat}
  System.out.printf(row + "th value: " + cellString + "\n");
{noformat}

In {{TestParquetWriter.runTestAndValidate}}:
{noformat}
  String validateQuery = String.format("SELECT %s FROM " + outputFile, 
validationSelection);
{noformat}

In {{TestAggregateFunctionsQuery.testDateAggFunction}}:
{noformat}
String result = String.format("MAX_DATE="+ t + "; " + "MIN_DATE=" + t1 + 
"\n");
{noformat}

In {{TestCTAS.ctasPartitionWithEmptyList}}:
{noformat}
  errorMsgTestHelper(ctasQuery,
  String.format("PARSE ERROR: Encountered \"AS\""));
{noformat}
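The risk in calls like those can be reproduced with a minimal sketch: any 
percent character in the data is interpreted as a conversion specifier, and an 
invalid one (here {{%p}}, as in the table name from the report above) throws 
{{UnknownFormatConversionException}}.

{noformat}
public class FormatStringDemo {
  public static void main(String[] args) {
    String tableName = "test%percent.json";

    // Safe: the data is passed as a format *argument*, not the format string.
    System.out.println(String.format("Table '%s' not found", tableName));

    // Unsafe: concatenating the data into the format string itself makes
    // String.format parse "%pe" -> invalid conversion 'p'.
    try {
      String.format("Table '" + tableName + "' not found");
    } catch (java.util.UnknownFormatConversionException e) {
      System.out.println("caught: " + e.getMessage());  // Conversion = 'p'
    }
  }
}
{noformat}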







> Apparent uncontrolled format string error in table name error reporting
> ---
>
> Key: DRILL-3861
> URL: https://issues.apache.org/jira/browse/DRILL-3861
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Reporter: Daniel Barclay (Drill)
>
> It seems that a data string is being used as a printf format string.
> In the following, note the percent character in name of the table file (which 
> does not exist, apparently trying to cause an expected no-such-table error) 
> and that the actual error mentions format conversion characters:
> {noformat}
> 0: jdbc:drill:zk=local> select * from `test%percent.json`;
> Sep 29, 2015 2:59:37 PM org.apache.calcite.sql.validate.SqlValidatorException 
> 
> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
> 'test%percent.json' not found
> Sep 29, 2015 2:59:37 PM org.apache.calcite.runtime.CalciteException 
> SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, 
> column 15 to line 1, column 33: Table 'test%percent.json' not found
> Error: SYSTEM ERROR: UnknownFormatConversionException: Conversion = 'p'
> [Error Id: 8025e561-6ba1-4045-bbaa-a96cafc7f719 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}
> (Selecting SQL Parser component because I _think_ table/file existing is 
> checked in validation called in or near the parsing step.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3864) TestBuilder "Unexpected column" message doesn't show records

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3864:
-

 Summary: TestBuilder "Unexpected column" message doesn't show 
records
 Key: DRILL-3864
 URL: https://issues.apache.org/jira/browse/DRILL-3864
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Reporter: Daniel Barclay (Drill)
Assignee: Jason Altekruse


When {{TestBuilder}} reports that the actual result set contains an unexpected 
column, it doesn't show any whole expected records (as it does for the "did not 
find expected record in result set" case, where it shows some expected and 
actual records).

Showing a couple of whole expected records, rather than just reporting the 
unexpected column name(s), would speed up diagnosis of test failures.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3863) TestBuilder.baseLineColumns(...) doesn't take net strings; parses somehow--can't test some names

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3863:
-

 Summary: TestBuilder.baseLineColumns(...) doesn't take net 
strings; parses somehow--can't test some names
 Key: DRILL-3863
 URL: https://issues.apache.org/jira/browse/DRILL-3863
 Project: Apache Drill
  Issue Type: Bug
  Components: Tools, Build & Test
Reporter: Daniel Barclay (Drill)
Assignee: Jason Altekruse


{{TestBuilder}}'s {{baseLineColumns(String...)}} method doesn't take the given 
strings as net column names, and instead tries to parse them somehow, but 
doesn't parse them as the SQL parser would (and that method's Javadoc 
documentation doesn't seem to say how the strings are parsed/interpreted or 
indicate any third way of specifying arbitrary net column names).

That means that certain column names _cannot be checked_ for (cannot be used in 
the result set being checked).

For example, in Drill, the SQL delimited identifier  "{{`Column B`}}"  
specifies a net column name of "{{Column B}}".  However, passing that net 
column name (that is, a {{String}} representing that net column name) to 
{{baseLineColumns}} results in a strange parsing error.  (See Test Class 1 and 
the error in Failure Trace 1.)

Checking whether {{baseLineColumns}} takes SQL-level syntax for column names 
rather than net column names (by passing a string including the back-quote 
characters of the delimited identifier) seems to indicate that 
{{baseLineColumns}} doesn't take that syntax either.  (See Test Class 2 
and the three expected/returned records in Failure Trace 2.)

That seems to mean that it's impossible to use {{baseLineColumns}} to validate 
certain column names (including the fairly simple/common case of alias names 
containing spaces for output formatting purposes).


Test Class 1:
{noformat}
import org.junit.Test;

public class TestTEMPFileNameBugs extends BaseTestQuery {

  @Test
  public void test1() throws Exception {
testBuilder()
.sqlQuery( "SELECT * FROM ( VALUES (1, 2) ) AS T(column_a, `Column B`)" )
.unOrdered()
.baselineColumns("column_a", "Column B")
.baselineValues(1, 2)
.go();
  }
}
{noformat}

Failure Trace 1:
{noformat}
org.apache.drill.common.exceptions.ExpressionParsingException: Expression has 
syntax error! line 1:0:no viable alternative at input 'Column'
at 
org.apache.drill.common.expression.parser.ExprParser.displayRecognitionError(ExprParser.java:169)
at org.antlr.runtime.BaseRecognizer.reportError(BaseRecognizer.java:186)
at 
org.apache.drill.common.expression.parser.ExprParser.lookup(ExprParser.java:5163)
at 
org.apache.drill.common.expression.parser.ExprParser.atom(ExprParser.java:4370)
at 
org.apache.drill.common.expression.parser.ExprParser.unaryExpr(ExprParser.java:4252)
at 
org.apache.drill.common.expression.parser.ExprParser.xorExpr(ExprParser.java:3954)
at 
org.apache.drill.common.expression.parser.ExprParser.mulExpr(ExprParser.java:3821)
at 
org.apache.drill.common.expression.parser.ExprParser.addExpr(ExprParser.java:3689)
at 
org.apache.drill.common.expression.parser.ExprParser.relExpr(ExprParser.java:3564)
at 
org.apache.drill.common.expression.parser.ExprParser.equExpr(ExprParser.java:3436)
at 
org.apache.drill.common.expression.parser.ExprParser.andExpr(ExprParser.java:3310)
at 
org.apache.drill.common.expression.parser.ExprParser.orExpr(ExprParser.java:3185)
at 
org.apache.drill.common.expression.parser.ExprParser.condExpr(ExprParser.java:3110)
at 
org.apache.drill.common.expression.parser.ExprParser.expression(ExprParser.java:3041)
at 
org.apache.drill.common.expression.parser.ExprParser.parse(ExprParser.java:206)
at org.apache.drill.TestBuilder.parsePath(TestBuilder.java:202)
at org.apache.drill.TestBuilder.baselineColumns(TestBuilder.java:333)
at 
org.apache.drill.TestTEMPFileNameBugs.test1(TestTEMPFileNameBugs.java:30)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:606)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.lang.reflect.Method.invoke(Method.java:606)
{noformat}

Test Class 2:
{noformat}
package org.apache.drill;  // per the stack trace: org.apache.drill.TestTEMPFileNameBugs

import org.junit.Test;

public class TestTEMPFileNameBugs extends BaseTestQuery {

  @Test
  public void test1() throws Exception {
    testBuilder()
        .sqlQuery("SELECT * FROM ( VALUES (1, 2) ) AS T(column_a, `Column B`)")
        .unOrdered()
        .baselineColumns("column_a", "`Column B`")
        .baselineValues(1, 2)
        .go();
  }
}
{noformat}

Failure Trace 2:
{noformat}

java.lang.Exception: After matching 0 records, did not find expected record in 
result set: `Column B` : 2, `column_a` : 1, 


Some examples of expected records:`Column B` : 2, `column_a` : 1, 


 Some examples of records returned by th

[jira] [Comment Edited] (DRILL-3861) Apparent uncontrolled format string error in table name error reporting

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936057#comment-14936057
 ] 

Daniel Barclay (Drill) edited comment on DRILL-3861 at 9/29/15 10:56 PM:
-

This error feels like the same flavor of format-string error, but it has {#} 
instead of {%}:


{noformat}
0: jdbc:drill:zk=local> select * from `dfs.tmp`.`file:test#numbersign.json`;
Error: SYSTEM ERROR: IllegalFormatWidthException: 23


[Error Id: 056a4667-7d97-420a-9072-17c9d1c660c7 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}



was (Author: dsbos):
This error feels like the same flavor of format-string error, but it has {#} 
instead of {%}:


{noformat}
0: jdbc:drill:zk=local> select * from `dfs.tmp`.`file:test#numbersign.json`;
Error: SYSTEM ERROR: IllegalFormatWidthException: 23


[Error Id: 056a4667-7d97-420a-9072-17c9d1c660c7 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}


> Apparent uncontrolled format string error in table name error reporting
> ---
>
> Key: DRILL-3861
> URL: https://issues.apache.org/jira/browse/DRILL-3861
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Reporter: Daniel Barclay (Drill)
>
> It seems that a data string is being used as a printf format string.
> In the following, note the percent character in the name of the table file 
> (which does not exist, apparently trying to cause an expected no-such-table 
> error) 
> and that the actual error mentions format conversion characters:
> {noformat}
> 0: jdbc:drill:zk=local> select * from `test%percent.json`;
> Sep 29, 2015 2:59:37 PM org.apache.calcite.sql.validate.SqlValidatorException 
> 
> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
> 'test%percent.json' not found
> Sep 29, 2015 2:59:37 PM org.apache.calcite.runtime.CalciteException 
> SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, 
> column 15 to line 1, column 33: Table 'test%percent.json' not found
> Error: SYSTEM ERROR: UnknownFormatConversionException: Conversion = 'p'
> [Error Id: 8025e561-6ba1-4045-bbaa-a96cafc7f719 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}
> (Selecting SQL Parser component because I _think_ table/file existence is 
> checked during validation, in or near the parsing step.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)



[jira] [Commented] (DRILL-3861) Apparent uncontrolled format string error in table name error reporting

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14936057#comment-14936057
 ] 

Daniel Barclay (Drill) commented on DRILL-3861:
---

This error feels like the same flavor of format-string error, but it has {#} 
instead of {%}:


{noformat}
0: jdbc:drill:zk=local> select * from `dfs.tmp`.`file:test#numbersign.json`;
Error: SYSTEM ERROR: IllegalFormatWidthException: 23


[Error Id: 056a4667-7d97-420a-9072-17c9d1c660c7 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}


> Apparent uncontrolled format string error in table name error reporting
> ---
>
> Key: DRILL-3861
> URL: https://issues.apache.org/jira/browse/DRILL-3861
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser
>Reporter: Daniel Barclay (Drill)
>
> It seems that a data string is being used as a printf format string.
> In the following, note the percent character in the name of the table file 
> (which does not exist, apparently trying to cause an expected no-such-table 
> error) 
> and that the actual error mentions format conversion characters:
> {noformat}
> 0: jdbc:drill:zk=local> select * from `test%percent.json`;
> Sep 29, 2015 2:59:37 PM org.apache.calcite.sql.validate.SqlValidatorException 
> 
> SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
> 'test%percent.json' not found
> Sep 29, 2015 2:59:37 PM org.apache.calcite.runtime.CalciteException 
> SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, 
> column 15 to line 1, column 33: Table 'test%percent.json' not found
> Error: SYSTEM ERROR: UnknownFormatConversionException: Conversion = 'p'
> [Error Id: 8025e561-6ba1-4045-bbaa-a96cafc7f719 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}
> (Selecting SQL Parser component because I _think_ table/file existence is 
> checked during validation, in or near the parsing step.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3861) Apparent uncontrolled format string error in table name error reporting

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3861:
-

 Summary: Apparent uncontrolled format string error in table name 
error reporting
 Key: DRILL-3861
 URL: https://issues.apache.org/jira/browse/DRILL-3861
 Project: Apache Drill
  Issue Type: Bug
  Components: SQL Parser
Reporter: Daniel Barclay (Drill)


It seems that a data string is being used as a printf format string.

In the following, note the percent character in the name of the table file 
(which does not exist, apparently trying to cause an expected no-such-table 
error) and that the actual error mentions format conversion characters:

{noformat}
0: jdbc:drill:zk=local> select * from `test%percent.json`;
Sep 29, 2015 2:59:37 PM org.apache.calcite.sql.validate.SqlValidatorException 

SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
'test%percent.json' not found
Sep 29, 2015 2:59:37 PM org.apache.calcite.runtime.CalciteException 
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 
15 to line 1, column 33: Table 'test%percent.json' not found
Error: SYSTEM ERROR: UnknownFormatConversionException: Conversion = 'p'


[Error Id: 8025e561-6ba1-4045-bbaa-a96cafc7f719 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}

(Selecting SQL Parser component because I _think_ table/file existence is 
checked during validation, in or near the parsing step.)
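

The suspected mechanism can be reproduced in plain Java, independent of Drill: 
when a string containing {{%}} is itself handed to {{String.format}} as the 
format string, {{java.util.Formatter}} tries to parse {{%p}} as a conversion 
and fails exactly as in the session above. A minimal sketch (the class and 
method names here are illustrative, not Drill's actual error-reporting code):

```java
import java.util.UnknownFormatConversionException;

public class FormatStringBug {

    // Buggy pattern: the table name (data) is spliced into the string that
    // String.format() then interprets as a format string, so "%pe..." is
    // parsed as a conversion specifier and throws.
    static String buggy(String tableName) {
        return String.format("Table '" + tableName + "' not found");
    }

    // Safe pattern: a fixed format string, with the data passed as an argument.
    static String safe(String tableName) {
        return String.format("Table '%s' not found", tableName);
    }

    public static void main(String[] args) {
        String tableName = "test%percent.json";
        try {
            System.out.println(buggy(tableName));
        } catch (UnknownFormatConversionException e) {
            // Same failure as in the session above: Conversion = 'p'
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println(safe(tableName));
    }
}
```

The usual fix for this class of bug is what {{safe}} shows: a constant format 
string, with all data passed as format arguments.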



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3860) Delimited identifier `*` breaks in select list--acts like plain asterisk token

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3860:
-

 Summary: Delimited identifier `*` breaks in select list--acts like 
plain asterisk token
 Key: DRILL-3860
 URL: https://issues.apache.org/jira/browse/DRILL-3860
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


At least when it appears in a SELECT list, a delimited identifier whose body 
consists of a single asterisk ("{{`*`}}") is not treated consistently with 
other delimited identifiers (that is, specifying a column whose name matches 
the body ("{{*}}").)

For example, in the following, notice how in the first two queries each 
select-list delimited identifier selects the one expected column, but in the 
third query it selects all columns instead of the one expected column (like 
the regular "{{*}}" in the fourth query):

{noformat}
0: jdbc:drill:zk=local> SELECT `a` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++
| a  |
++
| 1  |
++
1 row selected (0.132 seconds)
0: jdbc:drill:zk=local> SELECT `.` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++
| .  |
++
| 2  |
++
1 row selected (0.152 seconds)
0: jdbc:drill:zk=local> SELECT `*` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++++
| a  | .  | *  |
++++
| 1  | 2  | 3  |
++++
1 row selected (0.136 seconds)
0: jdbc:drill:zk=local> SELECT * FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++++
| a  | .  | *  |
++++
| 1  | 2  | 3  |
++++
1 row selected (0.128 seconds)
0: jdbc:drill:zk=local> 
{noformat}

Although this acts the same as if the SQL parser treated the delimited 
identifier {{`*`}} as a plain asterisk token, that does not seem to be the 
actual mechanism for this behavior.  (The problem seems to be further 
downstream.)





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3860) Delimited identifier `*` breaks in select list--acts like plain asterisk token

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3860:
--
Description: 
At least when it appears in a SELECT list, a delimited identifier whose body 
consists of a single asterisk ("{{`\*`}}") is not treated consistently with 
other delimited identifiers (that is, specifying a column whose name matches 
the body ("{{\*}}").)

For example, in the following, notice how in the first two queries each 
select-list delimited identifier selects the one expected column, but in the 
third query it selects all columns instead of the one expected column (like 
the regular "{{*}}" in the fourth query):

{noformat}
0: jdbc:drill:zk=local> SELECT `a` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++
| a  |
++
| 1  |
++
1 row selected (0.132 seconds)
0: jdbc:drill:zk=local> SELECT `.` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++
| .  |
++
| 2  |
++
1 row selected (0.152 seconds)
0: jdbc:drill:zk=local> SELECT `*` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++++
| a  | .  | *  |
++++
| 1  | 2  | 3  |
++++
1 row selected (0.136 seconds)
0: jdbc:drill:zk=local> SELECT * FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++++
| a  | .  | *  |
++++
| 1  | 2  | 3  |
++++
1 row selected (0.128 seconds)
0: jdbc:drill:zk=local> 
{noformat}

Although this acts the same as if the SQL parser treated the delimited 
identifier {{`\*`}} as a plain asterisk token, that does not seem to be the 
actual mechanism for this behavior.  (The problem seems to be further 
downstream.)



  was:
At least when it appears in a SELECT list, a delimited identifier whose body 
consists of a single asterisk ("{{`*`}}") is not treated consistently with 
other delimited identifiers (that is, specifying a column whose name matches 
the body ("{{*}}").)

For example, in the following, notice how in the first two queries each 
select-list delimited identifier selects the one expected column, but in the 
third query it selects all columns instead of the one expected column (like 
the regular "{{*}}" in the fourth query):

{noformat}
0: jdbc:drill:zk=local> SELECT `a` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++
| a  |
++
| 1  |
++
1 row selected (0.132 seconds)
0: jdbc:drill:zk=local> SELECT `.` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++
| .  |
++
| 2  |
++
1 row selected (0.152 seconds)
0: jdbc:drill:zk=local> SELECT `*` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++++
| a  | .  | *  |
++++
| 1  | 2  | 3  |
++++
1 row selected (0.136 seconds)
0: jdbc:drill:zk=local> SELECT * FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
++++
| a  | .  | *  |
++++
| 1  | 2  | 3  |
++++
1 row selected (0.128 seconds)
0: jdbc:drill:zk=local> 
{noformat}

Although this acts the same as if the SQL parser treated the delimited 
identifier {{`*`}} as a plain asterisk token, that does not seem to be the 
actual mechanism for this behavior.  (The problem seems to be further 
downstream.)




> Delimited identifier `*` breaks in select list--acts like plain asterisk token
> --
>
> Key: DRILL-3860
> URL: https://issues.apache.org/jira/browse/DRILL-3860
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>
> At least when it appears in a SELECT list, a delimited identifier whose body 
> consists of a single asterisk ("{{`\*`}}") is not treated consistently with 
> other delimited identifiers (that is, specifying a column whose name matches 
> the body ("{{\*}}").)
> For example, in the following, notice how in the first two queries each 
> select-list delimited identifier selects the one expected column, but in the 
> third query it selects all columns instead of the one expected column (like 
> the regular "{{*}}" in the fourth query):
> {noformat}
> 0: jdbc:drill:zk=local> SELECT `a` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
> ++
> | a  |
> ++
> | 1  |
> ++
> 1 row selected (0.132 seconds)
> 0: jdbc:drill:zk=local> SELECT `.` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
> ++
> | .  |
> ++
> | 2  |
> ++
> 1 row selected (0.152 seconds)
> 0: jdbc:drill:zk=local> SELECT `*` FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
> ++++
> | a  | .  | *  |
> ++++
> | 1  | 2  | 3  |
> ++++
> 1 row selected (0.136 seconds)
> 0: jdbc:drill:zk=local> SELECT * FROM (VALUES (1, 2, 3)) AS T(a, `.`, `*`);
> ++++
> | a  | .  | *  |
> ++++
> | 1  | 2  | 3  |
> ++++
> 1 row selected (0.128 seconds)
> 0: jdbc:drill:zk=local> 
> {noformat}
> Although this acts the same as if the SQL p

[jira] [Created] (DRILL-3859) Delimited identifier `*` breaks in aliases list--causes AssertionError saying "INTEGER"

2015-09-29 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3859:
-

 Summary: Delimited identifier `*` breaks in aliases list--causes 
AssertionError saying "INTEGER"
 Key: DRILL-3859
 URL: https://issues.apache.org/jira/browse/DRILL-3859
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)


When a delimited identifier whose body consists of a single asterisk 
("{{`*`}}") is used in a subquery aliases list and the containing query's 
select list refers to a non-existent column, Drill throws an assertion error 
(and its message says only "INTEGER").

For example, see the third query and its error message in the following:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM (VALUES (0, 0)) AS T(A, `*`);
+++
| A  | *  |
+++
| 0  | 0  |
+++
1 row selected (0.143 seconds)
0: jdbc:drill:zk=local> SELECT a FROM (VALUES (0, 0)) AS T(A, `*`);
++
| a  |
++
| 0  |
++
1 row selected (0.127 seconds)
0: jdbc:drill:zk=local> SELECT b FROM (VALUES (0, 0)) AS T(A, `*`);
Error: SYSTEM ERROR: AssertionError: INTEGER


[Error Id: 859d3ef9-b1e7-497b-b366-b64b2b592b69 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local>
{noformat}

It's not clear that the problem is in the SQL parser area: another bug with 
{{`*`}} that _acts_ like a parser problem seems strongly to be downstream of 
the parser.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3848) Increase timeout time on several tests that time out frequently.

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3848:
--
Fix Version/s: 1.3.0

> Increase timeout time on several tests that time out frequently.
> 
>
> Key: DRILL-3848
> URL: https://issues.apache.org/jira/browse/DRILL-3848
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.2.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
>
> Increase test timeout time a bit on: 
> - TestTpchDistributedConcurrent
> - TestExampleQueries
> - TestFunctionsQuery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3848) Increase timeout time on several tests that time out frequently.

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3848:
--
Component/s: Tools, Build & Test

> Increase timeout time on several tests that time out frequently.
> 
>
> Key: DRILL-3848
> URL: https://issues.apache.org/jira/browse/DRILL-3848
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.2.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
>
> Increase test timeout time a bit on: 
> - TestTpchDistributedConcurrent
> - TestExampleQueries
> - TestFunctionsQuery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3848) Increase timeout time on several tests that time out frequently.

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3848:
--
Affects Version/s: 1.2.0

> Increase timeout time on several tests that time out frequently.
> 
>
> Key: DRILL-3848
> URL: https://issues.apache.org/jira/browse/DRILL-3848
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.2.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.3.0
>
>
> Increase test timeout time a bit on: 
> - TestTpchDistributedConcurrent
> - TestExampleQueries
> - TestFunctionsQuery



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (DRILL-3659) UnionAllRecordBatch infers wrongly from next() IterOutcome values

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-3659:
-

Assignee: Daniel Barclay (Drill)  (was: Sean Hsuan-Yi Chu)

> UnionAllRecordBatch infers wrongly from next() IterOutcome values
> -
>
> Key: DRILL-3659
> URL: https://issues.apache.org/jira/browse/DRILL-3659
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.4.0
>
>
> When UnionAllRecordBatch uses IterOutcome values returned from the next() 
> method of upstream batches, it seems to be using those values wrongly (making 
> incorrect inferences about what they mean).
> In particular, some switch statements seem to check for NONE vs. 
> OK_NEW_SCHEMA in order to determine whether there are any rows (instead of 
> explicitly checking the number of rows).  However, OK_NEW_SCHEMA can be 
> returned even when there are zero rows.
> The apparent latent bug in the union code blocks the fix for DRILL-2288 
> (having ScanBatch return OK_NEW_SCHEMA for a zero-rows case in which it was 
> wrongly (per the IterOutcome protocol) returning NONE without first returning 
> OK_NEW_SCHEMA).
>  
> For details of IterOutcome values, see the Javadoc documentation of 
> RecordBatch.IterOutcome (after DRILL-3641 is merged; until then, see 
> https://github.com/apache/drill/pull/113).
> For an environment/code state that exposes the UnionAllRecordBatch problems, 
> see https://github.com/dsbos/incubator-drill/tree/bugs/WORK_2288_etc, which 
> includes:
> - a test that exposes the DRILL-2288 problem;
> - an enhanced IteratorValidatorBatchIterator, which now detects IterOutcome 
> value sequence violations; and
> - a fixed (though not-yet-cleaned) version of ScanBatch that fixes the 
> DRILL-2288 problem and thereby exposes the UnionAllRecordBatch problem 
> (several test methods in each of TestUnionAll and TestUnionDistinct fail).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3659) UnionAllRecordBatch infers wrongly from next() IterOutcome values

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934361#comment-14934361
 ] 

Daniel Barclay (Drill) commented on DRILL-3659:
---

That fix is now included in the fix for DRILL-2288.

> UnionAllRecordBatch infers wrongly from next() IterOutcome values
> -
>
> Key: DRILL-3659
> URL: https://issues.apache.org/jira/browse/DRILL-3659
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Daniel Barclay (Drill)
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.4.0
>
>
> When UnionAllRecordBatch uses IterOutcome values returned from the next() 
> method of upstream batches, it seems to be using those values wrongly (making 
> incorrect inferences about what they mean).
> In particular, some switch statements seem to check for NONE vs. 
> OK_NEW_SCHEMA in order to determine whether there are any rows (instead of 
> explicitly checking the number of rows).  However, OK_NEW_SCHEMA can be 
> returned even when there are zero rows.
> The apparent latent bug in the union code blocks the fix for DRILL-2288 
> (having ScanBatch return OK_NEW_SCHEMA for a zero-rows case in which it was 
> wrongly (per the IterOutcome protocol) returning NONE without first returning 
> OK_NEW_SCHEMA).
>  
> For details of IterOutcome values, see the Javadoc documentation of 
> RecordBatch.IterOutcome (after DRILL-3641 is merged; until then, see 
> https://github.com/apache/drill/pull/113).
> For an environment/code state that exposes the UnionAllRecordBatch problems, 
> see https://github.com/dsbos/incubator-drill/tree/bugs/WORK_2288_etc, which 
> includes:
> - a test that exposes the DRILL-2288 problem;
> - an enhanced IteratorValidatorBatchIterator, which now detects IterOutcome 
> value sequence violations; and
> - a fixed (though not-yet-cleaned) version of ScanBatch that fixes the 
> DRILL-2288 problem and thereby exposes the UnionAllRecordBatch problem 
> (several test methods in each of TestUnionAll and TestUnionDistinct fail).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1773) Issues when using JAVA code through Drill JDBC driver

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934326#comment-14934326
 ] 

Daniel Barclay (Drill) commented on DRILL-1773:
---

The reported (old) default behavior of emitting Drill log messages (at DEBUG 
level and above) is fixed by DRILL-3589 (for the JDBC-all single-JAR-file 
version of the JDBC driver). The default is now to emit no Drill log messages 
at all, although one SLF4J warning that logging is not configured is still 
emitted.

To configure SLF4J, for now see DRILL-3741, and eventually see the JDBC driver 
documentation.



> Issues when using JAVA code through Drill JDBC driver
> -
>
> Key: DRILL-1773
> URL: https://issues.apache.org/jira/browse/DRILL-1773
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 0.6.0, 0.7.0
> Environment: Tested on 0.6R3
>Reporter: Hao Zhu
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DrillHandler.patch, DrillJdbcExample.java
>
>
> When executing the attached simple Java code through the Drill JDBC driver 
> (0.6 R3), the query got executed and returned the correct result; however, 
> there are 2 issues:
> 1. It keeps printing DEBUG information.
> Is it default behavior or is there any way to disable DEBUG?
> eg:
> {code}
> 13:30:44.702 [Client-1] DEBUG i.n.u.i.JavassistTypeParameterMatcherGenerator 
> - Generated: 
> io.netty.util.internal.__matchers__.io.netty.buffer.ByteBufMatcher
> 13:30:44.706 [Client-1] DEBUG i.n.u.i.JavassistTypeParameterMatcherGenerator 
> - Generated: 
> io.netty.util.internal.__matchers__.org.apache.drill.exec.rpc.OutboundRpcMessageMatcher
> 13:30:44.708 [Client-1] DEBUG i.n.u.i.JavassistTypeParameterMatcherGenerator 
> - Generated: 
> io.netty.util.internal.__matchers__.org.apache.drill.exec.rpc.InboundRpcMessageMatcher
> 13:30:44.717 [Client-1] DEBUG io.netty.util.Recycler - 
> -Dio.netty.recycler.maxCapacity.default: 262144
> {code}
> 2. After the query finished, it seems not close the connection and did not 
> return to shell prompt. 
> I have to manually issue "ctrl-C" to stop it.
> {code}
> 13:31:11.239 [main-SendThread(xx.xx.xx.xx:5181)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x1497d1d0d040839 after 0ms
> 13:31:24.573 [main-SendThread(xx.xx.xx.xx:5181)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x1497d1d0d040839 after 1ms
> 13:31:37.906 [main-SendThread(xx.xx.xx.xx:5181)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x1497d1d0d040839 after 0ms
> ^CAdministrators-MacBook-Pro-40:xxx$ 
> {code}
> 
> The DrillJdbcExample.java is attached.
> Command to run:
> {code}
> javac DrillJdbcExample.java
> java DrillJdbcExample
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-1773) Issues when using JAVA code through Drill JDBC driver

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-1773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14934305#comment-14934305
 ] 

Daniel Barclay (Drill) commented on DRILL-1773:
---

Now see comments below and DRILL-3741.

> Issues when using JAVA code through Drill JDBC driver
> -
>
> Key: DRILL-1773
> URL: https://issues.apache.org/jira/browse/DRILL-1773
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 0.6.0, 0.7.0
> Environment: Tested on 0.6R3
>Reporter: Hao Zhu
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: DrillHandler.patch, DrillJdbcExample.java
>
>
> When executing the attached simple Java code through the Drill JDBC driver 
> (0.6 R3), the query got executed and returned the correct result; however, 
> there are 2 issues:
> 1. It keeps printing DEBUG information.
> Is it default behavior or is there any way to disable DEBUG?
> eg:
> {code}
> 13:30:44.702 [Client-1] DEBUG i.n.u.i.JavassistTypeParameterMatcherGenerator 
> - Generated: 
> io.netty.util.internal.__matchers__.io.netty.buffer.ByteBufMatcher
> 13:30:44.706 [Client-1] DEBUG i.n.u.i.JavassistTypeParameterMatcherGenerator 
> - Generated: 
> io.netty.util.internal.__matchers__.org.apache.drill.exec.rpc.OutboundRpcMessageMatcher
> 13:30:44.708 [Client-1] DEBUG i.n.u.i.JavassistTypeParameterMatcherGenerator 
> - Generated: 
> io.netty.util.internal.__matchers__.org.apache.drill.exec.rpc.InboundRpcMessageMatcher
> 13:30:44.717 [Client-1] DEBUG io.netty.util.Recycler - 
> -Dio.netty.recycler.maxCapacity.default: 262144
> {code}
> 2. After the query finished, it seems not close the connection and did not 
> return to shell prompt. 
> I have to manually issue "ctrl-C" to stop it.
> {code}
> 13:31:11.239 [main-SendThread(xx.xx.xx.xx:5181)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x1497d1d0d040839 after 0ms
> 13:31:24.573 [main-SendThread(xx.xx.xx.xx:5181)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x1497d1d0d040839 after 1ms
> 13:31:37.906 [main-SendThread(xx.xx.xx.xx:5181)] DEBUG 
> org.apache.zookeeper.ClientCnxn - Got ping response for sessionid: 
> 0x1497d1d0d040839 after 0ms
> ^CAdministrators-MacBook-Pro-40:xxx$ 
> {code}
> 
> The DrillJdbcExample.java is attached.
> Command to run:
> {code}
> javac DrillJdbcExample.java
> java DrillJdbcExample
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3848) Increase timeout time on several tests that time out frequently.

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3848:
-

 Summary: Increase timeout time on several tests that time out 
frequently.
 Key: DRILL-3848
 URL: https://issues.apache.org/jira/browse/DRILL-3848
 Project: Apache Drill
  Issue Type: Bug
Reporter: Daniel Barclay (Drill)
Assignee: Daniel Barclay (Drill)


Increase test timeout time a bit on: 
- TestTpchDistributedConcurrent
- TestExampleQueries
- TestFunctionsQuery






[jira] [Comment Edited] (DRILL-2288) ResultSetMetaData not set for zero-row results (DatabaseMetaData.getColumns(...)) [ScanBatch problem]

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14700581#comment-14700581
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2288 at 9/28/15 9:49 PM:


[Deleted]


was (Author: dsbos):
Currently blocked by DRILL-3659.

Mostly-done code is currently in 
https://github.com/dsbos/incubator-drill/tree/bugs/WORK_2288_etc.

> ResultSetMetaData not set for zero-row results 
> (DatabaseMetaData.getColumns(...)) [ScanBatch problem]
> -
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).





[jira] [Updated] (DRILL-2288) ResultSetMetaData not set for zero-row results (DatabaseMetaData.getColumns(...)) [ScanBatch problem]

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2288:
--
Summary: ResultSetMetaData not set for zero-row results 
(DatabaseMetaData.getColumns(...)) [ScanBatch problem]  (was: ResultSetMetaData 
not set for zero-row result  (DatabaseMetaData.getColumns(...)) [ScanBatch 
problem])

> ResultSetMetaData not set for zero-row results 
> (DatabaseMetaData.getColumns(...)) [ScanBatch problem]
> -
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).





[jira] [Commented] (DRILL-2288) ResultSetMetaData not set for zero-row result (DatabaseMetaData.getColumns(...)) [ScanBatch problem]

2015-09-28 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933588#comment-14933588
 ] 

Daniel Barclay (Drill) commented on DRILL-2288:
---

Found the probable cause of ScanBatch.next()'s returning NONE instead of 
OK_NEW_SCHEMA:

For some JSON files (probably those containing maps), the JSON reader reports a 
spurious schema change at the end of each such file.

That in turn happened because the method that checks and resets the 
schema-changed status of nested items was called on the right-hand side of an 
OR operator (||)--so when the left-hand-side condition (the schema-changed 
status at the top level) was already true, the operator short-circuited and the 
nested status wasn't cleared as it should have been.
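The short-circuit pitfall described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the class and method names are invented, not Drill's actual code): a side-effecting "check and reset" call placed on the right of `||` never runs once the left side is true.

```java
// Minimal sketch of the short-circuit pitfall described above; the class
// and method names are hypothetical, not Drill's actual code.
public class ShortCircuitOr {
    private boolean topLevelChanged = true;  // already true at the top level
    private boolean nestedChanged = true;

    // Side-effecting check: reports AND resets the nested status.
    public boolean checkAndResetNested() {
        boolean was = nestedChanged;
        nestedChanged = false;
        return was;
    }

    public boolean nestedStatusPending() {
        return nestedChanged;
    }

    // Buggy: '||' short-circuits, so checkAndResetNested() never runs when
    // topLevelChanged is already true, leaving nestedChanged stale.
    public boolean buggyAnyChanged() {
        return topLevelChanged || checkAndResetNested();
    }

    // Fixed: evaluate the side-effecting check unconditionally, then combine.
    public boolean fixedAnyChanged() {
        boolean nested = checkAndResetNested();
        return topLevelChanged || nested;
    }
}
```

After `buggyAnyChanged()`, the nested flag is still set and is reported again later as a spurious schema change; the fixed form clears it on every call.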




> ResultSetMetaData not set for zero-row result  
> (DatabaseMetaData.getColumns(...)) [ScanBatch problem]
> -
>
> Key: DRILL-2288
> URL: https://issues.apache.org/jira/browse/DRILL-2288
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
> Attachments: Drill2288NoResultSetMetadataWhenZeroRowsTest.java
>
>
> The ResultSetMetaData object from getMetadata() of a ResultSet is not set up 
> (getColumnCount() returns zero, and trying to access any other metadata 
> throws IndexOutOfBoundsException) for a result set with zero rows, at least 
> for one from DatabaseMetaData.getColumns(...).





[jira] [Commented] (DRILL-3843) Any query returns empty results when we have a filter on a non-existent directory

2015-09-25 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908687#comment-14908687
 ] 

Daniel Barclay (Drill) commented on DRILL-3843:
---

Why is a zero expected rather than zero rows?

If directory "blah" really is non-existent (your report doesn't actually say), 
wouldn't the value of the WHERE clause always be false, and therefore wouldn't 
the result have zero rows?



> Any query returns empty results when we have a filter on a non-existent 
> directory
> -
>
> Key: DRILL-3843
> URL: https://issues.apache.org/jira/browse/DRILL-3843
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Rahul Challapalli
>
> git.commit.id.abbrev=3c89b30
> The below query should return 0 instead it returns no results
> {code}
> select l_shipdate from lineitem where dir0='blah';
> +-+
> | l_shipdate  |
> +-+
> +-+
> {code}





[jira] [Issue Comment Deleted] (DRILL-2769) many(?) JDBC methods throw non-SQLException exceptions (e.g., UnsupportedOperationException, RuntimeException)

2015-09-25 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2769:
--
Comment: was deleted

(was: Review is at https://reviews.apache.org/r/37857.)

> many(?) JDBC methods throw non-SQLException exceptions (e.g., 
> UnsupportedOperationException, RuntimeException)
> --
>
> Key: DRILL-2769
> URL: https://issues.apache.org/jira/browse/DRILL-2769
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> It seems that many JDBC methods throw exceptions of type 
> {{UnsupportedOperationException}} or {{RuntimeException}} to indicate that 
> they are not applicable (e.g., Drill's implementation of 
> {{Connection.commit()}}, since Drill isn't transactional) or not implemented 
> yet (and some throw other {{RuntimeException}}s to indicate other problems).
> However, these methods should be throwing exceptions of type {{SQLException}} 
> (or subclasses thereof). 
> The JDBC pattern is to throw {{SQLException}}s, not {{RuntimeException}}s, so 
> JDBC client code is not likely to handle {{RuntimeException}}s well.
> (For example, it is suspected that {{Connection.commit()}}'s throwing of 
> {{UnsupportedOperationException}} is causing a hang in the JDBC client 
> Spotfire.)
> JDBC does provide a {{SQLFeatureNotSupportedException}}.  However, it is 
> specified to be for when "the JDBC driver does not support an optional 
> JDBC feature."  It's not clear how risky it would be to use this exception 
> when Drill does not support a _non_-optional JDBC feature. 
> (Possibly, some methods that can't really do what JDBC specifies might need 
> to just return silently without throwing any exception.)
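As a hedged sketch of the fix the description argues for (this is illustrative, not Drill's actual implementation, and the class name is invented): an unsupported operation such as commit() can signal through the JDBC exception hierarchy, since SQLFeatureNotSupportedException extends SQLException and is therefore caught by ordinary JDBC client error handling.

```java
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

// Hedged sketch (not Drill's actual code) of signalling an unsupported
// operation through the JDBC exception hierarchy instead of throwing an
// UnsupportedOperationException.
public class NonTransactionalConnectionSketch {
    public void commit() throws SQLException {
        // SQLFeatureNotSupportedException extends SQLException, so JDBC
        // client code that catches SQLException handles it normally.
        throw new SQLFeatureNotSupportedException(
            "commit() is not supported (engine is not transactional)");
    }
}
```

A client catching plain SQLException sees this as a normal driver error rather than an unexpected RuntimeException propagating through its call stack.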





[jira] [Commented] (DRILL-2769) many(?) JDBC methods throw non-SQLException exceptions (e.g., UnsupportedOperationException, RuntimeException)

2015-09-25 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908343#comment-14908343
 ] 

Daniel Barclay (Drill) commented on DRILL-2769:
---

Pull request: https://github.com/apache/drill/pull/171  (see DRILL-2489).

> many(?) JDBC methods throw non-SQLException exceptions (e.g., 
> UnsupportedOperationException, RuntimeException)
> --
>
> Key: DRILL-2769
> URL: https://issues.apache.org/jira/browse/DRILL-2769
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> It seems that many JDBC methods throw exceptions of type 
> {{UnsupportedOperationException}} or {{RuntimeException}} to indicate that 
> they are not applicable (e.g., Drill's implementation of 
> {{Connection.commit()}}, since Drill isn't transactional) or not implemented 
> yet (and some throw other {{RuntimeException}}s to indicate other problems).
> However, these methods should be throwing exceptions of type {{SQLException}} 
> (or subclasses thereof). 
> The JDBC pattern is to throw {{SQLException}}s, not {{RuntimeException}}s, so 
> JDBC client code is not likely to handle {{RuntimeException}}s well.
> (For example, it is suspected that {{Connection.commit()}}'s throwing of 
> {{UnsupportedOperationException}} is causing a hang in the JDBC client 
> Spotfire.)
> JDBC does provide a {{SQLFeatureNotSupportedException}}.  However, it is 
> specified to be for when "the JDBC driver does not support an optional 
> JDBC feature."  It's not clear how risky it would be to use this exception 
> when Drill does not support a _non_-optional JDBC feature. 
> (Possibly, some methods that can't really do what JDBC specifies might need 
> to just return silently without throwing any exception.)





[jira] [Comment Edited] (DRILL-2489) Accessing Connection, Statement, PreparedStatement after they are closed should throw a SQLException

2015-09-25 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717555#comment-14717555
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2489 at 9/25/15 5:11 PM:


[Deleted]


was (Author: dsbos):
Patch depends on patches for 3347, 3566, and 3661.

> Accessing Connection, Statement, PreparedStatement after they are closed 
> should throw a SQLException
> 
>
> Key: DRILL-2489
> URL: https://issues.apache.org/jira/browse/DRILL-2489
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=7b4c887
> According to the JDBC spec, we should throw a SQLException when we access 
> methods on a closed Connection, Statement, or PreparedStatement. Drill 
> currently does not do this. 
> I can raise multiple JIRAs if the developer wishes to work on them 
> independently.
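The guard pattern the spec calls for can be sketched briefly. This is an illustrative class (not Drill's real Statement implementation; the names are hypothetical): every public JDBC method first checks a closed flag and throws SQLException if the object has been closed.

```java
import java.sql.SQLException;

// Hedged sketch of the already-closed guard the JDBC spec calls for;
// the class and method names here are illustrative.
public class ClosedGuardStatement {
    private volatile boolean closed;

    public void close() {
        closed = true;
    }

    // Every public JDBC method would call this before doing any work.
    private void throwIfClosed() throws SQLException {
        if (closed) {
            throw new SQLException("Statement is already closed");
        }
    }

    public int getQueryTimeout() throws SQLException {
        throwIfClosed();
        return 0;  // in JDBC, 0 means "no limit"
    }
}
```

Centralizing the check in one private helper keeps the per-method cost to a single call and makes it hard to forget in new methods.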





[jira] [Comment Edited] (DRILL-2489) Accessing Connection, Statement, PreparedStatement after they are closed should throw a SQLException

2015-09-25 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14707244#comment-14707244
 ] 

Daniel Barclay (Drill) edited comment on DRILL-2489 at 9/25/15 5:10 PM:


[Deleted]


was (Author: dsbos):
Review is at https://reviews.apache.org/r/37685.

> Accessing Connection, Statement, PreparedStatement after they are closed 
> should throw a SQLException
> 
>
> Key: DRILL-2489
> URL: https://issues.apache.org/jira/browse/DRILL-2489
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=7b4c887
> According to the JDBC spec, we should throw a SQLException when we access 
> methods on a closed Connection, Statement, or PreparedStatement. Drill 
> currently does not do this. 
> I can raise multiple JIRAs if the developer wishes to work on them 
> independently.





[jira] [Updated] (DRILL-3818) Error when DISTINCT and GROUP BY is used in avro or json

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3818:
--
Assignee: Jacques Nadeau  (was: Daniel Barclay (Drill))

> Error when DISTINCT and GROUP BY is used in avro or json
> 
>
> Key: DRILL-3818
> URL: https://issues.apache.org/jira/browse/DRILL-3818
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC, SQL Parser
>Affects Versions: 1.1.0, 1.2.0
> Environment: Linux Mint 17.1
> java version "1.7.0_80"
> Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
>Reporter: Philip Deegan
>Assignee: Jacques Nadeau
> Fix For: 1.2.0
>
>
> Data
> {noformat}
> { "a": { "b": { "c": "d" }, "e": 2 }}
> {noformat}
> Query
> {noformat}
> select DISTINCT(t.a.b.c), MAX(t.a.e)  FROM dfs.`json.json` t GROUP BY t.a.b.c 
> LIMIT 1;
> {noformat}
> Occurs on 1.1.0 and incubator-drill master
> {noformat}
> +---+
> | commit_id |
> +---+
> | 9f54aac33df3e783c0192ab56c7e1313dbc823fa  |
> +---+
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1359)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:74)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:86)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:737)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: VALIDATION 
> ERROR: java.lang.NullPointerException
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>   at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86)
>  

[jira] [Commented] (DRILL-3818) Error when DISTINCT and GROUP BY is used in avro or json

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905477#comment-14905477
 ] 

Daniel Barclay (Drill) commented on DRILL-3818:
---

Stack trace from server log ({{sqlline.log}} from {{drill-embedded}}):


{noformat}
org.apache.drill.common.exceptions.UserException: VALIDATION ERROR: 
java.lang.NullPointerException


[Error Id: d1ea15ce-e0dc-4ee8-afaf-ff9c970ffb1d ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:524)
 ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:181)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) 
[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_72]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_72]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
Caused by: org.apache.calcite.tools.ValidationException: 
java.lang.NullPointerException
at 
org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:179) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.prepare.PlannerImpl.validateAndGetType(PlannerImpl.java:188) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:447)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:190)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:159)
 ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178)
 [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT]
... 5 common frames omitted
Caused by: java.lang.NullPointerException: null
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.getGroupExprs(AggregatingSelectScope.java:142)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.checkAggregateExpr(AggregatingSelectScope.java:221)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.getOperandScope(AggregatingSelectScope.java:206)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:48)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:32)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:130) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.expand(SqlValidatorImpl.java:4067)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorUtil.analyzeGroupExpr(SqlValidatorUtil.java:455)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorUtil.convertGroupSet(SqlValidatorUtil.java:426)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorUtil.analyzeGroupItem(SqlValidatorUtil.java:402)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.AggregatingSelectScope.(AggregatingSelectScope.java:97)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.registerQuery(SqlValidatorImpl.java:2216)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.registerQuery(SqlValidatorImpl.java:2121)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:835)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:551)
 ~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
at 
org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:177) 
~[calcite-core-1.4.0-drill-r2.jar:1.4.0-drill-r2]
... 10 common frames omitted
{noformat}



> Error when DISTINCT and GROUP BY is used in avro or json
> --

[jira] [Updated] (DRILL-3818) Error when DISTINCT and GROUP BY is used in avro or json

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3818:
--
Component/s: (was: Client - JDBC)
 SQL Parser

> Error when DISTINCT and GROUP BY is used in avro or json
> 
>
> Key: DRILL-3818
> URL: https://issues.apache.org/jira/browse/DRILL-3818
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC, SQL Parser
>Affects Versions: 1.1.0, 1.2.0
> Environment: Linux Mint 17.1
> java version "1.7.0_80"
> Java(TM) SE Runtime Environment (build 1.7.0_80-b15)
> Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode)
>Reporter: Philip Deegan
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> Data
> {noformat}
> { "a": { "b": { "c": "d" }, "e": 2 }}
> {noformat}
> Query
> {noformat}
> select DISTINCT(t.a.b.c), MAX(t.a.e)  FROM dfs.`json.json` t GROUP BY t.a.b.c 
> LIMIT 1;
> {noformat}
> Occurs on 1.1.0 and incubator-drill master
> {noformat}
> +---+
> | commit_id |
> +---+
> | 9f54aac33df3e783c0192ab56c7e1313dbc823fa  |
> +---+
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:290)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1359)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:74)
>   at 
> net.hydromatic.avatica.AvaticaConnection.executeQueryInternal(AvaticaConnection.java:404)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeQueryInternal(AvaticaStatement.java:351)
>   at 
> net.hydromatic.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:338)
>   at 
> net.hydromatic.avatica.AvaticaStatement.execute(AvaticaStatement.java:69)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.execute(DrillStatementImpl.java:86)
>   at sqlline.Commands.execute(Commands.java:841)
>   at sqlline.Commands.sql(Commands.java:751)
>   at sqlline.SqlLine.dispatch(SqlLine.java:737)
>   at sqlline.SqlLine.begin(SqlLine.java:612)
>   at sqlline.SqlLine.start(SqlLine.java:366)
>   at sqlline.SqlLine.main(SqlLine.java:259)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: VALIDATION 
> ERROR: java.lang.NullPointerException
> [Error Id: bb826851-d8cb-46f5-96c0-1ed01d3d8c45 on philix:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:118)
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handleReponse(UserClient.java:110)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:47)
>   at 
> org.apache.drill.exec.rpc.BasicClientWithConnection.handle(BasicClientWithConnection.java:32)
>   at org.apache.drill.exec.rpc.RpcBus.handle(RpcBus.java:61)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:233)
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:205)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:89)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:242)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>   at 
> io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapte

[jira] [Updated] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3778:
--
Assignee: Jason Altekruse  (was: Daniel Barclay (Drill))

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).





[jira] [Assigned] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-3778:
-

Assignee: Daniel Barclay (Drill)  (was: Aditya Kishore)

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).





[jira] [Updated] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3778:
--
Assignee: Jason Altekruse  (was: Aditya Kishore)

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Jason Altekruse
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).





[jira] [Assigned] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) reassigned DRILL-3778:
-

Assignee: Daniel Barclay (Drill)  (was: Jason Altekruse)

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).





[jira] [Commented] (DRILL-3815) unknown suffixes .not_json and .json_not treated differently (multi-file case)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905243#comment-14905243
 ] 

Daniel Barclay (Drill) commented on DRILL-3815:
---

I haven't changed any storage plug-in configuration from the defaults.

> unknown suffixes .not_json and .json_not treated differently (multi-file case)
> --
>
> Key: DRILL-3815
> URL: https://issues.apache.org/jira/browse/DRILL-3815
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Daniel Barclay (Drill)
>
> In scanning a directory subtree used as a table, unknown filename extensions 
> seem to be treated differently depending on whether they're similar to known 
> file extensions.  The behavior suggests that Drill checks whether a file name 
> _contains_ an extension's string rather than _ending_ with it. 
> For example, given these subtrees with almost identical leaf file names:
> {noformat}
> $ find /tmp/testext_xx_json/
> /tmp/testext_xx_json/
> /tmp/testext_xx_json/voter2.not_json
> /tmp/testext_xx_json/voter1.json
> $ find /tmp/testext_json_xx/
> /tmp/testext_json_xx/
> /tmp/testext_json_xx/voter1.json
> /tmp/testext_json_xx/voter2.json_not
> $ 
> {noformat}
> the results of trying to use them as tables differs:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_xx_json`;
> Sep 21, 2015 11:41:50 AM 
> org.apache.calcite.sql.validate.SqlValidatorException 
> ...
> Error: VALIDATION ERROR: From line 1, column 17 to line 1, column 25: Table 
> 'dfs.tmp.testext_xx_json' not found
> [Error Id: 6fe41deb-0e39-43f6-beca-de27b39d276b on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_json_xx`;
> +---+
> | onecf |
> +---+
> | {"name":"someName1"}  |
> | {"name":"someName2"}  |
> +---+
> 2 rows selected (0.149 seconds)
> {noformat}
> (Other probing seems to indicate that there is also some sensitivity to 
> whether the extension contains an underscore character.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3778) Add rest of DRILL-3160 (making JDBC Javadoc available)

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3778:
--
Assignee: Aditya Kishore  (was: Daniel Barclay (Drill))

> Add rest of DRILL-3160 (making JDBC Javadoc available)
> --
>
> Key: DRILL-3778
> URL: https://issues.apache.org/jira/browse/DRILL-3778
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Daniel Barclay (Drill)
>Assignee: Aditya Kishore
> Fix For: 1.2.0
>
>
> Apply changes for DRILL-3160 (making JDBC Javadoc available) that were missed 
> by unsynchronized merge (Javadoc configuration, JDBC Javadoc additions and 
> adjustments).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3741) Document logging configuration for JDBC-all

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14904963#comment-14904963
 ] 

Daniel Barclay (Drill) commented on DRILL-3741:
---

Notes for documentation:

- Internally, Drill uses [SLF4J|http://www.slf4j.org/], which can log through 
different logging back ends.
- Drill's JDBC-all Jar file does not contain any logging back end for SLF4J.  
(That avoid's interfering with the calling application's choice of which back 
end to use.)  (A warning about not finding 
{{org.slf4j.impl.StaticLoggerBinder}} indicates that SLF4J didn't find a back 
end.)
- A logging back end for SLF4J is typically enabled by adding the back end's Jar 
files to the class path and configuring the back end appropriately.
- For example, to add the [Logback|http://logback.qos.ch/] back end, add 
Logback's {{logback-core}} and {{logback-classic}} Jar files to the class path, 
and place a {{logback.xml}} Logback configuration file in some classpath root 
(Jar file or directory).







> Document logging configuration for JDBC-all
> ---
>
> Key: DRILL-3741
> URL: https://issues.apache.org/jira/browse/DRILL-3741
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC, Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Daniel Barclay (Drill)
>
> Add some documentation about how to configure logging when using the JDBC-all 
> Jar file.
> (Presumably, the user needs to select an SLF4J back end, put its Jar file on 
> the class path somewhere, and configure it however that specific back end 
> supports configuration, and we link to SLF4J documentation for details.)
> Probably have something in the Javadoc for class 
> {{org.apache.drill.jdbc.Driver}} or for {{package org.apache.drill.jdbc}} and 
> something in or near the Drill site documentation page 
> [https://drill.apache.org/docs/using-the-jdbc-driver/], and have them refer to 
> each other (so that from whichever starting point, the users easily find the 
> other documentation).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3485) Doc. site JDBC page(s) should at least point to JDBC Javadoc in source

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3485:
--
Summary: Doc. site JDBC page(s) should at least point to JDBC Javadoc in 
source  (was: Doc. site JDBC pages should at least point to JDBC Javadoc in 
source)

> Doc. site JDBC page(s) should at least point to JDBC Javadoc in source
> --
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3486) Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's available

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3486:
--
Summary: Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. 
once it's available  (was: JDBC doc. pages should link to JDBC driver Javadoc 
doc. once available)

> Doc. site JDBC page(s) should link to JDBC driver Javadoc doc. once it's 
> available
> --
>
> Key: DRILL-3486
> URL: https://issues.apache.org/jira/browse/DRILL-3486
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
>
> The Drill documentation site's JDBC pages should have a link to a copy of the 
> driver's generated Javadoc documentation once we start generating it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3485) Doc. site JDBC pages should at least point to JDBC Javadoc in source

2015-09-23 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3485:
--
Summary: Doc. site JDBC pages should at least point to JDBC Javadoc in 
source  (was: Doc. JDBC pages should at least point to JDBC Javadoc in source)

> Doc. site JDBC pages should at least point to JDBC Javadoc in source
> 
>
> Key: DRILL-3485
> URL: https://issues.apache.org/jira/browse/DRILL-3485
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Documentation
>Reporter: Daniel Barclay (Drill)
>Assignee: Bridget Bevens
>
> We don't yet generate and publish Javadoc documentation for Drill's JDBC 
> driver, and therefore the Drill documentation site's JDBC pages can't yet 
> link to generated Javadoc documentation as they eventually should.
> However, we have already written Javadoc source documentation for much of the 
> Drill-specific behavior and extensions in the JDBC interface.
> Since that documentation already exists, we should point users to it somehow 
> (until we provide its information to the users normally, as generated Javadoc 
> documentation).
> Therefore, in the interim, the Drill documentation site's JDBC pages should 
> at least point to the source code at 
> [https://github.com/apache/drill/tree/master/exec/jdbc/src/main/java/org/apache/drill/jdbc].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-22 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903742#comment-14903742
 ] 

Daniel Barclay (Drill) commented on DRILL-3822:
---

> The question is why this fix [1] is causing issues in this situation.

I think it's because we're not constraining PathScanner more narrowly so that
it uses its own class loader (the class loader that loaded it, i.e., the one
created by SQuirreL for driver isolation) rather than the thread's context
class loader (which SQuirreL apparently didn't change from the
default--but should it have?).

> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-22 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14903744#comment-14903744
 ] 

Daniel Barclay (Drill) commented on DRILL-3822:
---

I'm making a patch and build to test with SQuirreL.

> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3822) PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL

2015-09-22 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3822:
--
Description: 
git.commit.id.abbrev=3c89b30

I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
application.  I got the following error when trying to connect to the drill 
data source:

{noformat}
ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
Unexpected Error occurred attempting to open an SQL connection.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
found for key 'drill.exec'
Full error message is in the attached file. 
{noformat}

We turned on logging and found that the jdbc-all Jar file's 
{{drill-module.conf}} file was not being found (explaining why the 
configuration key {{drill.exec}} wasn't found).

After further investigation, it seems that {{PathScanner}} directly uses the 
system class loader, bypassing the context class loader.

(After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
SQuirreL's "additional class paths" (presumably being loaded by a special class 
loader) to being copied into SQuirreL's Jar file directory (and therefore 
loaded by the system class loader), SQuirreL worked. (Apparently, 
{{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
Jar file and load it, so the later reference to {{drill.exec}} no longer 
failed.) 

Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
were some recent changes to {{PathScanner}} related to class loaders.)


  was:
git.commit.id.abbrev=3c89b30

I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against squirrel sql 
application.  I got the following error when trying to connect to the drill 
data source:

ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
Unexpected Error occurred attempting to open an SQL connection.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
found for key 'drill.exec'
Full error message is in the attached file. 

We turned on logging and found that the jdbc-all jar files drill-module.conf 
file was not being read.  After further investigation, it seems that the 
PathScanner only looks for files under the main class path and not the 
secondary class path.  After putting the drill-jdbc-all-1.2.0-SNAPSHOT.jar 
under squirrel's main class path, squirrel was able to load the configuration 
modules. 
This works correctly in drill-1.1.


> PathScanner fails to find jdbc-all's drill-module.conf in SQuirreL
> --
>
> Key: DRILL-3822
> URL: https://issues.apache.org/jira/browse/DRILL-3822
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Daniel Barclay (Drill)
> Attachments: squirrel.log
>
>
> git.commit.id.abbrev=3c89b30
> I used the latest drill-jdbc-all-1.2.0-SNAPSHOT.jar against the SQuirreL SQL 
> application.  I got the following error when trying to connect to the drill 
> data source:
> {noformat}
> ERROR net.sourceforge.squirrel_sql.client.gui.db.ConnectToAliasCallBack  - 
> Unexpected Error occurred attempting to open an SQL connection.
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: 
> oadd.com.typesafe.config.ConfigException$Missing: No configuration setting 
> found for key 'drill.exec'
> Full error message is in the attached file. 
> {noformat}
> We turned on logging and found that the jdbc-all Jar file's 
> {{drill-module.conf}} file was not being found (explaining why the 
> configuration key {{drill.exec}} wasn't found).
> After further investigation, it seems that {{PathScanner}} directly uses the 
> system class loader, bypassing the context class loader.
> (After drill-jdbc-all-1.2.0-SNAPSHOT.jar was changed from being listed in 
> SQuirreL's "additional class paths" (presumably being loaded by a special 
> class loader) to being copied into SQuirreL's Jar file directory (and 
> therefore loaded by the system class loader), SQuirreL worked. (Apparently, 
> {{PathScanner}} was then able to find  {{drill-module.conf}} in the JDBC-all 
> Jar file and load it, so the later reference to {{drill.exec}} no longer 
> failed.) 
> Also, SQuirreL works correctly with drill-1.1's JDBC-all Jar file, and there 
> were some recent changes to {{PathScanner}} related to class loaders.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-1343) Drill should time out after short time if a storage plugin is unresponsive.

2015-09-22 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-1343:
--
Summary: Drill should time out after short time if a storage plugin is 
unresponsive.  (was: Drill should timeout after short time if a storage plugin 
is unresponsive.)

> Drill should time out after short time if a storage plugin is unresponsive.
> ---
>
> Key: DRILL-1343
> URL: https://issues.apache.org/jira/browse/DRILL-1343
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Zhiyong Liu
>Assignee: Steven Phillips
>Priority: Minor
> Fix For: 1.2.0
>
> Attachments: DRILL-1343.1.patch, DRILL-1343.2.patch, 
> DRILL-1343.3.patch
>
>
> git.commit.id.abbrev=654c879
> git.commit.id=654c879f7caa13925edca911de1b59d04d8f1a8b
> Start drillbit and sqlline with a schema specified, e.g.,
> sqlline -n admin -p admin -u 
> "jdbc:drill:schema=dfs.TpcHMulti;zk=10.10.30.104:5181,10.10.30.105:5181,10.10.30.106:5181"
> Execute one of the following:
> show tables;
> select * from INFORMATION_SCHEMA.`TABLES`;
> The commands hang forever.  No exception was thrown in the log file.
> Note that if using zk=local, the second query works with no hanging problems.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3816) weird file-extension recognition behavior in directory subtree scanning

2015-09-21 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3816:
--
Description: 
In scanning of directory subtrees for files, recognition of known vs. unknown 
file extensions seems really screwy (not following any apparent pattern). 

For example:
- a suffix of {{.jsxon_not}}, as expected, is not recognized as a JSON file
- a suffix of {{.jsoxn_not}} unexpectedly _is_ taken as JSON
- a suffix of {{.jsonx_not}}, as expected, is not recognized as a JSON file

(Creating a directory containing only a non-empty JSON file ending with 
{{.json}} and another non-empty JSON file ending with one of the above suffixes 
sometimes reads both JSON files and sometimes reports a (presumably) expected 
error because of the mixed file extensions.)

The result sometimes seems to also depend on the rest of the filename, 
presumably related to the order of listing of files.  (It's not clear if it 
depends only on the order after filename sorting, or also depends on the order 
file names are listed by the OS.)

Here are more data points (using a JSON file named {{voter1.json}}): 

- with {{voter2.xjson_not}} - read, as JSON
- with {{voter2.jxson_not}} - read, as JSON
- with {{voter2.jsxon_not}} - causes expected error
- with {{voter2.jsoxn_not}} - read, as JSON
- with {{voter2.jsonx_not}} - causes expected error
- with {{voter2.json_xnot}} - read, as JSON
- with {{voter2.json_nxot}} - read, as JSON
- with {{voter2.json_noxt}} - read, as JSON
- with {{voter2.json_notx}} - read, as JSON
- with {{voter2.jsonxnot}}  - read, as JSON
- with {{voter2.jsonxot}}   - read, as JSON
- with {{voter2.jsoxot}}- causes expected error
- with {{voter2.jxsxoxn}}   - read, as JSON
- with {{voter2.xjxsxoxn}}  - read, as JSON
- with {{voter2.xjxsxoxnx}} - causes expected error
- with {{voter2.xjxxoxn}}   - read, as JSON
- with {{voter2.xjxxxn}}- read, as JSON
- with {{voter2.n}} - read, as JSON
- with {{voter2.}}  - read, as JSON
- with {{voter2.xxx}}   - read, as JSON
- with {{voter2.xx}}- read, as JSON
- with {{voter2.x}} - read, as JSON
- with {{voter2.}}  - causes expected error
- with {{voter2.x}} - read, as JSON
- with {{voter2.xx}}- read, as JSON




  was:
In scanning of directory subtrees for files, recognition of known vs. unknown 
file extensions seems really screwy (not following any apparent pattern). 

For example:
- a suffix of {{.jsxon_not}}, as expected, is not recognized as a JSON file
- a suffix of {{.jsoxn_not}} unexpectedly _is_ taken as JSON
- a suffix of .{{jsonx_not}}, as expected, is not recognized as a JSON file

(Creating a directory containing only a non-empty JSON file ending with 
{{.json}} and another non-empty JSON file ending with one of the above suffixes 
sometimes reads both JSON files and sometimes reports a (presumably) expected 
error because of the mixed file extensions).)

The result sometimes seems to also depend on the rest of the filename, 
presumably related to the order of listing of files.  (It's not clear if it 
depends only on the order after filename sorting, or also depends on the order 
file names are listed the by OS.)

Here are more data points (using a JSON file named {{voter1.json}}): 

- with {{voter2.xjson_not}} - read, as JSON
- with {{voter2.jxson_not}} - read, as JSON
- with {{voter2.jsxon_not}} - causes expected error
- with {{voter2.jsoxn_not}} - read, as JSON
- with {{voter2.jsonx_not}} - causes expected error
- with {{voter2.json_xnot}} - read, as JSON
- with {{voter2.json_nxot}} - read, as JSON
- with {{voter2.json_noxt}} - read, as JSON
- with {{voter2.json_notx}} - read, as JSON
- with {{voter2.jsonxnot}}  - read, as JSON
- with {{voter2.jsonxot}}   - read, as JSON
- with {{voter2.jsoxot}}- causes expected error
- with {{voter2.jxsxoxn}}   - read, as JSON
- with {{voter2.xjxsxoxn}}  - read, as JSON
- with {{voter2.xjxsxoxnx}} - causes expected error
- with {{voter2.xjxxoxn}}   - read, as JSON
- with {{voter2.xjxxxn}- read, as JSON
- with {{voter2.n} - read, as JSON
- with {{voter2.}  - read, as JSON
- with {{voter2.xxx}}   - read, as JSON
- with {{voter2.xx}}- read, as JSON
- with {{voter2.x}} - read, as JSON
- with {{voter2.}}  - causes expected error
- with {{voter2.x - read, as JSON
- with {{voter2.xx- read, as JSON





> weird file-extension recognition behavior in directory subtree scanning
> ---
>
> Key: DRILL-3816
> URL: https://issues.apache.org/jira/browse/DRILL-3816
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Reporter: Daniel Barclay (Drill)
>Assignee: Jacques Nadeau
>
> In scanning of directory subtrees for 

[jira] [Created] (DRILL-3816) weird file-extension recognition behavior in directory subtree scanning

2015-09-21 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3816:
-

 Summary: weird file-extension recognition behavior in directory 
subtree scanning
 Key: DRILL-3816
 URL: https://issues.apache.org/jira/browse/DRILL-3816
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Reporter: Daniel Barclay (Drill)
Assignee: Jacques Nadeau


In scanning of directory subtrees for files, recognition of known vs. unknown 
file extensions seems really screwy (not following any apparent pattern). 

For example:
- a suffix of {{.jsxon_not}}, as expected, is not recognized as a JSON file
- a suffix of {{.jsoxn_not}} unexpectedly _is_ taken as JSON
- a suffix of {{.jsonx_not}}, as expected, is not recognized as a JSON file

(Creating a directory containing only a non-empty JSON file ending with 
{{.json}} and another non-empty JSON file ending with one of the above suffixes 
sometimes reads both JSON files and sometimes reports a (presumably) expected 
error because of the mixed file extensions.)

The result sometimes seems to also depend on the rest of the filename, 
presumably related to the order of listing of files.  (It's not clear if it 
depends only on the order after filename sorting, or also depends on the order 
file names are listed by the OS.)

Here are more data points (using a JSON file named {{voter1.json}}): 

- with {{voter2.xjson_not}} - read, as JSON
- with {{voter2.jxson_not}} - read, as JSON
- with {{voter2.jsxon_not}} - causes expected error
- with {{voter2.jsoxn_not}} - read, as JSON
- with {{voter2.jsonx_not}} - causes expected error
- with {{voter2.json_xnot}} - read, as JSON
- with {{voter2.json_nxot}} - read, as JSON
- with {{voter2.json_noxt}} - read, as JSON
- with {{voter2.json_notx}} - read, as JSON
- with {{voter2.jsonxnot}}  - read, as JSON
- with {{voter2.jsonxot}}   - read, as JSON
- with {{voter2.jsoxot}}- causes expected error
- with {{voter2.jxsxoxn}}   - read, as JSON
- with {{voter2.xjxsxoxn}}  - read, as JSON
- with {{voter2.xjxsxoxnx}} - causes expected error
- with {{voter2.xjxxoxn}}   - read, as JSON
- with {{voter2.xjxxxn}- read, as JSON
- with {{voter2.n} - read, as JSON
- with {{voter2.}  - read, as JSON
- with {{voter2.xxx}}   - read, as JSON
- with {{voter2.xx}}- read, as JSON
- with {{voter2.x}} - read, as JSON
- with {{voter2.}}  - causes expected error
- with {{voter2.x - read, as JSON
- with {{voter2.xx- read, as JSON






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3815) unknown suffixes .not_json and .json_not treated differently (multi-file case)

2015-09-21 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3815:
-

 Summary: unknown suffixes .not_json and .json_not treated 
differently (multi-file case)
 Key: DRILL-3815
 URL: https://issues.apache.org/jira/browse/DRILL-3815
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Other
Reporter: Daniel Barclay (Drill)
Assignee: Jacques Nadeau


In scanning a directory subtree used as a table, unknown filename extensions 
seem to be treated differently depending on whether they're similar to known 
file extensions.  The behavior suggests that Drill checks whether a file name 
_contains_ an extension's string rather than _ending_ with it. 

For example, given these subtrees with almost identical leaf file names:

{noformat}
$ find /tmp/testext_xx_json/
/tmp/testext_xx_json/
/tmp/testext_xx_json/voter2.not_json
/tmp/testext_xx_json/voter1.json
$ find /tmp/testext_json_xx/
/tmp/testext_json_xx/
/tmp/testext_json_xx/voter1.json
/tmp/testext_json_xx/voter2.json_not
$ 
{noformat}

the results of trying to use them as tables differs:

{noformat}
0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_xx_json`;
Sep 21, 2015 11:41:50 AM org.apache.calcite.sql.validate.SqlValidatorException 

...
Error: VALIDATION ERROR: From line 1, column 17 to line 1, column 25: Table 
'dfs.tmp.testext_xx_json' not found


[Error Id: 6fe41deb-0e39-43f6-beca-de27b39d276b on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_json_xx`;
+---+
| onecf |
+---+
| {"name":"someName1"}  |
| {"name":"someName2"}  |
+---+
2 rows selected (0.149 seconds)
{noformat}

(Other probing seems to indicate that there is also some sensitivity to whether 
the extension contains an underscore character.)
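The behavior above is consistent with a substring test rather than a suffix
test. A minimal sketch (hypothetical -- not Drill's actual matching code) of
the two checks:

```python
def contains_match(name: str, ext: str) -> bool:
    # Buggy style: any occurrence of the extension string counts,
    # so "voter2.json_not" would be misclassified as JSON.
    return ext in name

def suffix_match(name: str, ext: str) -> bool:
    # Correct style: only a trailing extension counts.
    return name.endswith(ext)

print(contains_match("voter2.json_not", ".json"))  # True  (misclassified)
print(contains_match("voter2.not_json", ".json"))  # False (no ".json" substring)
print(suffix_match("voter2.json_not", ".json"))    # False (correct)
```

Under this hypothesis, {{.json_not}} matches the substring check while
{{.not_json}} does not, which lines up with the two results shown above.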



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (DRILL-3814) Directory containing only unrecognized files reported as not found vs. taken as empty table

2015-09-20 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3814:
-

 Summary: Directory containing only unrecognized files reported as 
not found vs. taken as empty table
 Key: DRILL-3814
 URL: https://issues.apache.org/jira/browse/DRILL-3814
 Project: Apache Drill
  Issue Type: Bug
  Components: SQL Parser, Storage - Other
Reporter: Daniel Barclay (Drill)
Assignee: Aman Sinha


A directory subtree all of whose descendent files have unrecognized extensions 
is reported as non-existent rather than treated as a table with zero rows.

Is this intended? 

(The error message is exactly the same one that results when the user gets a 
directory name wrong and refers to a non-existent directory, making the 
message really confusing and misleading.)

For example, for directory {{/tmp/unrecognized_files_directory}} containing 
only file {{/tmp/unrecognized_files_directory/junk.junk}}:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM 
`dfs`.`tmp`.`unrecognized_files_directory`;
Sep 20, 2015 11:16:34 PM org.apache.calcite.sql.validate.SqlValidatorException 

SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
'dfs.tmp.unrecognized_files_directory' not found
Sep 20, 2015 11:16:34 PM org.apache.calcite.runtime.CalciteException 
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 
15 to line 1, column 19: Table 'dfs.tmp.unrecognized_files_directory' not found
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 
'dfs.tmp.unrecognized_files_directory' not found


[Error Id: 0ce9ba05-7f62-4063-a2c0-7d2b4f1f7967 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}

Notice how that is the same message as for a non-existent directory:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`no_such_directory`;
Sep 20, 2015 11:17:12 PM org.apache.calcite.sql.validate.SqlValidatorException 

SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
'dfs.tmp.no_such_directory' not found
Sep 20, 2015 11:17:12 PM org.apache.calcite.runtime.CalciteException 
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 
15 to line 1, column 19: Table 'dfs.tmp.no_such_directory' not found
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 19: Table 
'dfs.tmp.no_such_directory' not found


[Error Id: 49f423f1-5dfe-4435-8b72-78e0b80e on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}
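One way to separate the two cases -- a hypothetical sketch, not Drill's actual
code, with an assumed set of known extensions -- is to check directory
existence before checking for recognized files:

```python
import os

KNOWN_EXTS = (".json", ".parquet", ".csv")  # assumed set, for illustration only

def classify_table_dir(path: str) -> str:
    # Distinguish "no such directory" from "directory exists but holds
    # no files with recognized extensions".
    if not os.path.isdir(path):
        return "not found"
    for _root, _dirs, files in os.walk(path):
        if any(f.endswith(KNOWN_EXTS) for f in files):
            return "table"
    return "empty table (no recognized files)"
```

With a distinction like this, the second case could produce a zero-row table
(or at least a different error message) instead of "Table ... not found".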






--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (DRILL-3813) table from directory subtree having no descendent files fails with index error

2015-09-20 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-3813:
--
Description: 
Trying to use as a table a directory subtree that has no descendent files (but 
zero or more descendent directories) yields what seems to be a partially 
handled index out-of-bounds condition.  

For example, with {{/tmp/empty_directory}} being an empty directory:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`empty_directory`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: 747425c9-5350-4813-9f0d-ecf580e15101 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}


Also, with {{/tmp/no_child_files_subtree}} having two child directories and a 
grandchild directory, but no descendent files:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`no_child_files_subtree`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: abc90424-8434-4403-b44b-0ba69ef43151 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}


A directory subtree having no files was expected to be taken as a table with no 
rows (and a null schema).
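The expected behavior (an empty subtree reads as a zero-row table) hinges on detecting that a directory tree contains directories but no regular files. A minimal sketch of that check, using only `java.nio.file` (the class and method names here are illustrative, not Drill's actual code):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class EmptySubtreeCheck {
    // Returns true when the subtree rooted at dir contains no regular files
    // (directories alone do not count), i.e. it should behave as a table
    // with zero rows rather than raise an index error.
    static boolean hasNoDataFiles(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.noneMatch(Files::isRegularFile);
        }
    }

    public static void main(String[] args) throws IOException {
        // Recreate the shape described above: two child directories
        // plus a grandchild directory, and no files anywhere.
        Path root = Files.createTempDirectory("no_child_files_subtree");
        Files.createDirectories(root.resolve("child1/grandchild"));
        Files.createDirectories(root.resolve("child2"));
        System.out.println(hasNoDataFiles(root)); // true: only directories
    }
}
```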


  was:
Trying to use as a table a directory subtree that has no descendent files (but 
zero or more descendent directories) yields what seems to be a partially 
handled index out-of-bounds condition.  

For example, with {{/tmp/empty_directory}} being an empty directory:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`empty_directory`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: 747425c9-5350-4813-9f0d-ecf580e15101 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}


Also, with {{/tmp/no_child_files_subtree}} having two child directories and a 
grandchild directory, but not descendent files:

0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`no_child_files_subtree`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: abc90424-8434-4403-b44b-0ba69ef43151 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 

{noformat}



> table from directory subtree having no descendent files fails with index error
> --
>
> Key: DRILL-3813
> URL: https://issues.apache.org/jira/browse/DRILL-3813
> Project: Apache Drill
>  Issue Type: Bug
>  Components: SQL Parser, Storage - Other
>Reporter: Daniel Barclay (Drill)
>Assignee: Aman Sinha
>
> Trying to use as a table a directory subtree that has no descendent files 
> (but zero or more descendent directories) yields what seems to be a partially 
> handled index out-of-bounds condition.  
> For example, with {{/tmp/empty_directory}} being an empty directory:
> {noformat}
> 0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`empty_directory`;
> Error: VALIDATION ERROR: Index: 0, Size: 0
> [Error Id: 747425c9-5350-4813-9f0d-ecf580e15101 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}
> Also, with {{/tmp/no_child_files_subtree}} having two child directories and a 
> grandchild directory, but not descendent files:
> 0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`no_child_files_subtree`;
> Error: VALIDATION ERROR: Index: 0, Size: 0
> [Error Id: abc90424-8434-4403-b44b-0ba69ef43151 on dev-linux2:31010] 
> (state=,code=0)
> 0: jdbc:drill:zk=local> 
> {noformat}
> A directory subtree having no files was expected to be taken as a table with 
> no rows (and a null schema).





[jira] [Created] (DRILL-3813) table from directory subtree having no descendent files fails with index error

2015-09-20 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3813:
-

 Summary: table from directory subtree having no descendent files 
fails with index error
 Key: DRILL-3813
 URL: https://issues.apache.org/jira/browse/DRILL-3813
 Project: Apache Drill
  Issue Type: Bug
  Components: SQL Parser, Storage - Other
Reporter: Daniel Barclay (Drill)
Assignee: Aman Sinha


Trying to use as a table a directory subtree that has no descendent files (but 
zero or more descendent directories) yields what seems to be a partially 
handled index out-of-bounds condition.  

For example, with {{/tmp/empty_directory}} being an empty directory:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`empty_directory`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: 747425c9-5350-4813-9f0d-ecf580e15101 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 
{noformat}


Also, with {{/tmp/no_child_files_subtree}} having two child directories and a 
grandchild directory, but no descendent files:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs`.`tmp`.`no_child_files_subtree`;
Error: VALIDATION ERROR: Index: 0, Size: 0


[Error Id: abc90424-8434-4403-b44b-0ba69ef43151 on dev-linux2:31010] 
(state=,code=0)
0: jdbc:drill:zk=local> 

{noformat}






[jira] [Created] (DRILL-3812) message for invalid compound name doesn't identify part that's bad

2015-09-20 Thread Daniel Barclay (Drill) (JIRA)
Daniel Barclay (Drill) created DRILL-3812:
-

 Summary: message for invalid compound name doesn't identify part 
that's bad
 Key: DRILL-3812
 URL: https://issues.apache.org/jira/browse/DRILL-3812
 Project: Apache Drill
  Issue Type: Bug
  Components: SQL Parser
Reporter: Daniel Barclay (Drill)
Assignee: Aman Sinha


When a compound name (e.g., {{schema.subschema.table}}) is invalid, the error 
message doesn't say where it went bad (e.g., which part referred to something 
unknown and/or non-existent).  For example, see the query and the "VALIDATION 
ERROR ..." line in the following:

{noformat}
0: jdbc:drill:zk=local> SELECT * FROM `dfs.NoSuchSchema`.`empty_directory`;
Sep 20, 2015 10:38:24 PM org.apache.calcite.sql.validate.SqlValidatorException 

SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 
'dfs.NoSuchSchema.empty_directory' not found
Sep 20, 2015 10:38:24 PM org.apache.calcite.runtime.CalciteException 
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 
15 to line 1, column 32: Table 'dfs.NoSuchSchema.empty_directory' not found
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 32: Table 
'dfs.NoSuchSchema.empty_directory' not found


[Error Id: 2a298c8e-2923-4744-8f78-b0cf36c83799 on dev-linux2:31010] 
(state=,code=0)
{noformat}

A better error message would say that {{dfs.NoSuchSchema}} was not found (or 
that no {{NoSuchSchema}} was found in schema {{dfs}}).
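One way to produce such a message is to resolve the compound name part by part, left to right, and report the first prefix that fails to resolve. A sketch of that idea (the known-prefix set stands in for Drill's real schema resolution and is purely hypothetical):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CompoundNameDiagnosis {
    // Walks the parts of a compound name and names the first prefix that
    // cannot be resolved, instead of reporting the entire compound name.
    static String diagnose(List<String> parts, Set<String> knownPrefixes) {
        StringBuilder prefix = new StringBuilder();
        for (String part : parts) {
            if (prefix.length() > 0) prefix.append('.');
            prefix.append(part);
            if (!knownPrefixes.contains(prefix.toString())) {
                return "Schema or table '" + prefix + "' not found";
            }
        }
        return "resolved";
    }

    public static void main(String[] args) {
        Set<String> known = new HashSet<>(Arrays.asList("dfs", "dfs.tmp"));
        System.out.println(diagnose(
            Arrays.asList("dfs", "NoSuchSchema", "empty_directory"), known));
        // -> Schema or table 'dfs.NoSuchSchema' not found
    }
}
```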







[jira] [Commented] (DRILL-3805) Empty JSON on LHS UNION non empty JSON on RHS must return results

2015-09-18 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14876616#comment-14876616
 ] 

Daniel Barclay (Drill) commented on DRILL-3805:
---

Seems likely to be related to DRILL-2288 and DRILL-3659.
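The requested behavior is that an empty left input should contribute nothing while the right input's rows (and schema) pass through. At the row level that reduces to a simple concatenation where an empty side is harmless; a schema-agnostic sketch of the semantics (not Drill's operator code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class UnionFallback {
    // UNION ALL of two inputs where an empty left side must not fail:
    // the empty side simply contributes no rows, and the result carries
    // the non-empty side's data.
    static List<String> unionAll(List<String> left, List<String> right) {
        List<String> out = new ArrayList<>(left); // empty left adds nothing
        out.addAll(right);
        return out;
    }

    public static void main(String[] args) {
        List<String> left = Collections.emptyList();          // empty.json
        List<String> right = Arrays.asList("k1", "k2", "k3"); // fewRows.json
        System.out.println(unionAll(left, right)); // -> [k1, k2, k3]
    }
}
```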

> Empty JSON on LHS UNION non empty JSON on RHS must return results
> -
>
> Key: DRILL-3805
> URL: https://issues.apache.org/jira/browse/DRILL-3805
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
>Priority: Minor
>
> When the input on LHS of UNION operator is empty and there is non empty input 
> on RHS of Union, we need to return the data from the RHS. Currently we return 
> SchemaChangeException.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> select key1 from `empty.json` UNION select key1 
> from `fewRows.json`;
> Error: SYSTEM ERROR: SchemaChangeException: The left input of Union-All 
> should not come from an empty data source
> Fragment 0:0
> [Error Id: f0fcff87-f470-46a8-9733-316b7da1a87f on centos-02.qa.lab:31010] 
> (state=,code=0)
> {code}





[jira] [Commented] (DRILL-2451) JDBC : Connection.commit throws an UnsupportedOperationException

2015-09-17 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804887#comment-14804887
 ] 

Daniel Barclay (Drill) commented on DRILL-2451:
---

See the documentation (the Javadoc comment on 
DrillConnection.setAutoCommit(boolean)).
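Per the JDBC contract quoted in the issue, {{commit()}} is only meaningful when auto-commit is disabled, so a defensive client can guard the call rather than trip over the exception. A sketch of such a guard; the proxy-backed fake {{Connection}} exists only to make the example self-contained and mimics a driver whose {{commit()}} throws:

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

public class SafeCommit {
    // Calls commit() only when auto-commit is off, as the JDBC spec
    // requires; in auto-commit mode there is nothing pending to commit.
    static boolean commitIfNeeded(Connection conn) throws SQLException {
        if (conn.getAutoCommit()) {
            return false; // spec: calling commit() here would be an error
        }
        conn.commit();
        return true;
    }

    // A minimal stand-in Connection for demonstration: auto-commit is on,
    // and commit() throws (as the driver described in this issue did).
    static Connection fakeAutoCommitConnection() {
        return (Connection) Proxy.newProxyInstance(
            Connection.class.getClassLoader(),
            new Class<?>[] {Connection.class},
            (proxy, method, args) -> {
                if (method.getName().equals("getAutoCommit")) return true;
                if (method.getName().equals("commit"))
                    throw new UnsupportedOperationException("commit");
                return null;
            });
    }

    public static void main(String[] args) throws SQLException {
        // The guard skips commit() entirely, so no exception is thrown.
        System.out.println(commitIfNeeded(fakeAutoCommitConnection())); // false
    }
}
```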

> JDBC : Connection.commit throws an UnsupportedOperationException
> 
>
> Key: DRILL-2451
> URL: https://issues.apache.org/jira/browse/DRILL-2451
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Rahul Challapalli
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=e92db23
> Currently drill throws an UnsupportedOperationException when we call "commit" 
> on the Connection object. 
> I am not exactly sure what "commit" should do in the context of drill. But at 
> the very least doing nothing is better than throwing the above exception 
> since a few analytic tools might be using this method.
> Below is the documentation from the JDBC spec :
> {code}
> void commit() throws SQLException - Makes all changes made since the previous 
> commit/rollback permanent and releases any database locks currently held by 
> this Connection object. This method should be used only when auto-commit mode 
> has been disabled.
> Throws:
> SQLException - if a database access error occurs, this method is called while 
> participating in a distributed transaction, if this method is called on a 
> closed connection or this Connection object is in auto-commit mode
> {code}





[jira] [Commented] (DRILL-2451) JDBC : Connection.commit throws an UnsupportedOperationException

2015-09-17 Thread Daniel Barclay (Drill) (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14804886#comment-14804886
 ] 

Daniel Barclay (Drill) commented on DRILL-2451:
---

See the documentation (the Javadoc comment on DrillConnection.commit()).

> JDBC : Connection.commit throws an UnsupportedOperationException
> 
>
> Key: DRILL-2451
> URL: https://issues.apache.org/jira/browse/DRILL-2451
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Rahul Challapalli
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=e92db23
> Currently drill throws an UnsupportedOperationException when we call "commit" 
> on the Connection object. 
> I am not exactly sure what "commit" should do in the context of drill. But at 
> the very least doing nothing is better than throwing the above exception 
> since a few analytic tools might be using this method.
> Below is the documentation from the JDBC spec :
> {code}
> void commit() throws SQLException - Makes all changes made since the previous 
> commit/rollback permanent and releases any database locks currently held by 
> this Connection object. This method should be used only when auto-commit mode 
> has been disabled.
> Throws:
> SQLException - if a database access error occurs, this method is called while 
> participating in a distributed transaction, if this method is called on a 
> closed connection or this Connection object is in auto-commit mode
> {code}





[jira] [Resolved] (DRILL-2482) JDBC : calling getObject when the actual column type is 'NVARCHAR' results in NoClassDefFoundError

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-2482.
---
Resolution: Fixed

> JDBC : calling getObject when the actual column type is 'NVARCHAR' results in 
> NoClassDefFoundError
> --
>
> Key: DRILL-2482
> URL: https://issues.apache.org/jira/browse/DRILL-2482
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Rahul Challapalli
>Assignee: Daniel Barclay (Drill)
>Priority: Blocker
> Fix For: 1.2.0
>
>
> git.commit.id.abbrev=7b4c887
> I tried to call getObject(i) on a column which is of type varchar, drill 
> failed with the below error :
> {code}
> Exception in thread "main" java.lang.NoClassDefFoundError: 
> org/apache/hadoop/io/Text
>   at 
> org.apache.drill.exec.vector.VarCharVector$Accessor.getObject(VarCharVector.java:407)
>   at 
> org.apache.drill.exec.vector.NullableVarCharVector$Accessor.getObject(NullableVarCharVector.java:386)
>   at 
> org.apache.drill.exec.vector.accessor.NullableVarCharAccessor.getObject(NullableVarCharAccessor.java:98)
>   at 
> org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:137)
>   at 
> org.apache.drill.jdbc.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:136)
>   at 
> net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
>   at Dummy.testComplexQuery(Dummy.java:94)
>   at Dummy.main(Dummy.java:30)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.Text
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   ... 8 more
> {code}
> When the underlying type is a primitive, the getObject call succeeds
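Missing-dependency failures like the one above only surface on first use of the affected code path. A small diagnostic that probes for the class up front (the class name is taken from the stack trace in this issue; whether it is present depends entirely on the runtime classpath):

```java
public class ClasspathProbe {
    // Reports whether a dependency class is loadable, so a missing jar
    // surfaces as a clear message instead of a NoClassDefFoundError
    // deep inside a getObject() call later on.
    static String probe(String className) {
        try {
            Class.forName(className);
            return "present: " + className;
        } catch (ClassNotFoundException e) {
            return "MISSING from classpath: " + className;
        }
    }

    public static void main(String[] args) {
        System.out.println(probe("org.apache.hadoop.io.Text"));
        System.out.println(probe("java.lang.String")); // always present
    }
}
```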





[jira] [Resolved] (DRILL-3658) Missing org.apache.hadoop in the JDBC jar

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-3658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) resolved DRILL-3658.
---
Resolution: Fixed

> Missing org.apache.hadoop in the JDBC jar
> -
>
> Key: DRILL-3658
> URL: https://issues.apache.org/jira/browse/DRILL-3658
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - JDBC
>Reporter: Piotr Sokólski
>Assignee: Daniel Barclay (Drill)
>Priority: Blocker
> Fix For: 1.2.0
>
>
> java.lang.ClassNotFoundException: local.org.apache.hadoop.io.Text is thrown 
> while trying to access a text field from a result set returned from Drill 
> while using the drill-jdbc-all.jar





[jira] [Updated] (DRILL-2625) org.apache.drill.common.StackTrace should follow standard stacktrace format

2015-09-15 Thread Daniel Barclay (Drill) (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Barclay (Drill) updated DRILL-2625:
--
Description: 
org.apache.drill.common.StackTrace uses a different textual format than JDK's 
standard format for stack traces.

It should probably use the standard format so that its stack trace output can 
be used by tools that already can parse the standard format to provide 
functionality such as displaying the corresponding source.

(After correcting for DRILL-2624, StackTrace formats stack traces like this:

org.apache.drill.common.StackTrace.&lt;init&gt;:1
org.apache.drill.exec.server.Drillbit.run:20
org.apache.drill.jdbc.DrillConnectionImpl.&lt;init&gt;:232

The normal form is like this:
{noformat}
at 
org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
at 
org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
at com.google.common.io.Closeables.close(Closeables.java:77)
{noformat}
)
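The normal form shown above is exactly what {{Throwable.printStackTrace}} emits, and producing it from {{StackTraceElement}} values is one line per frame, since {{StackTraceElement.toString()}} already yields {{pkg.Class.method(File.java:NN)}}. A sketch of the expected format (not Drill's {{StackTrace}} class itself):

```java
public class StandardTraceFormat {
    // Formats frames the way the JDK does: "\tat pkg.Class.method(File.java:NN)",
    // which IDE consoles and log-parsing tools already understand.
    static String format(StackTraceElement[] frames) {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement frame : frames) {
            sb.append("\tat ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StackTraceElement[] frames = {
            new StackTraceElement("org.apache.drill.exec.memory.TopLevelAllocator",
                                  "close", "TopLevelAllocator.java", 162),
        };
        System.out.print(format(frames));
        // prints a tab, then:
        // at org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
    }
}
```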



  was:
org.apache.drill.common.StackTrace uses a different textual format than JDK's 
standard format for stack traces.

It should probably use the standard format so that its stack trace output can 
be used by tools that already can parse the standard format to provide 
functionality such as displaying the corresponding source.

(After correcting for DRILL-2624, StackTrace formats stack traces like this:

org.apache.drill.common.StackTrace.&lt;init&gt;:1
org.apache.drill.exec.server.Drillbit.run:20
org.apache.drill.jdbc.DrillConnectionImpl.&lt;init&gt;:232

The normal form is like this:
at 
org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
at 
org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
at com.google.common.io.Closeables.close(Closeables.java:77)
)




> org.apache.drill.common.StackTrace should follow standard stacktrace format
> ---
>
> Key: DRILL-2625
> URL: https://issues.apache.org/jira/browse/DRILL-2625
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.8.0
>Reporter: Daniel Barclay (Drill)
>Assignee: Chris Westin
> Fix For: 1.2.0
>
>
> org.apache.drill.common.StackTrace uses a different textual format than JDK's 
> standard format for stack traces.
> It should probably use the standard format so that its stack trace output can 
> be used by tools that already can parse the standard format to provide 
> functionality such as displaying the corresponding source.
> (After correcting for DRILL-2624, StackTrace formats stack traces like this:
> org.apache.drill.common.StackTrace.&lt;init&gt;:1
> org.apache.drill.exec.server.Drillbit.run:20
> org.apache.drill.jdbc.DrillConnectionImpl.&lt;init&gt;:232
> The normal form is like this:
> {noformat}
>   at 
> org.apache.drill.exec.memory.TopLevelAllocator.close(TopLevelAllocator.java:162)
>   at 
> org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:75)
>   at com.google.common.io.Closeables.close(Closeables.java:77)
> {noformat}
> )




