[jira] [Resolved] (DRILL-5440) Sqlline is not connecting to Hive database

2017-04-26 Thread Abhishek Girish (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish resolved DRILL-5440.

Resolution: Invalid

> Sqlline is not connecting to Hive database
> --
>
> Key: DRILL-5440
> URL: https://issues.apache.org/jira/browse/DRILL-5440
> Project: Apache Drill
>  Issue Type: Task
>  Components: Functions - Drill
>Affects Versions: 1.10.0
> Environment: OS: Redhat Linux 6.7, HDP 2.5.3, Kerberos enabled, 
> Hardware: VmWare
>Reporter: Parag Darji
>Priority: Minor
>  Labels: newbie
> Fix For: 1.10.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Sqlline is not showing the Hive database.
> {code}
> sqlline -u "jdbc:drill:schema:hiveremote;drillbit=host1.fqdn;auth=kerberos"
> apache drill 1.10.0
> "what ever the mind of man can conceive and believe, drill can query"
> 0: jdbc:drill:schema:hiveremote> show schemas;
> +---------------------+
> | SCHEMA_NAME         |
> +---------------------+
> | INFORMATION_SCHEMA  |
> | sys                 |
> +---------------------+
> 2 rows selected (2.074 seconds)
> 0: jdbc:drill:schema:hiveremote> show databases;
> +---------------------+
> | SCHEMA_NAME         |
> +---------------------+
> | INFORMATION_SCHEMA  |
> | sys                 |
> +---------------------+
> 2 rows selected (0.226 seconds)
> 0: jdbc:drill:schema:hiveremote>  !connect  jdbc:mysql://hive.fqdn:10010/hive
> Enter username for jdbc:mysql://hive.fqdn:10010/hive:
> {code}
> Kerberos is enabled, so I tried the command below, but it gives an
> authentication error.
> {code}
> sqlline --maxWidth=1 -u 
> "jdbc:drill:schema=hive:drillbit=host1.fqdn;auth=kerberos;principal=drill/host1.f...@lab.com;keytab:/home/drill/.keytab/drill.keytab"
> error: Error: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.NonTransientRpcException: 
> javax.security.sasl.SaslException: Authentication failed: Authentication 
> failed. Incorrect credentials? [Caused by javax.security.sasl.SaslException: 
> Authentication failed. Incorrect credentials?] (state=,code=0)
> java.sql.SQLException: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.NonTransientRpcException: 
> javax.security.sasl.SaslException: Authentication failed: Authentication 
> failed. Incorrect credentials? [Caused by javax.security.sasl.SaslException: 
> Authentication failed. Incorrect credentials?]
> {code}
> When I try to use a username and password, it doesn't return anything; it 
> just sits there for hours.
> {code}
> drill@:/home/drill>  sqlline -u "jdbc:drill:drillbit=host1.fqdn;auth=kerberos"
> apache drill 1.10.0
> "drill baby drill"
> 0: jdbc:drill:drillbit=> !connect
> Usage: connect <url> <username> <password> [driver]
> 0: jdbc:drill:drillbit=> !connect  jdbc:mysql://hivenode.fqdn:10010/hive test 
> test
> {code}
> The command below doesn't return anything:
> {code}
> sqlline -u 
> "jdbc:mysql://hive.fqdn:10010/default;zk=host1.fqdn:2181;service_name=drill;service_host=host2.fqdn;keytab=/home/drill/.keytab/drill.keytab";user="drill/host2.f...@lab.com"
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5448) ODBC client crashed when user does not have access to text formatted hive table

2017-04-26 Thread Krystal (JIRA)
Krystal created DRILL-5448:
--

 Summary: ODBC client crashed when user does not have access to 
text formatted hive table
 Key: DRILL-5448
 URL: https://issues.apache.org/jira/browse/DRILL-5448
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - C++
Affects Versions: 1.10.0, 1.11.0
Reporter: Krystal


While many clients were connecting to the drillbit, the ODBC client crashed 
with a "Segmentation Fault" while executing a SELECT on a text-format Hive 
table that the user does not have access to.





[jira] [Created] (DRILL-5447) Managed External Sort : Unable to allocate sv2 vector

2017-04-26 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-5447:


 Summary: Managed External Sort : Unable to allocate sv2 vector
 Key: DRILL-5447
 URL: https://issues.apache.org/jira/browse/DRILL-5447
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.10.0
Reporter: Rahul Challapalli
Assignee: Paul Rogers


git.commit.id.abbrev=3e8b01d

Dataset :
{code}
Every record contains a repeated type with 2000 elements. 
The repeated type contains varchars of length 250 for the first 2000 records 
and single-character strings for the next 2000 records.
The above pattern is repeated a few times.
{code}
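For anyone trying to reproduce this, a dataset of the shape described above could be generated roughly as follows. This is only a sketch under stated assumptions: the field names `id` and `str_list` are taken from the query below, but the class name, the way `id` values are assigned (a simple counter here) and the filler character are hypothetical, and the actual flatten-large-small.json may differ.

```java
import java.io.PrintWriter;
import java.util.Arrays;

// Hypothetical generator; not the script used to build the real test file.
final class FlattenDatasetSketch {

    /** Builds one JSON record whose str_list holds `elements` strings of `valueLength` chars. */
    static String record(int id, int elements, int valueLength) {
        char[] chars = new char[valueLength];
        Arrays.fill(chars, 'a'); // filler character is an arbitrary choice
        String value = new String(chars);
        StringBuilder json = new StringBuilder();
        json.append("{\"id\":").append(id).append(",\"str_list\":[");
        for (int i = 0; i < elements; i++) {
            if (i > 0) {
                json.append(',');
            }
            json.append('"').append(value).append('"');
        }
        return json.append("]}").toString();
    }

    /** Writes `cycles` repetitions of: wide records (250 chars), then narrow records (1 char). */
    static void writeDataset(PrintWriter out, int cycles, int recordsPerPhase, int elements) {
        int id = 0; // assumption: ids are a running counter
        for (int c = 0; c < cycles; c++) {
            for (int r = 0; r < recordsPerPhase; r++) {
                out.print(record(id++, elements, 250));
                out.print('\n');
            }
            for (int r = 0; r < recordsPerPhase; r++) {
                out.print(record(id++, elements, 1));
                out.print('\n');
            }
        }
    }

    public static void main(String[] args) {
        // Tiny parameters for a quick look; the report describes 2000 records
        // per phase and 2000 elements per record.
        PrintWriter out = new PrintWriter(System.out);
        writeDataset(out, 1, 2, 3);
        out.flush();
    }
}
```

With the sizes from the description (2000 records per phase, 2000 elements each), one wide phase alone is about 1 GB of string data, so the output is streamed record by record rather than built in memory.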

The query below fails:
{code}
ALTER SESSION SET `exec.sort.disable_managed` = false;
alter session set `planner.width.max_per_node` = 1;
alter session set `planner.disable_exchanges` = true;
alter session set `planner.width.max_per_query` = 1;
select count(*) from (select * from (select id, flatten(str_list) str from 
dfs.`/drill/testdata/resource-manager/flatten-large-small.json`) d order by 
d.str) d1 where d1.id=0;

Error: RESOURCE ERROR: Unable to allocate sv2 buffer

Fragment 0:0

[Error Id: 9e45c293-ab26-489d-a90e-25da96004f15 on qa-node190.qa.lab:31010] 
(state=,code=0)
{code}

Exception from the logs
{code}
[Error Id: 9e45c293-ab26-489d-a90e-25da96004f15 ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
 ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.newSV2(ExternalSortBatch.java:1463)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.makeSelectionVector(ExternalSortBatch.java:799)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch(ExternalSortBatch.java:856)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch(ExternalSortBatch.java:618)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:660)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:93)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecor

[jira] [Created] (DRILL-5446) Offset Vector in VariableLengthVectors may waste up to 256KB per value vector

2017-04-26 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-5446:
---

 Summary: Offset Vector in VariableLengthVectors may waste up to 
256KB per value vector
 Key: DRILL-5446
 URL: https://issues.apache.org/jira/browse/DRILL-5446
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Relational Operators
Affects Versions: 1.10.0
Reporter: Boaz Ben-Zvi
Assignee: Boaz Ben-Zvi
 Fix For: 1.11.0


In exec/vector/src/main/codegen/templates/VariableLengthVectors.java, the 
implementation uses an "offset vector" to record the BEGINNING of each 
variable-length element. To find an element's length (i.e., its END), the code 
must look at the offset of the FOLLOWING element. 
  This requires the "offset vector" to have ONE MORE entry than the total 
number of elements, so that the END of the LAST element can be found.
  Some places in the code (e.g., the hash table) use the maximum number of 
elements, 64K (= 65536). Each entry in the "offset vector" is a 4-byte UInt4, 
so the vector would appear to need 256KB. 
  However, because of that "ONE MORE" entry, the code in this case allocates 
for 65537 entries and, after rounding up to the next power of 2, allocates 
512KB, of which half is unused. 
  (This applies to every varchar value vector in every batch; e.g., in the QA 
test Functional/aggregates/tpcds_variants/text/aggregate25.q, which has 10 
key columns, each hash-table batch wastes 2.5MB!)
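The arithmetic above can be checked with a small sketch. It only models the rounding described in this issue (4-byte entries, allocation rounded up to the next power of 2); the class and method names are hypothetical and this is not Drill's allocator code.

```java
// Sketch of the allocation arithmetic described in this issue; not Drill's allocator.
final class OffsetVectorWaste {
    static final int OFFSET_WIDTH_BYTES = 4; // each offset entry is a 4-byte UInt4

    /** Rounds a positive byte count up to the next power of two. */
    static long roundUpToPowerOfTwo(long bytes) {
        long highest = Long.highestOneBit(bytes);
        return highest == bytes ? bytes : highest << 1;
    }

    /** Bytes allocated for an offset vector holding `entries` offsets. */
    static long allocatedBytes(int entries) {
        return roundUpToPowerOfTwo((long) entries * OFFSET_WIDTH_BYTES);
    }

    public static void main(String[] args) {
        int values = 65536;
        long withExtraEntry = allocatedBytes(values + 1); // BEGINNING offsets: one extra entry
        long exactFit = allocatedBytes(values);           // one entry per value
        long wasted = withExtraEntry - exactFit;
        System.out.println("allocated with 65537 entries: " + withExtraEntry / 1024 + " KB"); // 512 KB
        System.out.println("needed for 65536 entries:     " + exactFit / 1024 + " KB");       // 256 KB
        System.out.println("wasted per value vector:      " + wasted / 1024 + " KB");         // 256 KB
        System.out.println("wasted for 10 key columns:    " + 10 * wasted / 1024 + " KB");    // 2560 KB
    }
}
```

The last line is where the 2.5MB figure for ten key columns comes from: 10 x 256KB = 2560KB.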

Possible fix: change the logic in VariableLengthVectors.java to keep the END 
point of each variable-length element. The first element's beginning is always 
ZERO, so it need not be stored.
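A minimal sketch of the END-offset idea, using a plain int array rather than Drill's vector classes. The class and method names are hypothetical; the point is only that N values need N entries, with the start of value 0 implied.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: END offsets in a plain int[]; not the VariableLengthVectors template.
final class EndOffsets {

    /** Entry i is the byte position just past value i, so N values need exactly N entries. */
    static int[] endOffsets(String[] values) {
        int[] ends = new int[values.length];
        int position = 0;
        for (int i = 0; i < values.length; i++) {
            position += values[i].getBytes(StandardCharsets.UTF_8).length;
            ends[i] = position;
        }
        return ends;
    }

    /** The start of value 0 is always zero; every other start is the previous END. */
    static int start(int[] ends, int index) {
        return index == 0 ? 0 : ends[index - 1];
    }

    static int length(int[] ends, int index) {
        return ends[index] - start(ends, index);
    }

    public static void main(String[] args) {
        int[] ends = endOffsets(new String[] {"ab", "", "cde"});
        for (int i = 0; i < ends.length; i++) {
            System.out.println("value " + i + ": start=" + start(ends, i)
                    + " length=" + length(ends, i));
        }
    }
}
```

The trade-off is a branch (or equivalent special case) when reading the start of the first value, in exchange for dropping the extra entry that pushes a 65536-value vector past a power-of-2 boundary.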

 





[GitHub] drill pull request #820: DRILL-5391: CTAS: make folder and file permission c...

2017-04-26 Thread arina-ielchiieva
GitHub user arina-ielchiieva opened a pull request:

https://github.com/apache/drill/pull/820

DRILL-5391: CTAS: make folder and file permission configurable

1. Added a new configuration option, exec.persistent_table.umask, which 
defaults to 002. Users can modify this option at the session or system level. 
If the umask is set incorrectly, the default umask (002) will be used and an 
error will be logged.
2. Refactored `StorageStrategy` to use umask.
3. Added appropriate unit tests.
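
As a rough illustration of the behaviour described in item 1, the sketch below parses an octal umask string, falls back to 002 on bad input, and masks a base permission. It is not the `StorageStrategy` code from this pull request: the class and method names are hypothetical, and the base permissions of 777 for folders and 666 for files are the usual POSIX convention, assumed here rather than taken from the patch.

```java
// Hypothetical sketch of umask handling; not the StorageStrategy implementation.
final class UmaskSketch {
    static final int DEFAULT_UMASK = 0002; // octal literal, matches the documented default

    /** Parses an octal umask such as "002"; logs and returns the default if it is invalid. */
    static int parseUmask(String value) {
        try {
            int umask = Integer.parseInt(value, 8);
            if (umask < 0 || umask > 0777) {
                throw new NumberFormatException("umask out of range: " + value);
            }
            return umask;
        } catch (NumberFormatException e) {
            System.err.println("Invalid umask '" + value + "', using default 002");
            return DEFAULT_UMASK;
        }
    }

    /** Clears the permission bits that are set in the umask. */
    static int applyUmask(int basePermission, int umask) {
        return basePermission & ~umask;
    }

    public static void main(String[] args) {
        int umask = parseUmask("002");
        // Assumed base permissions: 777 for folders, 666 for files.
        System.out.println("folder: " + Integer.toOctalString(applyUmask(0777, umask))); // 775
        System.out.println("file:   " + Integer.toOctalString(applyUmask(0666, umask))); // 664
    }
}
```

Under these assumptions, a stricter umask such as 022 would yield 755 for folders and 644 for files.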

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arina-ielchiieva/drill DRILL-5391

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/820.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #820


commit 3ed005864cdac8d7d04d61307fe1493b38e1ea97
Author: Arina Ielchiieva 
Date:   2017-04-26T13:27:19Z

DRILL-5391: CTAS: make folder and file permission configurable




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: Jars for BaseTestQuery

2017-04-26 Thread François Méthot
Hi Paul,

  Thanks for your detailed comments. I am keeping your email as a reference.

We tend to favor system-level tests first because they are easier to
maintain when refactoring code. After trying the system-level tests, we
tried a pure JUnit approach and realized that some of the
builders/constructors/factories for objects (DrillBuf, BufferLedger) that
are required by some functions are package-private.

I will revisit our test revamp based on your comments once 1.11 is released
(DRILL-5323 and DRILL-5318).


François





On Thu, Apr 20, 2017 at 7:11 PM, Paul Rogers  wrote:

> Hi François,
>
> You raised two issues, I’ll address both.
>
> First, it is true that Maven’s model is that test code is not packaged, it
> is visible only to the maven module in which the test code resides. As you
> point out, this is an inconvenience in multiple-module projects such as
> Drill. Drill gets around the problem by minimizing unit testing; most
> testing outside of java-exec is done via system tests: running all of Drill
> and throwing queries at it.
>
> Drill, at present, has no support for reusing tests outside of their home
> module. It would be great if someone volunteers to solve the problem. Here
> are two references: [1], [2]
>
> Second, you mentioned you want to unit test a storage plugin. Here, it is
> necessary to understand how Drill’s usage of the term “unit test” differs
> from common industry usage. In the industry, a “unit test” would be one
> where you test your reader in isolation. Specifically, you’d give it an
> operator definition (the so-called “physical operator” or “sub scan POP” in
> Drill). You’d then grab data and verify that the returned data batches are
> correct.
>
> Similarly, for the planning side of the plugin, you’d let Drill plan the
> query, then verify that the plan JSON is as you expect it to be.
>
> Drill, however, uses “unit test” to mean a system-level test written using
> JUnit. That is, most Drill tests run a query and examine the results. The
> BaseTestQuery class you mentioned is a JUnit test, but it is a system level
> test: it starts up an embedded Drillbit to which you can send queries. It
> has helper classes that let you examine results of the entire query (not
> just of your reader.) If you construct the correct SQL, your query can
> include nothing but a scan and the screen operator. Still, this approach
> introduces many layers between your test and your reader. (I call it trying
> to fix a watch while wearing oven mitts.)
>
> There are two recent additions to Drill’s test tools that may be of
> interest. First, we have a simpler way to run system tests based on a
> “cluster test fixture”. BaseTestQuery provides very poor control over
> boot-time configuration, but the test fixture gives you much better
> control. Plus, the new fixture lets you reuse the “TestBuilder” classes
> from BaseTestQuery while also providing very easy ways to run queries, time
> results and so on. Check out the package-info in [3] and the example test
> in [4]. Unfortunately, this code has the same Maven packaging issues as
> described above.
>
> Of course, even the simplified test fixture is still a system test. We are
> in the process of checking in a new set of “sub-operator” unit test
> fixtures that enable true unit tests: you test only your code. See
> DRILL-5323 and DRILL-5318. Those PRs will be followed by a complete set of
> tests for the sort operator. I can point you to my personal dev branch if
> you want a preview.
>
> With these tools, you can set up to run just your own reader, then set up
> expected results and validate that things work as expected. Unit tests let
> you verify behavior at a very fine grain: verify each kind of column data
> type, verify filters you wish to push and so on. This is important because
> Drill suffers from a very large number of minor bugs: bugs that are hard to
> find using system tests, but which become obvious when using true unit
> tests.
>
> The in-flight version of the test framework was built for an “internal”
> operator (the sort.) Some work will be required to extend the tests to work
> with a reader (and to refactor the reader so it does not depend on a
> running Drillbit.) This is a worthwhile effort that I can help with if you
> want to go this route.
>
> Thanks,
>
> - Paul
>
> [1] http://stackoverflow.com/questions/14722873/sharing-src-test-classes-between-modules-in-a-multi-module-maven-project
> [2] http://maven.apache.org/guides/mini/guide-attached-tests.html
> [3] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/package-info.java
> [4] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/ExampleTest.java
>
>
> > On Apr 20, 2017, at 11:23 AM, François Méthot 
> wrote:
> >
> > Hi,
> >
> >   I need to develop unit test of our storage plugins and if possible I
> > would like to borrow from the tests done in "TestCsvHeader.java" and

[GitHub] drill issue #804: DRILL-5405: Add missing operator types

2017-04-26 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/804
  
@sudheeshkatkam thanks for pointing out the discussion. I will close 
this PR, but I'll update the Jira to reflect the discussion points.




[GitHub] drill pull request #804: DRILL-5405: Add missing operator types

2017-04-26 Thread arina-ielchiieva
Github user arina-ielchiieva closed the pull request at:

https://github.com/apache/drill/pull/804

