Re: Is it possible to delegate data joins and filtering to the datasource ?

2017-04-12 Thread Muhammad Gelbana
I have done it. Thanks a lot Weijie and all of you for your time.

*-*
*Muhammad Gelbana*
http://www.linkedin.com/in/mgelbana

On Thu, Apr 6, 2017 at 3:15 PM, weijie tong  wrote:

> Some tips:
> 1. You need to know how the RexInputRef indexes of the JoinRel relate to
> the indexes of its inputs. For example:
>
> join (1,2,3,4,5)
>
> left input (1,2,3), right input (1,2)
>
> 1,2,3 ===> left input (1,2,3)
>
> 4,5 ===> right input (1,2)
>
> 2. Capture this index mapping when you visit the JoinRelNode in your own
> rule (CartesianProductJoinRule), and store the mapping in your own
> BGroupScan (the naming convention from my last example).
> The mapping structure could be: destination index -> (source
> ScanRel : source index).
> For the example in tip 1, the structure would be:
> 1 ==> (left scan1 : 1)
> 2 ==> (left scan1 : 2)
> 3 ==> (left scan1 : 3)
> 4 ==> (right scan2 : 1)
> 5 ==> (right scan2 : 2)
>
> 3. Define another rule (matching the Project RelNode) that depends on the
> index mapping from the previous step. In this rule, take each index of the
> final output project and look up its mapped index in the mapping structure;
> from that you can find the final output column name and its table.
>
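
The index bookkeeping described in these tips can be sketched in plain Java. This is a minimal illustration, not Drill or Calcite code: `JoinIndexMapping`, `Source`, and the `left`/`right` labels are hypothetical names, and it assumes the join's output row is the left input's fields followed by the right input's fields. Calcite's RexInputRef indexes are zero-based, so the sketch counts from 0 where the example above counts from 1.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinIndexMapping {
    /** Hypothetical record: which input a join output field comes from, and its index there. */
    public static final class Source {
        public final String scan;  // assumed label for the input scan ("left" or "right")
        public final int index;    // zero-based field index within that input

        public Source(String scan, int index) {
            this.scan = scan;
            this.index = index;
        }

        @Override
        public String toString() {
            return scan + ":" + index;
        }
    }

    /**
     * Builds the mapping "join output index -> (input scan, input index)",
     * assuming the join output is the left fields followed by the right fields.
     */
    public static List<Source> build(int leftFieldCount, int rightFieldCount) {
        List<Source> mapping = new ArrayList<>();
        for (int i = 0; i < leftFieldCount; i++) {
            mapping.add(new Source("left", i));
        }
        for (int i = 0; i < rightFieldCount; i++) {
            mapping.add(new Source("right", i));
        }
        return mapping;
    }

    public static void main(String[] args) {
        // Left input has 3 fields, right input has 2, as in the example above.
        List<Source> mapping = build(3, 2);
        for (int i = 0; i < mapping.size(); i++) {
            System.out.println(i + " ==> " + mapping.get(i));
        }
    }
}
```

A rule matching the Project node could then resolve each projected index with `mapping.get(index)` to learn which scan and which field it refers to.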
>
>
>
> On Tue, Apr 4, 2017 at 1:51 AM, Muhammad Gelbana 
> wrote:
>
> > I've succeeded, theoretically, in what I wanted to do, because I had to
> > send the selected columns manually to my datasource. Would someone please
> > tell me how I can identify the selected columns in the join? I searched a
> > lot without success.
> >
> > *-*
> > *Muhammad Gelbana*
> > http://www.linkedin.com/in/mgelbana
> >
> > On Sat, Apr 1, 2017 at 1:43 AM, Muhammad Gelbana 
> > wrote:
> >
> > > So I intend to use this constructor for the new *RelNode*:
> > > *org.apache.drill.exec.planner.logical.DrillScanRel.DrillScanRel(RelOptCluster,
> > > RelTraitSet, RelOptTable, GroupScan, RelDataType, List)*
> > >
> > > How can I provide its parameters?
> > >
> > > 1. *RelOptCluster*: Can I pass *DrillJoinRel.getCluster()*?
> > >
> > > 2. *RelTraitSet*: Can I pass *DrillJoinRel.getTraitSet()*?
> > >
> > > 3. *RelOptTable*: I assume I can use this factory method
> > >    (*org.apache.calcite.prepare.RelOptTableImpl.create(RelOptSchema,
> > >    RelDataType, Table, Path)*). Any hints on how I can provide these
> > >    parameters too? Should I just go ahead and manually create a new
> > >    instance of each parameter?
> > >
> > > 4. *GroupScan*: I understand I have to create a new implementation
> > >    class for this one, so no questions here so far.
> > >
> > > 5. *RelDataType*: This one is confusing. I understand that for
> > >    *DrillJoinRel.transformTo(newRel)* to work, I have to provide a
> > >    *newRel* instance that has a *RelDataType* instance with the same
> > >    number of fields and compatible types (i.e. this is mandated by
> > >    *org.apache.calcite.plan.RelOptUtil.verifyTypeEquivalence(RelNode,
> > >    RelNode, Object)*). Why couldn't I provide a *RelDataType* with
> > >    a different set of fields? How can I resolve this?
> > >
> > > 6. *List*: I assume I can call this method and pass my
> > >    column names to it, one by one (i.e.
> > >    *org.apache.drill.common.expression.SchemaPath.getCompoundPath(String...)*).
> > >
> > > Thanks.
> > >
> > > *-*
> > > *Muhammad Gelbana*
> > > http://www.linkedin.com/in/mgelbana
> > >
> > > On Fri, Mar 31, 2017 at 1:59 PM, weijie tong 
> > > wrote:
> > >
> > >> Your code seems right; what remains is to implement
> > >> 'call.transformTo()'. I may not be able to express the remaining
> > >> details precisely; as @Paul Rogers mentioned, the plugin details are a
> > >> little trivial.
> > >>
> > >> 1. drillScanRel.getGroupScan.
> > >> 2. You need to extend AbstractGroupScan and let it hold some
> > >> information about your storage. Call this GroupScan AGroupScan; it
> > >> corresponds to one joined scan RelNode. Then define another GroupScan
> > >> called BGroupScan, which extends AGroupScan. The BGroupScan acts as an
> > >> aggregate container that holds the two joined AGroupScans.
> > >> 3. The new DrillScanRel has the same RowType as the JoinRel. The
> > >> requirements and examples of transforming between two different
> > >> RelNodes can be found in other code. This DrillScanRel's GroupScan is
> > >> the BGroupScan. This new DrillScanRel is the one passed to
> > >> `call.transformTo()`.
> > >>
> > >> Maybe the picture below will help you understand my idea.
> > >>
> > >> Suppose the initial RelNode tree is:
> > >>
> > >>                 ---Scan (AGroupScan)
> > >> Project-Join --|
> > >>                 ---Scan (AGroupScan)
> > >>
> > >>        |
> > >>       \|/
> > >>
> > >> After applying this rule, the final tree is:
> > >>
> > >> Project-Scan (BGroupScan (List(AGroupScan ,AGro
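
The AGroupScan/BGroupScan arrangement described in this message is essentially a composite. The plain-Java sketch below models only that shape. The classes are hypothetical stand-ins that do not extend Drill's AbstractGroupScan, and the `table` and `columns` fields are assumed purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class GroupScanSketch {
    /** Stand-in for the scan of a single table in the external storage. */
    public static class AGroupScan {
        private final String table;
        private final List<String> columns;

        public AGroupScan(String table, List<String> columns) {
            this.table = table;
            this.columns = new ArrayList<>(columns);
        }

        /** Columns this scan produces, qualified by table name. */
        public List<String> outputColumns() {
            List<String> out = new ArrayList<>();
            for (String column : columns) {
                out.add(table + "." + column);
            }
            return out;
        }
    }

    /** Stand-in for the scan that replaces the join: a container for both joined scans. */
    public static class BGroupScan extends AGroupScan {
        private final List<AGroupScan> children;

        public BGroupScan(AGroupScan left, AGroupScan right) {
            super("join", Collections.<String>emptyList());
            this.children = Arrays.asList(left, right);
        }

        /** Left columns followed by right columns, mirroring the join's row layout. */
        @Override
        public List<String> outputColumns() {
            List<String> out = new ArrayList<>();
            for (AGroupScan child : children) {
                out.addAll(child.outputColumns());
            }
            return out;
        }
    }

    public static void main(String[] args) {
        AGroupScan left = new AGroupScan("t1", Arrays.asList("a", "b", "c"));
        AGroupScan right = new AGroupScan("t2", Arrays.asList("x", "y"));
        System.out.println(new BGroupScan(left, right).outputColumns());
    }
}
```

Because the container reports the left columns followed by the right columns, it exposes as many fields as the join it replaces, which is the property the row type check relies on.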

[GitHub] drill pull request #815: DRILL-5424: Fix IOBE for reverse function

2017-04-12 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/815#discussion_r92811
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/expr/fn/impl/TestStringFunctions.java ---
@@ -273,4 +273,13 @@ public void testSplit() throws Exception {
         .run();
   }
 
+  @Test
+  public void testReverse() throws Exception {
+    testBuilder()
+      .sqlQuery("select reverse(reverse(n_comment)) words from cp.`tpch/nation.parquet`")
+      .unOrdered()
+      .sqlBaselineQuery("select n_comment words from cp.`tpch/nation.parquet`")
+      .build()
+      .run();
+  }
--- End diff --

Unfortunately, this unit test does not ensure that the reverse function works 
correctly.
Let's replace this unit test with two unit tests:
1. Test the reverse result for one row, as in the `testSplit()` test.
2. Test that reverse doesn't fail for a table with several long varchars.
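
One way to see why a round-trip assertion is weak: any function that is its own inverse passes it, including one that returns its input unchanged. The plain-Java sketch below illustrates this with a hypothetical `brokenReverse`. It does not use Drill's test framework, and all names in it are made up for the illustration.

```java
public class ReverseCheck {
    /** Reference implementation used only for this illustration. */
    public static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Hypothetical buggy implementation: returns the input unchanged. */
    public static String brokenReverse(String s) {
        return s;
    }

    public static void main(String[] args) {
        String input = "abc";

        // Round-trip check: the buggy implementation still satisfies it.
        System.out.println(brokenReverse(brokenReverse(input)).equals(input));

        // Fixed-expectation check: the buggy implementation is caught.
        System.out.println(brokenReverse(input).equals("cba"));

        // The reference implementation satisfies the fixed expectation.
        System.out.println(reverse(input).equals("cba"));
    }
}
```

Comparing against a known expected value for a single row, as the first suggested test does, closes this gap.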


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (DRILL-5431) Support SSL

2017-04-12 Thread Sudheesh Katkam (JIRA)
Sudheesh Katkam created DRILL-5431:
--

 Summary: Support SSL
 Key: DRILL-5431
 URL: https://issues.apache.org/jira/browse/DRILL-5431
 Project: Apache Drill
  Issue Type: New Feature
  Components: Client - Java, Client - ODBC
Reporter: Sudheesh Katkam


Support SSL between Drillbit and JDBC/ODBC drivers. Drill already supports 
HTTPS for web traffic.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5432) Want a memory format for PCAP files

2017-04-12 Thread Ted Dunning (JIRA)
Ted Dunning created DRILL-5432:
--

 Summary: Want a memory format for PCAP files
 Key: DRILL-5432
 URL: https://issues.apache.org/jira/browse/DRILL-5432
 Project: Apache Drill
  Issue Type: New Feature
Reporter: Ted Dunning


PCAP files [1] are the de facto standard for storing network capture data. In 
security and protocol applications, it is very common to want to extract 
particular packets from a capture for further analysis.

At a first level, it is desirable to query and filter by source and destination 
IP and port or by protocol. Beyond that, however, it would be very useful to be 
able to group packets by TCP session and eventually to look at packet contents. 
For now, however, the most critical requirement is that we should be able to 
scan captures at very high speed.

I previously wrote a (kind of working) proof of concept for a PCAP decoder that 
did lazy deserialization and could traverse hundreds of MB of PCAP data per 
second per core. This compares to roughly 2-3 MB/s for widely available 
Apache-compatible open source PCAP decoders.
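
To illustrate the general idea of scanning a capture without decoding packet contents, here is a minimal Java sketch. It is not the proof of concept mentioned above; `PcapSkimmer` and `countPackets` are hypothetical names. It assumes the classic pcap layout (a 24-byte global header starting with magic number 0xa1b2c3d4, then records with a 16-byte header whose third 32-bit field is the captured length) and does not handle the nanosecond-timestamp variant or pcapng.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class PcapSkimmer {
    static final int MAGIC = 0xa1b2c3d4;      // classic pcap, microsecond timestamps
    static final int GLOBAL_HEADER_LEN = 24;
    static final int RECORD_HEADER_LEN = 16;  // ts_sec, ts_usec, incl_len, orig_len

    /** Counts packet records by hopping between record headers; payloads are never decoded. */
    public static int countPackets(byte[] capture) {
        if (capture.length < GLOBAL_HEADER_LEN) {
            throw new IllegalArgumentException("truncated global header");
        }
        ByteBuffer buf = ByteBuffer.wrap(capture).order(ByteOrder.LITTLE_ENDIAN);
        int magic = buf.getInt(0);
        if (magic == Integer.reverseBytes(MAGIC)) {
            buf.order(ByteOrder.BIG_ENDIAN);  // file was written in the opposite byte order
        } else if (magic != MAGIC) {
            throw new IllegalArgumentException("not a classic pcap capture");
        }

        int pos = GLOBAL_HEADER_LEN;
        int count = 0;
        while ((long) pos + RECORD_HEADER_LEN <= capture.length) {
            long inclLen = buf.getInt(pos + 8) & 0xFFFFFFFFL;  // captured length, unsigned
            long next = (long) pos + RECORD_HEADER_LEN + inclLen;
            if (next > capture.length) {
                break;  // truncated final record
            }
            pos = (int) next;
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Synthetic capture: global header, one 4-byte packet, one empty packet.
        ByteBuffer b = ByteBuffer.allocate(24 + 16 + 4 + 16).order(ByteOrder.LITTLE_ENDIAN);
        b.putInt(0, MAGIC);
        b.putInt(24 + 8, 4);
        b.putInt(24 + 16 + 4 + 8, 0);
        System.out.println(countPackets(b.array()));
    }
}
```

A decoder built this way would only parse a packet's fields when a query actually needs them.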

This JIRA covers the integration and extension of that proof of concept as a 
Drill file format.

Initial work is available at https://github.com/mapr-demos/pcap-query


[1] https://en.wikipedia.org/wiki/Pcap





[jira] [Created] (DRILL-5433) Authentication failed: Server requires authentication using [kerberos, plain]

2017-04-12 Thread Parag Darji (JIRA)
Parag Darji created DRILL-5433:
--

 Summary: Authentication failed: Server requires authentication 
using [kerberos, plain]
 Key: DRILL-5433
 URL: https://issues.apache.org/jira/browse/DRILL-5433
 Project: Apache Drill
  Issue Type: Task
  Components: Functions - Drill
Affects Versions: 1.10.0
 Environment: OS: Redhat Linux 6.7, HDP 2.5.3, Kerberos enabled, 
Hardware: VmWare
Reporter: Parag Darji
Priority: Minor
 Fix For: 1.10.0


I've set up Apache Drill 1.10.0 on RHEL 6.7 with HDP 2.5.3 and Kerberos enabled.
I'm getting the error below while running "drill-conf" or sqlline as user "drill", 
which is configured in the "drill-override.conf" file. 

drill@host:/opt/drill/bin>  drill-conf
Error: Failure in connecting to Drill: 
org.apache.drill.exec.rpc.NonTransientRpcException: 
javax.security.sasl.SaslException: Authentication failed: Server requires 
authentication using [kerberos, plain]. Insufficient credentials? [Caused by 
javax.security.sasl.SaslException: Server requires authentication using 
[kerberos, plain]. Insufficient credentials?] (state=,code=0)
java.sql.SQLException: Failure in connecting to Drill: 
org.apache.drill.exec.rpc.NonTransientRpcException: 
javax.security.sasl.SaslException: Authentication failed: Server requires 
authentication using [kerberos, plain]. Insufficient credentials? [Caused by 
javax.security.sasl.SaslException: Server requires authentication using 
[kerberos, plain]. Insufficient credentials?]
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:166)
    at org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:72)
    at org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69)
    at org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:143)
    at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
    at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167)
    at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213)
    at sqlline.Commands.connect(Commands.java:1083)
    at sqlline.Commands.connect(Commands.java:1015)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36)
    at sqlline.SqlLine.dispatch(SqlLine.java:742)
    at sqlline.SqlLine.initArgs(SqlLine.java:528)
    at sqlline.SqlLine.begin(SqlLine.java:596)
    at sqlline.SqlLine.start(SqlLine.java:375)
    at sqlline.SqlLine.main(SqlLine.java:268)
Caused by: org.apache.drill.exec.rpc.NonTransientRpcException: 
javax.security.sasl.SaslException: Authentication failed: Server requires 
authentication using [kerberos, plain]. Insufficient credentials? [Caused by 
javax.security.sasl.SaslException: Server requires authentication using 
[kerberos, plain]. Insufficient credentials?]
    at org.apache.drill.exec.rpc.user.UserClient.connect(UserClient.java:157)
    at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:432)
    at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:379)
    at org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:157)
    ... 18 more
Caused by: javax.security.sasl.SaslException: Authentication failed: Server 
requires authentication using [kerberos, plain]. Insufficient credentials? 
[Caused by javax.security.sasl.SaslException: Server requires authentication 
using [kerberos, plain]. Insufficient credentials?]
    at org.apache.drill.exec.rpc.user.UserClient$3.mapException(UserClient.java:204)
    at org.apache.drill.exec.rpc.user.UserClient$3.mapException(UserClient.java:197)
    at com.google.common.util.concurrent.AbstractCheckedFuture.checkedGet(AbstractCheckedFuture.java:85)
    at org.apache.drill.exec.rpc.user.UserClient.connect(UserClient.java:155)
    ... 21 more
Caused by: javax.security.sasl.SaslException: Server requires authentication 
using [kerberos, plain]. Insufficient credentials?
    at org.apache.drill.exec.rpc.user.UserClient.getAuthenticatorFactory(UserClient.java:285)
    at org.apache.drill.exec.rpc.user.UserClient.authenticate(UserClient.java:216)
    ... 22 more
apache drill 1.10.0
"this isn't your grandfather's sql"

The same error occurs when running the command below:
sqlline --maxWidth=1 -u 
"jdbc:drill:drillbit=host1.fqdn;auth=kerberos;principal=drill/lad...@lab.com"


The "drill" user has a valid keytab/ticket.
The Drill UI is working fine with local authentication.

drill-override.c

[GitHub] drill pull request #817: DRILL-5429: Cache tableStats per query for MapR DB ...

2017-04-12 Thread ppadma
GitHub user ppadma opened a pull request:

https://github.com/apache/drill/pull/817

DRILL-5429: Cache tableStats per query for MapR DB JSON Tables



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppadma/drill DRILL-5429

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/817.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #817


commit c04c7d5acbb182a0152885711817cbe72ff6582a
Author: Padma Penumarthy 
Date:   2017-04-11T23:34:14Z

DRILL-5429: Cache tableStats per query for MapR DB JSON Tables






[jira] [Created] (DRILL-5434) IllegalStateException: Memory was leaked by query.

2017-04-12 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-5434:
-

 Summary: IllegalStateException: Memory was leaked by query.
 Key: DRILL-5434
 URL: https://issues.apache.org/jira/browse/DRILL-5434
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.11.0
Reporter: Khurram Faraaz


Issue a long-running COUNT query.
While the query is executing, stop the foreman drillbit with ./drillbit.sh stop.
A memory leak is reported in drillbit.log.

Apache Drill 1.11.0 
git.commit.id.abbrev=06e1522

Stack trace from drillbit.log
{noformat}
2017-04-13 06:14:36,828 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.drill.exec.work.foreman.Foreman - Query text for query id 
2710e8b2-d4dc-1bee-016e-b69fd4966916: SELECT COUNT(*) FROM `twoKeyJsn.json`
2017-04-13 06:14:36,929 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,929 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,929 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,929 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,930 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,930 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,930 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,932 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.exec.store.dfs.FileSelection - FileSelection.getStatuses() took 0 ms, 
numFiles: 1
2017-04-13 06:14:36,934 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Time: 2ms total, 2.102992ms avg, 2ms max.
2017-04-13 06:14:36,934 [2710e8b2-d4dc-1bee-016e-b69fd4966916:foreman] INFO  
o.a.d.e.s.schedule.BlockMapBuilder - Get block maps: Executed 1 out of 1 using 
1 threads. Earliest start: 0.555000 μs, Latest start: 0.555000 μs, Average 
start: 0.555000 μs .
2017-04-13 06:14:36,949 [2710e8b2-d4dc-1bee-016e-b69fd4966916:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2710e8b2-d4dc-1bee-016e-b69fd4966916:0:0: 
State change requested AWAITING_ALLOCATION --> RUNNING
2017-04-13 06:14:36,949 [2710e8b2-d4dc-1bee-016e-b69fd4966916:frag:0:0] INFO  
o.a.d.e.w.f.FragmentStatusReporter - 2710e8b2-d4dc-1bee-016e-b69fd4966916:0:0: 
State to report: RUNNING
Thu Apr 13 06:14:40 UTC 2017 Terminating drillbit pid 5107
2017-04-13 06:14:40,756 [Drillbit-ShutdownHook#0] INFO  
o.apache.drill.exec.server.Drillbit - Received shutdown request.
2017-04-13 06:14:47,819 [pool-169-thread-2] INFO  
o.a.drill.exec.rpc.data.DataServer - closed eventLoopGroup 
io.netty.channel.nio.NioEventLoopGroup@4f3ee67c in 1024 ms
2017-04-13 06:14:47,819 [pool-169-thread-2] INFO  
o.a.drill.exec.service.ServiceEngine - closed dataPool in 1024 ms
2017-04-13 06:14:49,806 [Drillbit-ShutdownHook#0] WARN  
o.apache.drill.exec.work.WorkManager - Closing WorkManager but there are 1 
running fragments.
2017-04-13 06:14:49,807 [2710e8b2-d4dc-1bee-016e-b69fd4966916:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2710e8b2-d4dc-1bee-016e-b69fd4966916:0:0: 
State change requested RUNNING --> FAILED
2017-04-13 06:14:49,807 [Drillbit-ShutdownHook#0] INFO  
o.a.drill.exec.compile.CodeCompiler - Stats: code gen count: 6964, cache miss 
count: 335, hit rate: 95%
2017-04-13 06:14:49,807 [2710e8b2-d4dc-1bee-016e-b69fd4966916:frag:0:0] INFO  
o.a.d.e.w.fragment.FragmentExecutor - 2710e8b2-d4dc-1bee-016e-b69fd4966916:0:0: 
State change requested FAILED --> FINISHED
2017-04-13 06:14:49,809 [2710e8b2-d4dc-1bee-016e-b69fd4966916:frag:0:0] ERROR 
o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException

Fragment 0:0

[Error Id: 94817261-98a9-4153-8b3a-2d9c95d80cc1 on centos-01.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
NullPointerException

Fragment 0:0

[Error Id: 94817261-98a9-4153-8b3a-2d9c95d80cc1 on centos-01.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
 ~[drill-common-1.11.0-SNAPSHOT.jar:1.11.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:293)
 [drill-java-exec-1.11.0-SNAPSHOT.jar:1.11.0