Querying nested data

2015-05-21 Thread Patrick Grandjean
I am following the "Getting Started" guide. I was able to start Drill in embedded mode and execute a query on the sample employee.json, which returned the expected results. I then tried querying a more complex JSON file with more nested data, sales_transactions.json: 0: jdbc:drill:zk=local> selec

Re: Auto-splitting delimitted files

2015-05-21 Thread Yousef Lasi
We do expect to use MapRFS at some point so data locality will be available to Drill once that happens. In the interim, we're trying to leverage Drill to pre-process large data sets. As an example, we're creating a view into a join across 4 large files (the largest of which is 20 GB). This join

Re: Querying nested data

2015-05-21 Thread Abhishek Girish
Hey Patrick, It looks like there was a schema change across records. Going by the error message, it looks like Drill first encountered a value of type float8 and expected the remainder of records to also have values of type float8 for that field. But it then encountered a value of type BigInt. Ca
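The kind of mismatch described here (a field holding a float in one record and an integer in another) can be located before querying. The following is a minimal sketch, assuming newline-delimited JSON records; the field names in the sample are hypothetical and the labels BIGINT and FLOAT8 only mirror the type names used in this message.

```python
import json
from collections import defaultdict


def numeric_kind(value):
    """Label ints as BIGINT and floats as FLOAT8; return None for anything else."""
    if isinstance(value, bool):  # bool is a subclass of int in Python, so exclude it first
        return None
    if isinstance(value, int):
        return "BIGINT"
    if isinstance(value, float):
        return "FLOAT8"
    return None


def walk(node, path, kinds):
    """Recursively record which numeric kinds appear at each dotted field path."""
    if isinstance(node, dict):
        for key, child in node.items():
            walk(child, path + (key,), kinds)
    elif isinstance(node, list):
        for child in node:
            walk(child, path + ("[]",), kinds)
    else:
        kind = numeric_kind(node)
        if kind:
            kinds[".".join(path)].add(kind)


def find_mixed_numeric_fields(lines):
    """Return the sorted field paths that hold both integer and float values."""
    kinds = defaultdict(set)
    for line in lines:
        line = line.strip()
        if line:
            walk(json.loads(line), (), kinds)
    return sorted(path for path, seen in kinds.items() if len(seen) > 1)


# Hypothetical records: "price" is a float in the first and an integer in the second.
sample = [
    '{"trans_id": 1, "trans_info": {"purch_flag": "false", "price": 12.5}}',
    '{"trans_id": 2, "trans_info": {"purch_flag": "true", "price": 30}}',
]
print(find_mixed_numeric_fields(sample))
```

Fields reported by such a scan are candidates for the error above. Drill's documentation also lists a session option, store.json.read_numbers_as_double, that reads all JSON numbers as doubles; whether that option fits a given data set is a judgment call.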

Re: Querying nested data

2015-05-21 Thread Patrick Grandjean
Thank you Abhishek! It works now. On Thu, May 21, 2015 at 10:04 AM, Abhishek Girish wrote: > Hey Patrick, > > It looks like there was a schema change across records. Going by the error > message, it looks like Drill first encountered a value of type float8 and > expected the remainder of records

Re: Columnar data model for JSON stored in HBase column?

2015-05-21 Thread MOIS Martin (MORPHO)
Hello, some time has passed since I wrote the email below. In the meantime I have taken a closer look at the Apache Drill sources and I can answer a few questions myself: * For each query, the HBase plugin performs a Scan over all columns requested by the query (refer to class org.apa

Drill and ElasticSearch support?

2015-05-21 Thread Virinchi Garimella
Hi, Does Drill support ElasticSearch / Logstash? Virin -- Sent from myMail app for Android

Re: Connection timeout 1.0.0

2015-05-21 Thread Andries Engelbrecht
What is in the log files when you run drill-embedded? Does the drillbit actually come up? The drill-embedded script should start sqlline and you should end up in the sqlline shell. If you want to start with sqlline instead of drill-embedded, use the following: ./sqlline -u jdbc:drill:zk=local NO

Drill and Joins support?

2015-05-21 Thread Virinchi Garimella
Hello, How does Drill support various types of Joins, like inner, outer etc. Regards Virin -- Sent from myMail app for Android

Re: Auto-splitting delimitted files

2015-05-21 Thread Ted Dunning
Can you publish the test queries and associated logical and physical plans? On Thu, May 21, 2015 at 7:06 AM, Yousef Lasi wrote: > We do expect to use MapRFS at some point so data locality will be > available to Drill once that happens. In the interim, we're trying to > leverage Drill to pre-pr

Re: Drill and Joins support?

2015-05-21 Thread Bob Rumsby
http://drill.apache.org/docs/select-from/ On Thu, May 21, 2015 at 8:53 AM, Virinchi Garimella wrote: > > Hello, > How does Drill support various types of Joins, like inner, outer etc. > Regards > Virin > -- > Sent from myMail app for Android

Re: Drill and Joins support?

2015-05-21 Thread Abhishek Girish
Hello, Please refer to http://drill.apache.org/docs/select-from/ -Abhishek On Thursday, May 21, 2015, Virinchi Garimella wrote: > > Hello, > How does Drill support various types of Joins, like inner, outer etc. > Regards > Virin > -- > Sent from myMail app for Android

Re: Drill and ElasticSearch support?

2015-05-21 Thread Tomer Shiran
Not yet. We've discussed this and it's certainly something we would like to add. On Thu, May 21, 2015 at 8:39 AM, Virinchi Garimella wrote: > > Hi, > Does Drill support ElasticSearch / Logstash? > Virin > -- > Sent from myMail app for Android

Re: Drill and ElasticSearch support?

2015-05-21 Thread AnilKumar B
Hi Virin, As of now, Drill doesn't support Elasticsearch. If you are interested in contributing to Drill, you are most welcome. You can use existing storage plugins such as HBase/Mongo/Hive as a reference. Thanks & Regards, B Anil Kumar. On Thu, May 21, 2015 at 9:09 PM, Virinchi Garimella wrote:

Re: Auto-splitting delimitted files

2015-05-21 Thread Yousef Lasi
I've sent the full JSON profile of the query in a separate mail message. May 21 2015 12:16 PM, "Ted Dunning" wrote: > Can you publish the test queries and associated logical and physical plans? > > On Thu, May 21, 2015 at 7:06 AM, Yousef Lasi wrote: > >> We do expect to use MapRFS at some poi

[newbie]: how to query HDFS

2015-05-21 Thread Alan Miller
First off, this is my first attempt at Drill (BTW: congratulations on the release ;-) so perhaps I misunderstood something. I want to query my Parquet files on HDFS. I set up the 1.0 release on a machine (node1) that already had CDH5 and a working ZooKeeper. With the hdfs storage plugin config bel

Re: [newbie]: how to query HDFS

2015-05-21 Thread Andries Engelbrecht
Alan, I don't think the path in your query is correct. It is best to set up workspaces in the HDFS plugin: http://drill.apache.org/docs/file-system-storage-plugin/ See if that works. --Andries > On May 21, 2015, at 2:04 PM, Alan Miller wrote: > > First off, this is my first attempt at drill

Hands on the new features of Drill 1.0.

2015-05-21 Thread Hao Zhu
Hi Team, Sharing two hands-on experiences below that explain the new features of Drill 1.0: - Impersonation - Chaining View

RE: [newbie]: how to query HDFS

2015-05-21 Thread Alan Miller
I tried that initially, but since it didn't work I tried to simplify it as much as possible. Are you saying it "should" work? I mean, all I need to do is point the connection parameter to a different namenode, right?

Re: [newbie]: how to query HDFS

2015-05-21 Thread Abhishek Girish
Hi Alan, What you are attempting to do wouldn't work. Without a drillbit running on the remote cluster, I see no way we can access that file system from Drill. If you'd like to connect to a remote cluster (cluster B), the options I see are (1) Install Drill on cluster B and use a local

Re: [newbie]: how to query HDFS

2015-05-21 Thread Tomer Shiran
You don't need a drillbit on the cluster. It will be faster (data locality etc.) but you can just run a drillbit on your client and access any remote cluster (or even join data from multiple clusters). It looks like you've created a new storage plugin. I would recommend copying the entire JSON con
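A file-system storage plugin configuration of the kind discussed in this thread (a connection pointing at a remote namenode, plus a workspace) can be sketched as follows. This is an illustration under assumptions, not the configuration from the thread: the host, port, workspace name and path are hypothetical placeholders, and only the general shape follows Drill's file-system storage plugin documentation.

```python
import json


def build_hdfs_plugin_config(namenode_host, namenode_port, workspace_name, workspace_path):
    """Build a dict shaped like a Drill file-system storage plugin configuration."""
    return {
        "type": "file",
        "enabled": True,
        # The connection string selects which namenode Drill talks to.
        "connection": f"hdfs://{namenode_host}:{namenode_port}/",
        "workspaces": {
            "root": {"location": "/", "writable": False, "defaultInputFormat": None},
            # Hypothetical workspace pointing at a directory of Parquet files.
            workspace_name: {
                "location": workspace_path,
                "writable": False,
                "defaultInputFormat": "parquet",
            },
        },
        "formats": {"parquet": {"type": "parquet"}},
    }


# All values below are placeholders chosen for illustration.
config = build_hdfs_plugin_config("namenode.example.com", 8020, "data", "/data/parquet")
print(json.dumps(config, indent=2))
```

The printed JSON could be pasted into the storage plugin editor of the Drill web UI. If the plugin were registered under the name hdfs, files in the workspace would be addressed in queries as hdfs.data followed by the backtick-quoted file name.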

Digging into query planning and joins in Drill

2015-05-21 Thread Hao Zhu
Hi Team, Sharing 2 research articles: 1. Drill Workshop -- Control Query Parallelization This article explains the meaning of planner.slice_target, planner.width.max_per_node, and planner.width.max_per_query by example, and shows how