from:"Tomer Shiran"

Re: Manta Object Store Support

2016-05-06 Thread Tomer Shiran

Does Manta have a Hadoop FileSystem API implementation? That's what Drill
uses for S3, HDFS, MapR-FS, Azure Blob Storage, etc. You could potentially
write a Drill storage plugin, but you get a lot for free if you already
have the file system implementation.

On Fri, May 6, 2016 at 9:43 AM, Elijah Zupancic  wrote:

> I'm trying to get started contributing to Apache Drill. I've got the
> project checked out and it is building to my satisfaction. Right now, I'm
> trying to add support for the open source object store Manta (
> https://github.com/joyent/manta). I thought that this would be a good
> learning project.
>
> Initially, I want to add support in the same way that S3 has support.
> However, I can't seem to find a reference to the S3 storage driver in the
> code base. Is the s3 storage driver part of a different project? How would
> you suggest that I get started?
>
> Thank you,
> Elijah Zupancic
>

Re: [MapR Answers] thatte.hiten...@gmail.com asked "Apache Drill not installing on windows"

2015-12-20 Thread Tomer Shiran

Tutorial on how to install on Windows:
http://www.dremio.com/blog/installing-apache-drill-on-microsoft-windows/

Tried to post on the forum site you linked below but got this error:

Page Not Found

The page you were trying to reach was not found on this TeamHub server. If
you feel you're seeing this page in error, please contact the site
administrators.

On Sun, Dec 20, 2015 at 10:32 AM, Ted Dunning  wrote:

> Can somebody answer this?  I haven't ever installed on windows.
>
>
> -- Forwarded message --
> From: 
> Date: Sat, Dec 19, 2015 at 11:23 PM
> Subject: [MapR Answers] thatte.hiten...@gmail.com asked "Apache Drill not
> installing on windows"
> To: ted.dunn...@gmail.com
>
>
> MapR Answers
>
> Hi Ted Dunning,
>
> The question *Apache Drill not installing on windows*
> <
> http://answers.mapr.com/questions/167491/apache-drill-not-installing-on-windows.html
> >
> has been asked by thatte.hiten...@gmail.com on MapR Answers
>
> *I am installing a standard drill installation. All pre-requisites are met.
> Yes, when i use 32 or 64 bit driver, it gives the following error. [Mapr] [
> Drill] (20) Failure occured while trying to connect to
> local=localhost:31010 Please help with this. *
>
> Click here to answer thatte.hiten...@gmail.com's question
> <
> http://answers.mapr.com/questions/167491/apache-drill-not-installing-on-windows.html
> >
> If you do not wish to receive these notifications you can update your
> Notification
> Settings
> <http://answers.mapr.com/users/16/TedDunning/preferences.html#/notifyTab>.
>



-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: [VOTE] Release Apache Drill 1.3.0 (rc3)

2015-11-19 Thread Tomer Shiran

+1

Downloaded and ran Drill in embedded mode. Ran some joins and group by
queries on MongoDB.


On Tue, Nov 17, 2015 at 10:27 PM, Jacques Nadeau  wrote:

> Hey Everybody,
>
> I'd like to propose a new release candidate of Apache Drill, version
> 1.3.0.  This is the fourth release candidate (rc3).  This addresses some
> issues identified in the third release candidate including some issues
> with missing S3 dependencies, Avro deserialization issues and Parquet file
> writer metadata additions. Note that this doesn't directly address
> DRILL-4070, an issue that sunk the last candidate. Based on additional
> conversations on the JIRA, the plan is to provide a separate migration tool
> for people to rewrite their Parquet footers.
>
> The tarball artifacts are hosted at [2] and the maven artifacts are hosted
> at [3]. This release candidate is based on commit
> cc127ff4ac6272d2cb1b602890c0b7c503ea2062 located at [4].
>
> The vote will be open for 72 hours ending at 10PM Pacific, November 20,
> 2015.
>
> [ ] +1
> [ ] +0
> [ ] -1
>
> thanks,
> Jacques
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12332946
> [2] http://people.apache.org/~jacques/apache-drill-1.3.0.rc3/
> [3]
> https://repository.apache.org/content/repositories/orgapachedrill-1016/
> [4] https://github.com/jacques-n/drill/tree/drill-1.3.0
>



-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: No s3 access without jars from hadoop

2015-11-14 Thread Tomer Shiran

When you say "problem" - it seems like more work was required previously
(with s3n). Or am I misunderstanding?



On Fri, Nov 13, 2015 at 9:55 AM, Nathan Griffith 
wrote:

> Hi all,
>
> Currently if you want to use s3a in Drill 1.3.0 you need to copy a
> couple files from the Hadoop binary distribution into jars/3rdparty.
> For s3n you need those *and* the old jets3t dependency.
>
> Just wanted to highlight this problem since it seems like an important
> but easy fix. The JIRA I filed about it is here:
> https://issues.apache.org/jira/browse/DRILL-4063
>
> Best,
> Nathan
>



-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: [VOTE] Release Apache Drill 1.3.0 (rc1)

2015-11-07 Thread Tomer Shiran

+1

Downloaded and ran in embedded mode on a local MongoDB cluster. Executed 
various queries on nested data in the MongoDB cluster.



> On Nov 6, 2015, at 10:15 PM, Jacques Nadeau  wrote:
> 
> Hey Everybody,
> 
> I'd like to propose a new release candidate of Apache Drill, version
> 1.3.0.  This is the second release candidate (rc1).  This addresses some
> issues identified in the first release candidate including some test
> threading issues, Parquet dictionary read issue with binary values, Joda
> upgrade to avoid a concurrency bug and a couple other small things.
> 
> The tarball artifacts are hosted at [2] and the maven artifacts are hosted
> at [3]. This release candidate is based on commit
> f17ebd2fbf2bf5ba201a635f5b6ec3615afe2305 located at [4].
> 
> The vote will be open for 72 hours ending at 10PM Pacific, November 9, 2015.
> 
> [ ] +1
> [ ] +0
> [ ] -1
> 
> thanks,
> Jacques
> 
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12332946
> [2] http://people.apache.org/~jacques/apache-drill-1.3.0.rc1/
> [3] https://repository.apache.org/content/repositories/orgapachedrill-1014/
> [4] https://github.com/jacques-n/drill/tree/drill-1.3.0-rc1
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio

[jira] [Created] (DRILL-3978) Exceptions when using JDBC driver

2015-10-26 Thread Tomer Shiran (JIRA)

Tomer Shiran created DRILL-3978:
---

 Summary: Exceptions when using JDBC driver
 Key: DRILL-3978
 URL: https://issues.apache.org/jira/browse/DRILL-3978
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Affects Versions: 1.2.0
Reporter: Tomer Shiran


I was using the JDBC driver and 0xDBE as the client. The query was: 

{code}
SELECT * FROM dfs.yelp.`business.json`
{code}

Some of the columns resulted in a Java exception string instead of the actual 
value. For example, in the hours column:


java.lang.NoClassDefFoundError: Could not initialize class oadd.org.apache.drill.exec.util.JsonStringHashMap
    at oadd.org.apache.drill.exec.vector.complex.MapVector$Accessor.getObject(MapVector.java:310)
    at oadd.org.apache.drill.exec.vector.accessor.GenericAccessor.getObject(GenericAccessor.java:44)
    at oadd.org.apache.drill.exec.vector.accessor.BoundCheckingAccessor.getObject(BoundCheckingAccessor.java:148)
    at org.apache.drill.jdbc.impl.TypeConvertingSqlAccessor.getObject(TypeConvertingSqlAccessor.java:795)
    at org.apache.drill.jdbc.impl.AvaticaDrillSqlAccessor.getObject(AvaticaDrillSqlAccessor.java:179)
    at oadd.net.hydromatic.avatica.AvaticaResultSet.getObject(AvaticaResultSet.java:351)
    at org.apache.drill.jdbc.impl.DrillResultSetImpl.getObject(DrillResultSetImpl.java:401)
    at com.intellij.database.remote.jdbc.impl.RemoteResultSetImpl.getCurrentRow(RemoteResultSetImpl.java:1253)
    at com.intellij.database.remote.jdbc.impl.RemoteResultSetImpl.getObjects(RemoteResultSetImpl.java:1211)
    at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:323)
    at sun.rmi.transport.Transport$1.run(Transport.java:200)
    at sun.rmi.transport.Transport$1.run(Transport.java:197)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$254(TCPTransport.java:683)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler$$Lambda$2/1667444585.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [ANNOUNCE] Release of Apache Drill 1.2.0

2015-10-17 Thread Tomer Shiran

Congrats! Nice work by the Drill community!!



> On Oct 17, 2015, at 6:34 AM, Abdel Hakim Deneche  
> wrote:
> 
> It is my pleasure to announce the release of Apache Drill 1.2.0.
> 
> This release of Drill fixes many issues and introduces a number of
> enhancements, including the following ones:
> 
> - Support for JDBC data sources, such as MySQL, through a new JDBC Storage
> plugin
> - Partition pruning improvements
> - Five new SQL window functions
> - HTTPS support for Web Console operations
> - Parquet metadata caching to improve query performance on a large number
> of files
> - DROP TABLE command
> 
> The source and binary artifacts are available at [1]
> Review a complete list of fixes and enhancements at [2]
> 
> Thanks to everyone in the community who contributed in this release.
> 
> [1] http://drill.apache.org/download/
> [2]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12332042&projectId=12313820
> 
> -- 
> 
> Abdelhakim Deneche
> 
> Software Engineer
> 

Parquet read optimization

2015-10-14 Thread Tomer Shiran

Why is the Parquet read optimization not on by default?

Optimizing Reads of Parquet-Backed Tables

Use the store.hive.optimize_scan_with_native_readers option to optimize
reads of Parquet-backed external tables from Hive. When set to TRUE, this
option uses Drill native readers instead of the Hive Serde interface,
resulting in more performant queries of Parquet-backed external tables.
(Drill 1.2 and later)

Set the store.hive.optimize_scan_with_native_readers option as described in
the section "Planning and Execution Options".
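
For anyone who wants to try the option from a client, here is a minimal
sketch that builds the statement text using Drill's ALTER SESSION SET syntax.
The class and method names are hypothetical helpers, not part of Drill; the
resulting string would be passed to whatever client you use (sqlline, JDBC
Statement.execute, etc.).

```java
/**
 * Sketch only: builds an ALTER SESSION statement for a boolean Drill option.
 * DrillSessionOptions and alterSession are illustrative names, not Drill APIs.
 */
final class DrillSessionOptions {

    private DrillSessionOptions() {}

    /** Returns e.g. ALTER SESSION SET `some.option` = TRUE for the given option. */
    static String alterSession(String option, boolean value) {
        // Option names contain dots, so they are quoted with backticks.
        return "ALTER SESSION SET `" + option + "` = " + (value ? "TRUE" : "FALSE");
    }

    public static void main(String[] args) {
        // Prints the statement that enables the native Parquet readers for Hive tables.
        System.out.println(
            alterSession("store.hive.optimize_scan_with_native_readers", true));
    }
}
```

The setting only lasts for the current session, so it is a low-risk way to
compare query times with and without the native readers.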

Re: Who is coming to Strata NYC?

2015-09-22 Thread Tomer Shiran

Strata is next week. Looking forward to seeing many of you in person. If
you haven't already added your name to the list, and you will be in town on
Tuesday evening, please go ahead and do that.

Time: Tuesday 7pm
Location: Pio Pio, 604 10th Ave b/t 44th St & 43rd St Hell's Kitchen,
Midtown West

On Thu, Sep 17, 2015 at 8:44 PM, Edmon Begoli  wrote:

> Let's plan on this. I would like to meet you guys, discuss some ideas, just
> chat, have dinner, whatever ...
>
> I created a Google Doc, listed you Tomer, myself, please sign up and
> suggest time and location.
>
>
> https://docs.google.com/document/d/10czCOI3HePghnXa_siIL2P8A7kDnak2zM2-558vdkT0/edit?usp=docslist_api
>
> On Friday, September 4, 2015, Tomer Shiran  wrote:
>
> > I'll be at Strata. Jacques Nadeau will be there too. (We're actually doing
> > an "Apache Drill Bootcamp" on the 29th.)
> >
> > Would be great to get together. Maybe dinner? Let's see how many people
> > want to join and we can figure it out.
> >
> > Thanks,
> > Tomer
> >
> > On Fri, Sep 4, 2015 at 12:57 PM, Edmon Begoli wrote:
> >
> > > I am planning on going.
> > >
> > > Maybe we can have a little gathering.
> > >
> >
> >
> >
> > --
> > Tomer Shiran
> > CEO and Co-Founder, Dremio
> >
>



-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: Who is coming to Strata NYC?

2015-09-17 Thread Tomer Shiran

How about dinner on Tuesday?

On Thu, Sep 17, 2015 at 8:44 PM, Edmon Begoli  wrote:

> Let's plan on this. I would like to meet you guys, discuss some ideas, just
> chat, have dinner, whatever ...
>
> I created a Google Doc, listed you Tomer, myself, please sign up and
> suggest time and location.
>
>
> https://docs.google.com/document/d/10czCOI3HePghnXa_siIL2P8A7kDnak2zM2-558vdkT0/edit?usp=docslist_api
>
> On Friday, September 4, 2015, Tomer Shiran  wrote:
>
> > I'll be at Strata. Jacques Nadeau will be there too. (We're actually doing
> > an "Apache Drill Bootcamp" on the 29th.)
> >
> > Would be great to get together. Maybe dinner? Let's see how many people
> > want to join and we can figure it out.
> >
> > Thanks,
> > Tomer
> >
> > On Fri, Sep 4, 2015 at 12:57 PM, Edmon Begoli wrote:
> >
> > > I am planning on going.
> > >
> > > Maybe we can have a little gathering.
> > >
> >
> >
> >
> > --
> > Tomer Shiran
> > CEO and Co-Founder, Dremio
> >
>



-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: Drill Sql Max row size

2015-09-08 Thread Tomer Shiran

That won't be a problem. There's actually no limit on how many records/rows you 
can have.  

> On Sep 8, 2015, at 2:00 AM, Sudip Mukherjee  wrote:
> 
> Hi,
> I have somewhere around a million records, but only a few columns (up to
> 10).
> 
> Thanks,
> Sudip
> 
> -Original Message-
> From: Jacques Nadeau [mailto:jacq...@dremio.com] 
> Sent: 08 September 2015 AM 06:58
> To: dev@drill.apache.org
> Subject: Re: Drill Sql Max row size
> 
> Generally, no. That being said, Drill will probably struggle if you start 
> reading records where one or more cells is greater than a few hundred 
> kilobytes (or mbs) or more than several hundred columns/fields. What size 
> records are you working with?
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
> On Mon, Sep 7, 2015 at 4:49 AM, Sudip Mukherjee 
> wrote:
> 
>> Hi Devs,
>> 
>> Is there a max row limit which I can configure while pulling data from 
>> underlying datasource? If there is a large data-set would drill fetch 
>> like page by page?
>> 
>> Thanks,
>> Sudip
>> 
>> 
>> 
>> ***Legal Disclaimer***
>> "This communication may contain confidential and privileged material 
>> for the sole use of the intended recipient. Any unauthorized review, 
>> use or distribution by others is strictly prohibited. If you have 
>> received the message by mistake, please advise the sender by reply 
>> email and delete the message. Thank you."
>> **
> 

Re: Who is coming to Strata NYC?

2015-09-04 Thread Tomer Shiran

I'll be at Strata. Jacques Nadeau will be there too. (We're actually doing
an "Apache Drill Bootcamp" on the 29th.)

Would be great to get together. Maybe dinner? Let's see how many people
want to join and we can figure it out.

Thanks,
Tomer

On Fri, Sep 4, 2015 at 12:57 PM, Edmon Begoli  wrote:

> I am planning on going.
>
> Maybe we can have a little gathering.
>

-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: Apache drill jdbc driver - can i connect to a drillbit?

2015-09-03 Thread Tomer Shiran

If you want to connect to a random drillbit in the cluster you would use
ZooKeeper in the connection URL:

jdbc:drill:zk=/drill/

If you want to connect to a specific drillbit you could specify that
directly by replacing "zk=" with "drillbit="
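
To make the two URL forms concrete, here is a small sketch that assembles
them. The host names, ports, ZooKeeper directory, and cluster ID are all
placeholders that I am assuming for illustration (31010 is the drillbit user
port in a default setup), so substitute the values from your own cluster. The
helper class itself is hypothetical and not part of the Drill JDBC driver.

```java
/**
 * Sketch only: builds the two Drill JDBC URL forms described above.
 * Every host, port, directory, and cluster ID used here is a placeholder.
 */
final class DrillJdbcUrls {

    private DrillJdbcUrls() {}

    /** ZooKeeper form: any drillbit registered under the given cluster may serve the connection. */
    static String viaZooKeeper(String zkQuorum, String zkDir, String clusterId) {
        return "jdbc:drill:zk=" + zkQuorum + "/" + zkDir + "/" + clusterId;
    }

    /** Direct form: connects to one specific drillbit, bypassing ZooKeeper. */
    static String viaDrillbit(String host, int userPort) {
        return "jdbc:drill:drillbit=" + host + ":" + userPort;
    }

    public static void main(String[] args) {
        System.out.println(viaZooKeeper("zk1.example.com:2181", "drill", "drillbits1"));
        System.out.println(viaDrillbit("drillbit1.example.com", 31010));
    }
}
```

Either string can be handed to DriverManager.getConnection once the Drill
JDBC driver is on the classpath.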

On Thu, Sep 3, 2015 at 12:28 AM, Rajkumar Singh  wrote:

> This is a sample code snippet to connect to Drill using the drill-jdbc-all
> driver.
>
> // Needs java.sql.Connection, DriverManager, ResultSet and Statement.
> Class.forName("org.apache.drill.jdbc.Driver");
> // The URL must be a single string literal; it names the ZooKeeper node.
> Connection connection = DriverManager.getConnection(
>     "jdbc:drill:zk=node3.mynode.com:5181/drill/my_cluster_com-drillbits");
> Statement st = connection.createStatement();
> ResultSet rs = st.executeQuery("SELECT * from cp.`employee`");
> // Print the first column of every returned row.
> while (rs.next()) {
>     System.out.println(rs.getString(1));
> }
>
>
> Rajkumar Singh
> MapR Technologies
>
>
> > On Sep 3, 2015, at 12:50 PM, Sudip Mukherjee 
> wrote:
> >
> > Hi Devs,
> >
> > Is there a way to connect to a drillbit using the JDBC driver? Could you
> > please point me to an example if there is one?
> >
> > Thanks,
> > Sudip
> >
>
>


-- 
Tomer Shiran
CEO and Co-Founder, Dremio

Re: Cross Origin REST API calls in embedded mode (for testing)

2015-08-30 Thread Tomer Shiran

Can you post the script that you're using? Is this client-side JavaScript?

On Sun, Aug 30, 2015 at 12:51 PM, Nira Amit  wrote:

> Hello Drill developers,
> I'm implementing a module that reads data from Drill and feeds it into a
> visualization tool. I have a working embedded Drill installation with my
> data on it and when I query it with curl it works just fine:
>
> curl --header "Content-type: application/json" --request POST --data
> @post.json http://localhost:8047/query.json
>
> where post.json contains:
> {"queryType" : "SQL", "query" : "select * from vzb.dev.`/bigdata/parquet_basic_indicators` limit 1"}
>
> returns:
> {
>   "columns" : [ "TIME", "GEO", "GDP_PER_CAP", "LEX", "POP", "GEO_NAME",
> "GEO_CAT", "GEO_REGION" ],
>   "rows" : [ {
> "GEO" : "world",
> "POP" : "5858793283",
> "GDP_PER_CAP" : "8780.9",
> "GEO_NAME" : "World",
> "GEO_CAT" : "planet",
> "TIME" : "1990",
> "GEO_REGION" : "\r",
> "LEX" : "65.76"
>   } ]
> }
>
> however, if I try to invoke such a Post request from a script, I'm getting:
> XMLHttpRequest cannot load http://localhost:8047/query.json. Origin file://
> is not allowed by Access-Control-Allow-Origin.
>
> So my question is: can I configure the embedded server to accept the
> request? I can rebuild Drill in my local environment if necessary, I just
> need to know what to change in the code/configuration for it to work.
>
> Thanks!
> Nira.
>



-- 
Tomer Shiran

Re: Replacing the S3 Client

2015-05-21 Thread Tomer Shiran

Drill uses what's known as the Hadoop 'FileSystem API'. Drill can connect to 
any local or cloud storage system that has a library conforming to this API.

> On May 21, 2015, at 9:48 AM, Derek Rabindran  wrote:
> 
> Hi,
> 
> Is it possible to replace jets3t, the library used to interact with S3,
> with another client?  How does Apache Drill interact with S3 using jets3t?
> 
> Thanks
> 
> -- 
> - Derek Rabindran

Re: [VOTE] Release Apache Drill 1.0.0 (rc1)

2015-05-17 Thread Tomer Shiran

+1 (binding)

Downloaded binary
Ran Drill in embedded mode with the new alias (bin/drill-embedded)
Ran some queries on the Yelp dataset on local files (Mac) and MongoDB
Found that David is the most popular name on Yelp and that reviews are more
often useful than funny or cool (based on Yelp votes)

Congrats!

On Fri, May 15, 2015 at 8:03 PM, Jacques Nadeau  wrote:

> Hey Everybody,
>
> I'm happy to propose a new release of Apache Drill, version 1.0.0.  This is
> the second release candidate (rc1).  It includes a few issues found earlier
> today and covers a total of 228 JIRAs*.
>
> The vote will be open for 72 hours ending at 8pm Pacific, May 18, 2015.
>
> [ ] +1
> [ ] +0
> [ ] -1
>
> thanks,
> Jacques
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12325568
> [2] http://people.apache.org/~jacques/apache-drill-1.0.0.rc1/
>
>
> *Note, the previous rc0 vote email undercounted the number of closed
> JIRAs.
>

Re: Things are looking pretty good, let's do a 1.0 vote soon

2015-05-12 Thread Tomer Shiran

Sounds good

On Tue, May 12, 2015 at 1:56 PM, Jacques Nadeau  wrote:

> It looks like we've gotten a number of stability and performance
> improvements into the codebase since 0.9.  I'd like to suggest we start a
> 1.0 vote soon.  I'm happy to be release manager again.  If everybody thinks
> this sounds good, I'll try to cut a release candidate in the next day or
> two.
>
> thanks,
> Jacques
>

Re: git based web sites available

2015-05-03 Thread Tomer Shiran

Thanks Ted. I will move our site to this system. That way it's just git for 
everything...

> On May 3, 2015, at 9:59 AM, Ted Dunning  wrote:
> 
> This blog from infra might be interesting in that it can simplify Drill's
> two version control stance.
> 
> https://blogs.apache.org/infra/entry/git_based_websites_available

Re: Should we make dir* columns only exist when requested?

2015-04-23 Thread Tomer Shiran

+1 to adding the filename (needed this last week, I had .json files 
and wanted to join with another table)
+1 to using an array dirs[]
+1 to not having it in * (but would "select dirs, *" work?)



> On Apr 23, 2015, at 7:00 PM, Steven Phillips  wrote:
> 
> What you are showing for the current behavior seems wrong to me:
> 
> $ tree mytdir
> mytdir
> └── mysdir
>     └── myFile.json
> 
> $ cat mytdir/mysdir/myFile.json
> {a:1,b:2,c:3}
> {a:4,b:5,c:6}
> 
> 0: jdbc:drill:> select * from `mytdir/mysdir/myFile.json`;
> +----+----+----+
> | a  | b  | c  |
> +----+----+----+
> | 1  | 2  | 3  |
> | 4  | 5  | 6  |
> +----+----+----+
> 2 rows selected (0.274 seconds)
> 0: jdbc:drill:> select * from `mytdir/mysdir/myFile.json`;
> +----+----+----+
> | a  | b  | c  |
> +----+----+----+
> | 1  | 2  | 3  |
> | 4  | 5  | 6  |
> +----+----+----+
> 2 rows selected (0.152 seconds)
> 0: jdbc:drill:> select * from `/mytdir/mysdir`;
> +----+----+----+
> | a  | b  | c  |
> +----+----+----+
> | 1  | 2  | 3  |
> | 4  | 5  | 6  |
> +----+----+----+
> 2 rows selected (0.157 seconds)
> 0: jdbc:drill:> select * from `mytdir`;
> +--------+----+----+----+
> |  dir0  | a  | b  | c  |
> +--------+----+----+----+
> | mysdir | 1  | 2  | 3  |
> | mysdir | 4  | 5  | 6  |
> +--------+----+----+----+
> 
> I don't know why in your example, you are getting a dir0 directory when
> selecting a specific file. These directories should only be included when
> the specified table is a directory which contains subdirectories. Any query
> to a specific file or to a directory that only contains regular files
> should not return dir* columns.
> I think this is the correct behavior.
> 
> The fact that `mytdir` and `mytdir/mysdir` have different columns is not a
> problem, because they are different tables.
> 
> I do think Daniel's idea of adding the file name as well makes sense. I'm
> also open to Ted's idea of returning a dir array instead of individual
> columns.
> 
> On Thu, Apr 23, 2015 at 6:36 PM, Julian Hyde  wrote:
> 
>>> Ted wrote:
>>> 
>>> For one thing, I can make a really slow version of [find] !
>> 
>> Why does it have to be slow? Seriously, so many of the tools we use
>> daily have quasi-query facilities (find, git log, du, ps, netstat) and
>> we cobble together queries using complex options and pipelines of unix
>> commands. Relational algebra is potentially MORE efficient.
>> 
>> I find myself writing ' ... | sort | uniq -c | sort -nr' almost daily
>> and wish I could write ' ... order by count(*) desc'.
>> 
>>> On Thu, Apr 23, 2015 at 6:27 PM, Julian Hyde  wrote:
>>> +1 to returning directories as context. Very useful feature. Could be
>>> used to return context for other adapters (e.g. an adapter that
>>> concatenates all versions of versioned logfiles).
>>> 
>>> +1 making dir an array, per Ted's suggestion
>>> 
>>> I think dir should not appear in *; thus you'd have to write
>>> 
>>>  select dir, * from `/mytdir/mysdir/myfile.json`
>>> 
>>> This behavior is analogous to Oracle's ROWID. It is not a column as
>>> such, but a system function that you can apply to a row.
>>> 
>>> You need to allow qualifiers:
>>> 
>>>  select x.dir, x.*, y.dir, y.* from `/mytdir/mysdir/myfile.json` as
>>> x, `/mytdir/mysdir/myfile2.json` as y
>>> 
>>> and
>>> 
>>>  select dir from `/mytdir/mysdir/myfile.json` as x,
>>> `/mytdir/mysdir/myfile2.json` as y
>>> 
>>> would be illegal because dir is ambiguous.
>>> 
>>> You should make dir a reserved word (like ROWID).
>>> 
>>> On Thu, Apr 23, 2015 at 5:12 PM, Ted Dunning wrote:
>>>> Great point.
>>>>
>>>> Having the file name itself is very handy.
>>>>
>>>>
>>>> For one thing, I can make a really slow version of [find] !
>>>>
>>>> (seriously, I would love this)
>>>>
>>>>
>>>> On Thu, Apr 23, 2015 at 7:48 PM, rahul challapalli <
>>>> challapallira...@gmail.com> wrote:
>>>>
>>>>> I am also of the opinion that we should not assume knowledge on the user
>>>>> front for data discovery. So we should either have 'dir' columns in
>>>>> 'select *' or support a variation that Ted suggested.
>>>>> Also the folder names complement the actual data in some cases.
>>>>>
>>>>> - Rahul
>>>>>
>>>>> On Thu, Apr 23, 2015 at 4:38 PM, Daniel Barclay wrote:
>>>>>
>>>>>> Regarding the use case in which the user stores information in
>>>>>> pathnames:
>>>>>>
>>>>>> Since Drill supports that use case partially, shouldn't
Re: [DISCUSS] improve physical plan formatting

2015-04-13 Thread Tomer Shiran

+1


On Fri, Apr 10, 2015 at 4:23 PM, Steven Phillips 
wrote:

> I find the current format for physical plans, where we indent for every
> operator but keep both sides of a join at the same indentation, very
> difficult to read, especially when we have several joins.
>
> I think it would be easier to read if we kept sequential operators at the
> same indentation, but increased indentation only when there is a join, and
> include some marks to show the connection between the sides of the join and
> the join operator.
>
> I created a gist with an example:
>
> https://gist.github.com/StevenMPhillips/74a6bf655175aabd14c0
>
>
>
> --
>  Steven Phillips
>  Software Engineer
>
>  mapr.com
>

Re: [VOTE] Release Apache Drill 0.8.0 (rc1)

2015-03-28 Thread Tomer Shiran

I think it's ok to ship.



> On Mar 28, 2015, at 11:13 AM, Jinfeng Ni  wrote:
> 
> +1
> 
> * Download src tar ball and run full build on Mac OS twice.  Run several
> sample queries. All are good.
> * Verified md5 / sha1 checksum
> 
> For DRILL-2458, I feel this is not a show stopper, since it is marked as
> targeted for 0.9. We have many JIRAs open, including incorrect result
> issues, and we do not intend to fix all of them in this 0.8 release.
> 
> 
> 
>> On Sat, Mar 28, 2015 at 10:53 AM, Aman Sinha  wrote:
>> 
>> It looks like DRILL-2458 is tagged for 0.9, not 0.8, so the patch never got
>> included in the 0.8 release.   I think we should either include it (and
>> verify it resolves the issue) or disable the default setting...what do you
>> think ?
>> I am changing my vote to +0  at this time since there is a workaround.
>> 
>> Aman
>> 
>>> On Sat, Mar 28, 2015 at 10:24 AM, Aman Sinha  wrote:
>>> 
>>> Yes, disabling the mux exchange resolves the problem.  I would be more
>>> comfortable changing the default value of  this setting to false since it
>>> is a common enough scenario, but am open to discussion...
>>> 
>>> On Sat, Mar 28, 2015 at 10:14 AM, Jacques Nadeau 
>>> wrote:
>>> 
>>>> If you disable the mux exchange, (ALTER SESSION SET
>>>> `planner.enable_mux_exchange` = FALSE), does it still occur?  If it
>>>> doesn't, do you still think this should block the release?
>>>>
>>>> On Sat, Mar 28, 2015 at 10:10 AM, Aman Sinha wrote:
>>>>
>>>>> -1 for me.  Downloaded the tar file and installed on my macbook.  Ran
>>>>> several complex queries successfully.  However, found an issue with CTAS
>>>>> where we end up creating an extra hash value column..this was one of 2
>>>>> things reported in DRILL-2458 and is not fully fixed.
>>>>>
>>>>> Aman
>>>>>
>>>>> On Fri, Mar 27, 2015 at 8:08 PM, Steven Phillips <sphill...@maprtech.com>
>>>>> wrote:
>>>>>
>>>>>> +1 (binding)
>>>>>>
>>>>>> On Fri, Mar 27, 2015 at 7:39 PM, Aditya wrote:
>>>>>>
>>>>>>> +1 (binding).
>>>>>>>
>>>>>>> # Signatures, checksum verified.
>>>>>>> # Built from sources, ran unit tests on Windows with Java 1.7.
>>>>>>> # Started Drill in embedded and distributed mode on a Windows machine,
>>>>>>> ran sample queries from CP.
>>>>>>> # Configured HBase plugin with HBase 0.98.10, ran few queries joined
>>>>>>> with CSV and JSON data.
>>>>>>>
>>>>>>> On Fri, Mar 27, 2015 at 5:47 PM, Parth Chandra <pchan...@maprtech.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1 binding.
>>>>>>>> Built on Linux from src. Started drillbit and ran a few test queries
>>>>>>>> from sqlline. Looks good.
>>>>>>>>
>>>>>>>> On Wed, Mar 25, 2015 at 11:49 PM, Jacques Nadeau <jacq...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Good evening,
>>>>>>>>>
>>>>>>>>> I would like to propose the release of Apache Drill, version 0.8.0.
>>>>>>>>> This is the second candidate for release (rc1).  This includes a
>>>>>>>>> number of stability fixes over the previous candidate after discovery
>>>>>>>>> that the in-progress query status was unavailable.
>>>>>>>>>
>>>>>>>>> This release includes >230 resolved JIRAs [1].  Note that some of the
>>>>>>>>> fixes in this release are not yet merged into master.  As such, I have
>>>>>>>>> yet to close some bugs.  I will close them once these changes are
>>>>>>>>> merged into master.
>>>>>>>>>
>>>>>>>>> The artifacts are hosted at [2].
>>>>>>>>>
>>>>>>>>> This is on the 0.8.0 release branch as shown at [3].  For reference,
>>>>>>>>> the previous release candidate was based on the git hash
>>>>>>>>> f1b59ed4467ddaf75bc986ec095a20d6c28e9d15.
>>>>>>>>>
>>>>>>>>> The vote will be open for 72 hours, ending Midnight Pacific, March 29,
>>>>>>>>> 2015
>>>>>>>>>
>>>>>>>>> [ ] +1
>>>>>>>>> [ ] +0
>>>>>>>>> [ ] -1
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thank you,
>>>>>>>>> Jacques
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12328812
>>>>>>>>> [2] http://people.apache.org/~jacques/apache-drill-0.8.0.rc1/
>>>>>>>>> [3]
>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=drill.git;a=shortlog;h=refs/heads/0.8.0
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Steven Phillips
>>>>>> Software Engineer
>>>>>>
>>>>>> mapr.com
>>>>>>

Re: [VOTE] Release Apache Drill 0.8.0 (rc1)

2015-03-26 Thread Tomer Shiran

+1

Downloaded the binaries, ran some queries (inner queries, group by, ...) in
local mode and validated some of the results.

On Wed, Mar 25, 2015 at 11:49 PM, Jacques Nadeau  wrote:

> Good evening,
>
> I would like to propose the release of Apache Drill, version 0.8.0.  This
> is the second candidate for release (rc1).  This includes a number of
> stability fixes over the previous candidate after discovery that the
> in-progress query status was unavailable.
>
> This release includes >230 resolved JIRAs [1].  Note that some of the fixes
> in this release are not yet merged into master.  As such, I have yet to
> close some bugs.  I will close them once these changes are merged into
> master.
>
> The artifacts are hosted at [2].
>
> This is on the 0.8.0 release branch as shown at [3].  For reference, the
> previous release candidate was based on the git hash
> f1b59ed4467ddaf75bc986ec095a20d6c28e9d15.
>
> The vote will be open for 72 hours, ending Midnight Pacific, March 29, 2015
>
> [ ] +1
> [ ] +0
> [ ] -1
>
>
> Thank you,
> Jacques
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12328812
> [2] http://people.apache.org/~jacques/apache-drill-0.8.0.rc1/
> [3]
>
> https://git-wip-us.apache.org/repos/asf?p=drill.git;a=shortlog;h=refs/heads/0.8.0
>

Re: Lets cut a 0.8 release

2015-03-19 Thread Tomer Shiran

Makes sense. Lots of progress since 0.7.

On Thu, Mar 19, 2015 at 9:47 AM, Jacques Nadeau  wrote:

> Hey Y'all,
>
> We haven't had a release in a while.  I'd like to propose that we slice off
> our current master as 0.8 so users can take advantage of all the fixes
> since 0.7.  I'm happy to be the release manager unless someone else wants
> to raise their hand.  I'd like to target 6pm pacific today as the cutoff
> line for patches.  Let's get back on the monthly release cycle.  As such,
> we can target 0.9 for mid-April, etc.
>
> thanks,
> Jacques
>

Re: UML State Diagrams for C++ Client

2015-01-08 Thread Tomer Shiran

Alex, the mailing list doesn't allow attachments. Can you post it
somewhere (Dropbox, Google Drive, etc.) and send the link?

Thanks,
Tomer

On Thu, Jan 8, 2015 at 5:40 PM, Alexander Zarei wrote:

> Hi all,
>
>
>
> We put together (attached) two UML State Diagrams for Drill C++ Client to
> synchronize our comprehension of C++ Client and to foster contribution to
> it.
>
>
> I was wondering if you could review it and provide us with your feedback,
> suggestions and corrections.
>
>
>
> Thank you very much for your time. Hope this will bring about more synergy
> and looking forward to hearing your feedback.
>
>
>
> Thanks,
>
> Alex
>

Re: [DISCUSS] Cassandra storage for Drill

2015-01-08 Thread Tomer Shiran

I think that any valid SQL statement should work with any data source.
Drill should:

   - Push down as much processing as possible into the data source
   (Cassandra in this case)
   - Maintain as much data locality as possible (i.e., spread the work so
   that each drillbit is handling local data)
   - In the worst case, Drill should pull the entire table from the data
   source if that's what's needed to satisfy the query.
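
To make the first and last points concrete, here is a minimal sketch in
Python. It is not Drill's planner code and does not use any Drill or
Cassandra API; Eq, plan, run and the sample rows are hypothetical names
made up for illustration (only the id0004 row comes from the example
quoted below). It assumes a simplified rule: an equality condition can be
pushed down only if the partition key is restricted, followed by clustering
keys in order. Everything else, including any OR, is filtered after the
fetch.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


@dataclass(frozen=True)
class Eq:
    """One `column = value` condition (hypothetical helper)."""
    column: str
    value: Any

    def matches(self, row: Row) -> bool:
        return row[self.column] == self.value


def plan(where: List[List[Eq]], partition_key: str,
         clustering_keys: List[str]) -> Tuple[List[Eq], bool]:
    """Return (conditions to push down, whether a Drill-side filter is needed).

    `where` is an OR of AND-groups; an empty list means "no filter".
    """
    if not where:
        return [], False
    if len(where) > 1:
        # OR condition: worst case, pull the whole table and filter afterwards.
        return [], True
    group = where[0]
    by_column = {cond.column: cond for cond in group}
    pushed: List[Eq] = []
    # Simplified assumption: the partition key must be restricted first, then
    # clustering keys in order; anything else stays on the Drill side.
    if partition_key in by_column:
        pushed.append(by_column[partition_key])
        for key in clustering_keys:
            if key not in by_column:
                break
            pushed.append(by_column[key])
    residual = [cond for cond in group if cond not in pushed]
    return pushed, bool(residual)


def run(table: List[Row], where: List[List[Eq]], partition_key: str,
        clustering_keys: List[str]) -> List[Row]:
    """Fetch with the pushed conditions, then re-check the full predicate."""
    pushed, needs_filter = plan(where, partition_key, clustering_keys)
    # Stand-in for the data source: it applies only the pushed conditions.
    fetched = [row for row in table if all(c.matches(row) for c in pushed)]
    if not needs_filter:
        return fetched
    # Drill-side filter over the fetched superset.
    return [row for row in fetched
            if any(all(c.matches(row) for c in group) for group in where)]


# Sample data: only the id0004 row appears in the thread; the rest is made up.
TRENDING_NOW: List[Row] = [
    {"id": "id0001", "rank": 1, "pog_id": 10004},
    {"id": "id0004", "rank": 4, "pog_id": 10002},
    {"id": "id0007", "rank": 4, "pog_id": 10004},
]

if __name__ == "__main__":
    # rank = 4 OR id = 'id0004': nothing is pushed down, Drill filters.
    either = [[Eq("rank", 4)], [Eq("id", "id0004")]]
    print(run(TRENDING_NOW, either, "id", ["rank"]))
```

With this split, a query the data source would reject on its own still
returns the right rows; it just costs a larger fetch.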


On Thu, Jan 8, 2015 at 8:29 AM, Yash Sharma  wrote:

> Hi Folks,
> This thread is to discuss a few scenarios in how Cassandra works, and how we
> think they should be supported in Drill.
>
> While they are not inherently supported in Cassandra, they are doable on
> Drill's end once we fetch a superset of the data without applying these
> filters.
>
> 1. Filtering on a non-indexed column in Cassandra
> 2. Filtering by a subset of the primary key
> 3. OR condition in the WHERE clause
>
> Should we apply the filters at Drill's end and support these features, or
> should we propagate an error back to the user asking for a valid
> Cassandra-based query?
>
> -
> Examples:
> Here 'trending_now' is a dummy table with (id, rank, pog_id) where
> (id,rank) is primary key pair.
> 1.
> cqlsh:recsys> select * from trending_now where pog_id=10004 ;
> Bad Request: No indexed columns present in by-columns clause with Equal
> operator
>
> 2.
> cqlsh:recsys> select * from trending_now where rank=4;
> Bad Request: Cannot execute this query as it might involve data filtering
> and thus may have unpredictable performance. If you want to execute this
> query despite the performance unpredictability, use ALLOW FILTERING
> P.S. ALLOW FILTERING is not permitted in Cassandra java driver as of now.
>
> 3.
> cqlsh:recsys> select * from trending_now where rank=4 or id='id0004';
> Bad Request: line 1:40 missing EOF at 'or'
>
> 4. Valid Query:
> cqlsh:recsys> select * from trending_now where id='id0004' and rank=4;
>
>  id     | rank | pog_id
> --------+------+--------
>  id0004 |    4 |  10002
>
> (1 rows)
>

Re: [ANNOUNCE] Release of Apache Drill 0.7.0

2014-12-23 Thread Tomer Shiran

Congratulations and happy holidays to the entire Drill community. Looking 
forward to an exciting 2015.

> On Dec 23, 2014, at 6:50 PM, Jacques Nadeau  wrote:
> 
> It is my pleasure to announce the release of Apache Drill 0.7.0.  This is
> our first release since becoming a top level project and it includes a
> large number of fixes and enhancements.  It is a significant milestone as
> we move towards 1.0.
> 
> Download it now at [2].  Read about release highlights at [1]. See the
> complete release notes at [3].
> 
> Thanks to everyone for putting this release together.
> 
> Happy Holidays!
> 
> 
> [1] http://drill.apache.org/blog/2014/12/23/drill-0.7-released/
> [2] http://drill.apache.org/download/
> [3]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12327473

Re: [VOTE] Release Apache Drill 0.7.0 (rc1)

2014-12-18 Thread Tomer Shiran

+1

> On Dec 18, 2014, at 12:06 PM, Jacques Nadeau  wrote:
> 
> Good morning,
> 
> I would like to propose the release of Apache Drill, version 0.7.0.  This
> is the second release candidate (zero-index rc1) and includes fixes for a
> few issues identified as part of the first candidate.
> 
> This release includes 228 resolved JIRAs [1].
> 
> The artifacts are hosted at [2].
> 
> The vote will be open for 72 hours, ending Noon Pacific, December 21, 2014.
> 
> [ ] +1
> [ ] +0
> [ ] -1
> 
> 
> Thank you,
> Jacques
> 
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12327473
> [2] http://people.apache.org/~jacques/apache-drill-0.7.0.rc1/

Re: [VOTE] Release Apache Drill 0.7.0

2014-12-15 Thread Tomer Shiran

+1

On Mon, Dec 15, 2014 at 9:39 AM, Jacques Nadeau  wrote:
>
> Good morning,
>
> I would like to propose the release of Apache Drill, version 0.7.0.
>
> This release includes 223 resolved JIRAs [1].
>
> The artifacts are hosted at [2].
>
> The vote will be open for 72 hours, ending 10 AM Pacific, December 18,
> 2014.
>
> [ ] +1
> [ ] +0
> [ ] -1
>
>
> Thank you,
> Jacques
>
> [1]
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12327473
> [2] http://people.apache.org/~jacques/apache-drill-0.7.0.rc0/
>

Re: Regarding storage plugin development

2014-12-06 Thread Tomer Shiran

That sounds great. A good starting point for developing a storage plugin is
the existing plugins. These are mostly in the contrib directory:

   - MongoDB:
   https://github.com/apache/drill/tree/master/contrib/storage-mongo
   - HBase:
   https://github.com/apache/drill/tree/master/contrib/storage-hbase

The folks who built these are on this list, so feel free to reach out with
any questions. We could also set up a hangout to discuss.

I believe that HBase 0.98 support is currently being developed.
A JDBC storage plugin would be very valuable as it would allow Drill to
talk to MySQL as well as other databases.

On Sat, Dec 6, 2014 at 12:51 AM, prasanna pradhan 
wrote:

> Hi Team,
>
> I am Prasanna from DataRPM (a natural language data analytics company).
> First of all, thank you for such a wonderful tool. I have been following
> Drill for nearly 8 months now and I love the way it is growing.
>
>
> We have a requirement where we need to develop custom storage plugins for
> Drill for HBase 0.98, MySQL, Salesforce, etc.
> We need some pointers on how to start developing a storage plugin, such as
> documentation, samples, specs, etc.
>
>
> Thanks,
> Prasanna
>
