Re: Performance tuning for TPC-H Q1 on a three nodes cluster

2016-05-25 Thread Dechang Gu
On Tue, May 24, 2016 at 7:07 PM, Yijie Shen wrote: > Hi Dechang, > > Thanks very much for your help! > > I get a little confused here, why does skew exist? > > After some statistic work, I got this: 1516 files and 102.54MB on average, > max of 104MB, min of 95MB. > On

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Parth Chandra
Thanks everyone, and in particular, thank you, Jacques, for making Drill possible. On Wed, May 25, 2016 at 3:31 PM, Chunhui Shi wrote: > Big congratulations to Parth! > Thanks Jacques for founding Drill project and way to go drillers! > > Chunhui > > On Wed, May 25, 2016 at

Re: Is there a good way to handle bad date data?

2016-05-25 Thread John Omernik
AR to (I learned something) However, that > > >> doesn't work because my data is accurate i.e. '-__-__' 2015-04-02 > > and > > >> 2015-00-23 but 00 doesn't work (bad data) . > > >> > > >> UDFs scare me in that the only Java I've conquered is evide

Re: Is there a good way to handle bad date data?

2016-05-25 Thread Vince Gonzalez
've conquered is evident from my > >> empty > >> french press... > >> > >> I know I've brought it up in the past, but has anyone seen any community > >> around UDFs start? I'd love to have a community that follows Apache like > >> rules, and allows us to c

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Chunhui Shi
Big congratulations to Parth! Thanks Jacques for founding Drill project and way to go drillers! Chunhui On Wed, May 25, 2016 at 11:45 AM, John Omernik wrote: > Congratz Parth, and thank you Jacques! > > On Wed, May 25, 2016 at 1:25 PM, Xiao Meng wrote: >

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread John Omernik
Congratz Parth, and thank you Jacques! On Wed, May 25, 2016 at 1:25 PM, Xiao Meng wrote: > Big congratulations, Parth! > > And thank you, Jacques, for the leadership and the tremendous contributions > to the community. > > Best, > > Xiao > > On Wed, May 25, 2016 at 8:35 AM,

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Xiao Meng
Big congratulations, Parth! And thank you, Jacques, for the leadership and the tremendous contributions to the community. Best, Xiao On Wed, May 25, 2016 at 8:35 AM, Jacques Nadeau wrote: > I'm pleased to announce that the Drill PMC has voted to elect Parth Chandra > as

One more with Excepting Columns by name

2016-05-25 Thread Andrew Evans
Drill Community, Thanks for the quick response to my last question. I am exploring the tool and will definitely use it, especially the json capabilities with the flatten function. I noticed that Flatten will error on nested maps. I would like to exclude these maps to build tables from them

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Abhishek Girish
Congrats Parth! And thanks Jacques. On Wed, May 25, 2016 at 10:53 AM, Jason Altekruse wrote: > Congrats Parth! > > Jason Altekruse > Software Engineer at Dremio > Apache Drill Committer > > On Wed, May 25, 2016 at 10:39 AM, rahul challapalli < > challapallira...@gmail.com>

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Jason Altekruse
Congrats Parth! Jason Altekruse Software Engineer at Dremio Apache Drill Committer On Wed, May 25, 2016 at 10:39 AM, rahul challapalli < challapallira...@gmail.com> wrote: > Congratulations Parth! > > Thank You Jacques for your leadership over the last few years. > > On Wed, May 25, 2016 at

Re: Error with flatten function on MongoDB documents that contain array of key-value pairs

2016-05-25 Thread rahul challapalli
Just to be sure, can you run the query below, which does not contain flatten? If this query also fails, then it could be bad data in the "Pnl" column (maybe an empty string?) SELECT x.DateValueCollection FROM `mongo`.`db_name`.`some.random.collection.name` AS x; On Wed, May 25, 2016 at 10:32 AM,

Error with flatten function on MongoDB documents that contain array of key-value pairs

2016-05-25 Thread Arman Siddiqui
Good afternoon, I am receiving the following error when using the flatten function to query a MongoDB collection: Error: SYSTEM ERROR: IllegalArgumentException: You tried to write a VarChar type when you are using a ValueWriter of type NullableFloat8WriterImpl. The collection contains a number

Re: Is there a good way to handle bad date data?

2016-05-25 Thread John Omernik
hub project and encourage folks to >> come to the table or is there better way via Apache to do something like >> that? >> >> On Wed, May 25, 2016 at 10:27 AM, Veera Naranammalpuram < >> vnaranammalpu...@maprtech.com> wrote: >> >> You could write a UDF. Or yo

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread rahul challapalli
Congratulations Parth! Thank You Jacques for your leadership over the last few years. On Wed, May 25, 2016 at 10:26 AM, Gautam Parai wrote: > Congratulations Parth! > > On Wed, May 25, 2016 at 9:02 AM, Jinfeng Ni wrote: > > > Big congratulations,

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Gautam Parai
Congratulations Parth! On Wed, May 25, 2016 at 9:02 AM, Jinfeng Ni wrote: > Big congratulations, Parth! > > Thank you, Jacques, for your contribution and leadership over the last > few years! > > > On Wed, May 25, 2016 at 8:35 AM, Jacques Nadeau >

RE: Apache Drill, Query PostgreSQL text or Jsonb as if it were from a json storage type?

2016-05-25 Thread Andrew Evans
I am trying to avoid reading through millions of records, but would also like to avoid building a PostgreSQL table by making a smaller but still significant number of calls to check column existence in the information_schema. Drill seems perfect for implicitly discovering this. Right now, I am dumpy

Re: Is there a good way to handle bad date data?

2016-05-25 Thread MattK
or is there better way via Apache to do something like that? On Wed, May 25, 2016 at 10:27 AM, Veera Naranammalpuram < vnaranammalpu...@maprtech.com> wrote: You could write a UDF. Or you could do something like this: cat data.csv 05/25/2016 20160525 May 25th 2016 0: jdbc:drill:> sele

Re: Is there a good way to handle bad date data?

2016-05-25 Thread Charles Givre
is there better way via Apache to do something like > that? > > On Wed, May 25, 2016 at 10:27 AM, Veera Naranammalpuram < > vnaranammalpu...@maprtech.com> wrote: > >> You could write a UDF. Or you could do something like this: >> >> cat data.csv >> 05

Re: Is there a good way to handle bad date data?

2016-05-25 Thread Charles Givre
oject and encourage folks to > come to the table or is there better way via Apache to do something like > that? > > On Wed, May 25, 2016 at 10:27 AM, Veera Naranammalpuram < > vnaranammalpu...@maprtech.com> wrote: > >> You could write a UDF. Or you could do something like

Re: Is there a good way to handle bad date data?

2016-05-25 Thread John Omernik
: > > cat data.csv > 05/25/2016 > 20160525 > May 25th 2016 > > 0: jdbc:drill:> select case when columns[0] similar to '__/__/' then > to_date(columns[0],'MM/dd/') when columns[0] similar to '' then > to_date(columns[0],'MMdd') else NULL end from `da

Re: Apache Drill, Query PostgreSQL text or Jsonb as if it were from a json storage type?

2016-05-25 Thread MattK
Would the PostgreSQL function jsonb_to_recordset(jsonb) help in this case? It would return to Drill a table instead of a set of JSON objects, but you would have to declare the types in the call. On 25 May 2016, at 12:26, Andrew Evans wrote: Drill Members, I have an intriguing problem

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Aditya
Congratulations Parth! And I echo JNI's words regarding Jacques' leadership in Apache Drill community. On Wed, May 25, 2016 at 9:23 AM, scott cote wrote: > Parth is a nice “bit” to add to the tool kit :) > > SCott > > On May 25, 2016, at 11:15 AM, Abdel Hakim Deneche

Re: Apache Drill, Query PostgreSQL text or Jsonb as if it were from a json storage type?

2016-05-25 Thread Neeraja Rentachintala
It is possible to retrieve JSON fields using the convert_to function in Drill. Check the following doc: https://drill.apache.org/docs/data-type-conversion/#convert_to-and-convert_from On Wed, May 25, 2016 at 9:26 AM, Andrew Evans wrote: > Drill Members, >

Apache Drill, Query PostgreSQL text or Jsonb as if it were from a json storage type?

2016-05-25 Thread Andrew Evans
Drill Members, I have an intriguing problem where I have hundreds of thousands or even millions of records stored in jsonb format in uneven JSON objects in a PostgreSQL database. I would like to be able to implicitly grab column names and data types using your tool since neither PostgreSQL nor

Re: Converting INTERVAL to Number

2016-05-25 Thread John Omernik
Yep, that works without the extra CAST. I didn't see the Y in the original age return, so I assumed it had to be as year before I could use it. I assumed wrong. I guess that's my superpower, being wrong and answering my own questions on lists so others can learn from my mistakes ;) On Wed, May 25,

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread scott cote
Parth is a nice “bit” to add to the tool kit :) SCott > On May 25, 2016, at 11:15 AM, Abdel Hakim Deneche > wrote: > > Congrats Parth ! > > On Wed, May 25, 2016 at 9:15 AM, Zelaine Fong wrote: > >> Congratulations, Parth. Looking forward to

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Zelaine Fong
Congratulations, Parth. Looking forward to working with you in your new role :). -- Zelaine On Wed, May 25, 2016 at 9:02 AM, Jinfeng Ni wrote: > Big congratulations, Parth! > > Thank you, Jacques, for your contribution and leadership over the last > few years! > > > On Wed,

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Jinfeng Ni
Big congratulations, Parth! Thank you, Jacques, for your contribution and leadership over the last few years! On Wed, May 25, 2016 at 8:35 AM, Jacques Nadeau wrote: > I'm pleased to announce that the Drill PMC has voted to elect Parth Chandra > as the new PMC chair of

Re: Converting INTERVAL to Number

2016-05-25 Thread Andries Engelbrecht
Great! Answering your own questions on email lists can be therapeutic ;-) Why not just use EXTRACT(year from age(dob)) --Andries > On May 25, 2016, at 7:39 AM, John Omernik wrote: > > Well I need to include the Staples "That was Easy" button here... I tried: > >
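The goal in this thread is a plain integer age taken from a date of birth. As a rough illustration of the arithmetic only, here is a minimal Python sketch. It is not Drill SQL and does not reproduce the exact semantics of Drill's age() or EXTRACT; the function name `age_in_years` and the explicit `today` parameter are assumptions made for this example.

```python
from datetime import date


def age_in_years(dob: date, today: date) -> int:
    """Return the number of whole calendar years between dob and today."""
    years = today.year - dob.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years
```

Passing `today` explicitly rather than reading the system clock keeps the result reproducible, for example `age_in_years(date(1980, 5, 26), date(2016, 5, 25))`.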

[ANNOUNCE] New PMC Chair of Apache Drill

2016-05-25 Thread Jacques Nadeau
I'm pleased to announce that the Drill PMC has voted to elect Parth Chandra as the new PMC chair of Apache Drill. Please join me in congratulating Parth! thanks, Jacques -- Jacques Nadeau CTO and Co-Founder, Dremio

Re: Is there a good way to handle bad date data?

2016-05-25 Thread Veera Naranammalpuram
You could write a UDF. Or you could do something like this: cat data.csv 05/25/2016 20160525 May 25th 2016 0: jdbc:drill:> select case when columns[0] similar to '__/__/' then to_date(columns[0],'MM/dd/') when columns[0] similar to '' then to_date(columns[0],'MMdd') else N
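The CASE expression in this message tries one date format per matching pattern and falls back to NULL for anything else. A minimal Python sketch of the same fallback idea follows. It is not Drill SQL or a Drill UDF; the function name `parse_dob` and the list of formats are assumptions for illustration, since the patterns in the archived message are truncated. Values that match none of the formats (such as "May 25th 2016" or a month of 00) come back as None.

```python
from datetime import date, datetime
from typing import Optional

# Assumed candidate formats, chosen for illustration only.
CANDIDATE_FORMATS = ("%m/%d/%Y", "%Y%m%d", "%Y-%m-%d")


def parse_dob(raw: Optional[str]) -> Optional[date]:
    """Return a date if raw matches one of the candidate formats, else None."""
    if raw is None:
        return None
    text = raw.strip()
    if not text:
        return None
    for fmt in CANDIDATE_FORMATS:
        try:
            # strptime raises ValueError on a format mismatch or an invalid date.
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None
```

Returning None instead of raising mirrors the ELSE NULL branch, so one bad record does not abort the whole scan.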

Re: Caravel and Drill Integration Update

2016-05-25 Thread Ted Dunning
Updates are AWEsome here. On Wed, May 25, 2016 at 9:08 AM, John Omernik wrote: > Update! > > (Note, please tell me if updates here are inappropriate. I like updating > here, because I think it's relevant to the Drill community, and I want > people to see what is happening

Re: Regarding Apache Drill 1.7 Version Release Date

2016-05-25 Thread Zelaine Fong
Same response as the one I provided about a week ago :). But it may have gotten lost as that email thread was asking other questions as well. We're in the process of re-evaluating the release cadence of Drill, and will no longer be doing releases on a monthly cadence. At this time, we do not

Re: Is there a good way to handle bad date data?

2016-05-25 Thread Vince Gonzalez
Sounds like a job for a UDF? You could do the try/catch inside the UDF. Vince Gonzalez Systems Engineer 212.694.3879 mapr.com On Wed, May 25, 2016 at 11:05 AM, John Omernik wrote: > I have some DOBs, and some fields are empty others apparently were filled > by

Is there a good way to handle bad date data?

2016-05-25 Thread John Omernik
I have some DOBs, and some fields are empty others apparently were filled by trained monkeys, but while most data is accurate, some data is not. As you saw from my other post, I am trying to get the age for those DOBs that are valid... My function works, until I get to a record that is not valid

Re: Converting INTERVAL to Number

2016-05-25 Thread John Omernik
Well I need to include the Staples "That was Easy" button here... I tried: EXTRACT(year from cast(age(dob) as INTERVAL YEAR)) as yr_age And it worked! Self Answering question is self answering... On Wed, May 25, 2016 at 9:35 AM, John Omernik wrote: > Hey all, simple

Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread John Omernik
Ah good points. I think this also factors into the Workspace Security topic I bumped up. Trying to ensure we have the proper tools to holistically manage our data environment as presented to the user by Drill I think is important for any admin. On Wed, May 25, 2016 at 9:34 AM, Andries

Converting INTERVAL to Number

2016-05-25 Thread John Omernik
Hey all, simple question, I have a field, dob, I want to get the current age from... I have: cast(age(dob) as INTERVAL YEAR) as yr_age Which works pretty well, as you can see below, however, I'd like a column that is just the integer age, no months, no P/Y etc. Now, I can play with string

Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Andries Engelbrecht
It is an interesting idea, but may warrant more discussion in the overall Drill metadata management. For example how will it affect other SPs that are not DFS? How will it be represented/managed in INFORMATION_SCHEMA when tools are used to work with Drill metadata? I support that this is a

Re: Caravel and Drill Integration Update

2016-05-25 Thread John Omernik
Update! (Note, please tell me if updates here are inappropriate. I like updating here, because I think it's relevant to the Drill community, and I want people to see what is happening so they can help, however, I do know there are others who may see this as a separate project from Drill, and thus

Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Charles Givre
+2 I really like this idea. —C > On May 25, 2016, at 08:52, Jim Scott wrote: > > +1 > > On Wed, May 25, 2016 at 7:05 AM, John Omernik wrote: > >> Prior to opening a JIRA on this, I was curious what the community thought. >> I'd like to have a setting

Re: Discussion - "Hidden" Workspaces

2016-05-25 Thread Jim Scott
+1 On Wed, May 25, 2016 at 7:05 AM, John Omernik wrote: > Prior to opening a JIRA on this, I was curious what the community thought. > I'd like to have a setting for workspaces that would indicate "hidden". > (Defaulting to false if not specified to not break any already

Re: Security with Storage Plugins

2016-05-25 Thread John Omernik
After talking about my other idea (Hidden workspaces) I remembered this thread. This doesn't supplant my other discussion, however I would like to see if I couldn't get some discussion going here on workspace security again. On Wed, Nov 25, 2015 at 9:25 AM, John Omernik wrote:

Discussion - "Hidden" Workspaces

2016-05-25 Thread John Omernik
Prior to opening a JIRA on this, I was curious what the community thought. I'd like to have a setting for workspaces that would indicate "hidden". (Defaulting to false if not specified to not break any already implemented workspace definitions) For example: "workspaces": { "dev": {

Re: Regarding Apache Drill 1.7 Version Release Date

2016-05-25 Thread John Omernik
I am also interested in this. I know there was some talk about this and then some of that work was focused on the 2.0. Anyone care to talk about 1.7 vs 2.0, and what these various releases could mean to the community? Thanks! John On Wed, May 25, 2016 at 12:44 AM, Sanjiv Kumar