Re: Aggregation performance

2016-12-20 Thread yousuf
Hi Kathleen, a bug has been reported on JIRA. Thanks & Regards, Yousuf. On 12/19/2016 10:03 PM, Kathleen Li wrote: It seems that in Drill 1.8 the parameter store.mongo.all_text_mode is already true by default. Try ALTER SESSION SET `exec.enable_union_type` =

Re: Aggregation performance

2016-12-19 Thread Kathleen Li
It seems that in Drill 1.8 the parameter store.mongo.all_text_mode is already true by default. Try ALTER SESSION SET `exec.enable_union_type` = true; if you still get the errors, you might open a public JIRA with the detailed information. You might also try to use CTAS to create Drill tables with parq

Re: Aggregation performance

2016-12-19 Thread Dechang Gu
Hi Yousuf, Thanks for the update and profile. From the profile, it looks like most of the time was spent on the following operator: 05-xx-03 UNKNOWN_OPERATOR 0.000s 0.000s 0.000s *1.350s* *4.903s* *7.817s* 0.000s 0.000s 0.000s 280KB 280KB which is mainly mongoScan. Also the min (1.35s) and max (7.81

Re: Aggregation performance

2016-12-19 Thread Kathleen Li
Hi Yousuf, Yes, in my env I had set store.mongo.bson.record.reader = true. With the one record you provided, the same query works fine for me; the error you got is a schema-change-related error: 0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> SELECT hashtag, count(*) as cnt from (select . . . .

Re: Aggregation performance

2016-12-18 Thread yousuf
Hi Kathleen, Thanks for responding... I've noticed that after running alter session set store.mongo.bson.record.reader = true; the performance improves. However, the other queries are failing :(. 0: jdbc:drill:> alter session set store.mongo.bson.record.reader = true; +---+---

Re: Aggregation performance

2016-12-15 Thread Kathleen Li
In my env, the first run took about 1.6s; the second took only 0.5s. 0: jdbc:drill:zk=drill1:5181,drill2:5181,dril> SELECT count(*) as cnt, actor_preferred_username from test where . . . . . . . . . . . . . . . . . . . . . . .> posted_time >= '2016-08-01T00.00.00.000Z' and posted_time . . . . . . .

Re: Aggregation performance

2016-12-15 Thread Dechang Gu
Yousuf, Which version of Drill are you running? Can you share the profile of the query? Thanks, Dechang On Thu, Dec 15, 2016 at 3:27 AM, yousuf wrote: > Hello experts > > As a POC project, I've built a drill cluster on 5 VMs , each with the > following specs > > 32 GB ram > > 1 TB storage > >

Aggregation performance

2016-12-15 Thread yousuf
Hello experts, As a POC project, I've built a Drill cluster on 5 VMs, each with the following specs: 32 GB RAM, 1 TB storage, 16 cores. A ZooKeeper quorum and Apache Drill are installed on all 5 nodes. My storage engine is MongoDB, which has 5 million docs. (Our daily collection is close to 2.5 million