RE: Drill Capacity

2017-11-08 Thread Yun Liu
Hi Kunal, Please see below the dataset I've provided this week. Hope it helps: [ { "type" : "quality-rules", "reference" : { "href" : "", "name" : "Avoid unreferenced Tables", "key" : "1634", "critical" : false }, "result" : { "grade" : 2, "violationRatio" : { "t

RE: Drill Capacity

2017-11-08 Thread Yun Liu
-- From: Paul Rogers [mailto:prog...@mapr.com] Sent: Tuesday, November 7, 2017 7:55 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yun, I looked at the sqlline.log file you posted. (Thanks much for doing so.) Here’s what I noted: The log shows a failed query, but this one is diffe

Re: Drill Capacity

2017-11-07 Thread Paul Rogers
ass this error until your new release comes out? > > Thanks, > Yun > > -Original Message- > From: Arjun kr [mailto:arjun...@outlook.com] > Sent: Monday, November 6, 2017 7:39 PM > To: user@drill.apache.org > Subject: Re: Drill Capacity > > Hi Yun, > >

RE: Drill Capacity

2017-11-07 Thread Kunal Khatua
you faced. Thanks ~K -Original Message- From: Yun Liu [mailto:y@castsoftware.com] Sent: Tuesday, November 07, 2017 7:17 AM To: user@drill.apache.org Subject: RE: Drill Capacity Hi Arjun, That was already altered and schema was not changed. I've reduced the json size and everythin

RE: Drill Capacity

2017-11-07 Thread Yun Liu
: Arjun kr [mailto:arjun...@outlook.com] Sent: Monday, November 6, 2017 7:39 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yun, Looking at the log shared, you seem to be running the query below. 2017-11-06 15:09:37,383 [25ff3e7e-39ef-a175-93e7-e4e62b284add:for

Re: Drill Capacity

2017-11-06 Thread Arjun kr
schema change. Can you try setting the session parameter below, if you have not tried it already? alter session set `store.json.all_text_mode`=true; Thanks, Arjun From: Yun Liu Sent: Tuesday, November 7, 2017 1:46 AM To: user@drill.apache.org Subject: RE: Drill Capacity Hi

RE: Drill Capacity

2017-11-06 Thread Yun Liu
: Arjun kr [mailto:arjun...@outlook.com] Sent: Monday, November 6, 2017 1:20 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yun, Are you running in Drill embedded mode? If so, the logs will be available in sqlline.log, and drillbit.log will not be populated. You can enable DEBUG

Re: Drill Capacity

2017-11-06 Thread Arjun kr
level logging. Thanks, Arjun From: Paul Rogers Sent: Monday, November 6, 2017 10:56 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yun, Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is the query profile

Re: Drill Capacity

2017-11-06 Thread Paul Rogers
mmendations provided by various experts, nothing has worked. >> >> Issue 2#: >> While processing a query which is a join of 2 functional .json files, I am >> getting a RESOURCE ERROR: One or more nodes ran out of memory while >> executing the query. These 2 json f

RE: Drill Capacity

2017-11-06 Thread Yun Liu
ILED","username":"","remoteAddress":"localhost"} Is this what you're looking for? Thanks, Yun -----Original Message- From: Paul Rogers [mailto:prog...@mapr.com] Sent: Friday, November 3, 2017 6:45 PM To: user@drill.apache.org Subject: Re: Drill Capa

Re: Drill Capacity

2017-11-03 Thread Paul Rogers
> Thanks for the help so far! > > Yun > > -Original Message- > From: Paul Rogers [mailto:prog...@mapr.com] > Sent: Thursday, November 2, 2017 11:06 PM > To: user@drill.apache.org > Subject: Re: Drill Capacity > > Hi Yun, > > I’m going to give you multi

RE: Drill Capacity

2017-11-03 Thread Yun Liu
Yes- I guess breaking them into smaller files will solve this. Thanks! Yun -Original Message- From: Arjun kr [mailto:arjun...@outlook.com] Sent: Friday, November 3, 2017 5:40 PM To: user@drill.apache.org Subject: Re: Drill Capacity I have seen a use-case where query fails for 12 GB

Re: Drill Capacity

2017-11-03 Thread Arjun kr
iu Sent: Saturday, November 4, 2017 2:27 AM To: user@drill.apache.org Subject: RE: Drill Capacity Hi Arjun, Column 4 has the most data and is a bit long here. The other 3 columns have maybe a word or 2. Thanks for your patience. [ { "type" : "quality-rules", "reference

RE: Drill Capacity

2017-11-03 Thread Yun Liu
Hi Arjun, Column 4 has the most data and is a bit long here. The other 3 columns have maybe a word or 2. Thanks for your patience. [ { "type" : "quality-rules", "reference" : { "href" : "", "name" : "Avoid unreferenced Tables", "key" : "1634", "critical" : false }, "result" :

Re: Drill Capacity

2017-11-03 Thread Arjun kr
m: Yun Liu Sent: Saturday, November 4, 2017 1:49 AM To: user@drill.apache.org Subject: RE: Drill Capacity Hi Paul, Thanks for your detailed explanation. First off- I have 2 issues and I wanted to clear them up before continuing. Current setting: planner.memory.max_query_memory_per_node = 10GB,

RE: Drill Capacity

2017-11-03 Thread Yun Liu
.@mapr.com] Sent: Thursday, November 2, 2017 11:06 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yun, I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll

RE: Drill Capacity

2017-11-03 Thread Yun Liu
Hi Boaz, Looks like I've already had those set to "false". So it didn't change much. Thanks, Yun -Original Message- From: Boaz Ben-Zvi [mailto:bben-...@mapr.com] Sent: Thursday, November 2, 2017 6:14 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yu

RE: Drill Capacity

2017-11-03 Thread Yun Liu
Hi Boaz, Seems I've already had those set to false. So it didn't help ☹ Thanks, Yun -Original Message- From: Boaz Ben-Zvi [mailto:bben-...@mapr.com] Sent: Thursday, November 2, 2017 6:14 PM To: user@drill.apache.org Subject: Re: Drill Capacity Hi Yun, Can you

Re: Drill Capacity

2017-11-02 Thread Paul Rogers
Hi Yun, I’m going to give you multiple ways to understand the issue based on the information you’ve provided. I generally like to see the full logs to diagnose such problems, but we’ll start with what you’ve provided thus far. How large is each record in your file? How many fields? How many by

Re: Drill Capacity

2017-11-02 Thread Ted Dunning
anges. As with the same file format but less > data- it works perfectly ok. I am unable to tell if there's corruption. > > Yun > > -Original Message- > From: Andries Engelbrecht [mailto:aengelbre...@mapr.com] > Sent: Thursday, November 2, 2017 3:35 PM > To:

Re: Drill Capacity

2017-11-02 Thread Boaz Ben-Zvi
age- From: Yun Liu [mailto:y@castsoftware.com] Sent: Thursday, November 2, 2017 3:52 PM To: user@drill.apache.org Subject: RE: Drill Capacity Yes- I increased planner.memory.max_query_memory_per_node to 10GB, HEAP to 12G, Direct memory to 16G, and Perm to 1024M. I

RE: Drill Capacity

2017-11-02 Thread Yun Liu
@castsoftware.com] Sent: Thursday, November 2, 2017 3:52 PM To: user@drill.apache.org Subject: RE: Drill Capacity Yes- I increased planner.memory.max_query_memory_per_node to 10GB, HEAP to 12G, Direct memory to 16G, and Perm to 1024M. It didn't have any schema changes. As with the same file format but

RE: Drill Capacity

2017-11-02 Thread Yun Liu
al Message- From: Andries Engelbrecht [mailto:aengelbre...@mapr.com] Sent: Thursday, November 2, 2017 3:35 PM To: user@drill.apache.org Subject: Re: Drill Capacity What memory setting did you increase? Have you tried 6 or 8GB? How much memory is allocated to Drill Heap and Direct memory for th

Re: Drill Capacity

2017-11-02 Thread Andries Engelbrecht
s, Yun -Original Message- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Thursday, November 2, 2017 2:01 PM To: user@drill.apache.org Subject: RE: Drill Capacity Hi Yun, Andries' solution should address your problem. However, do understand tha

RE: Drill Capacity

2017-11-02 Thread Yun Liu
- From: Kunal Khatua [mailto:kkha...@mapr.com] Sent: Thursday, November 2, 2017 2:01 PM To: user@drill.apache.org Subject: RE: Drill Capacity Hi Yun, Andries' solution should address your problem. However, do understand that, unlike CSV files, a JSON file cannot be processed in parallel

RE: Drill Capacity

2017-11-02 Thread Kunal Khatua
-Original Message- From: Andries Engelbrecht [mailto:aengelbre...@mapr.com] Sent: Thursday, November 02, 2017 10:26 AM To: user@drill.apache.org Subject: Re: Drill Capacity How much memory is allocated to the Drill environment? Embedded or in a cluster? I don’t think there is a particular limit, but

Re: Drill Capacity

2017-11-02 Thread Andries Engelbrecht
How much memory is allocated to the Drill environment? Embedded or in a cluster? I don’t think there is a particular limit, but a single JSON file will be read by a single minor fragment. In general, it is better to match the number/size of files to the Drill environment. In the short term, try t
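The advice above about matching the number and size of files to the Drill environment could be sketched as follows. This is only an illustration, not something posted in the thread: it assumes the source file is a single top-level JSON array (as the `[ { "type" : "quality-rules", ...` snippets suggest) and that it fits in memory for a one-off split. The function name `split_json_array`, the default chunk size, and the output file naming are all hypothetical.

```python
import json
from pathlib import Path


def split_json_array(src, out_dir, records_per_file=1000):
    """Split one top-level JSON array into several smaller JSON array files.

    Hypothetical helper: names and the default chunk size are illustrative only.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Assumption: the whole array fits in memory for this one-off split.
    with src.open(encoding="utf-8") as fh:
        records = json.load(fh)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")

    written = []
    for start in range(0, len(records), records_per_file):
        chunk = records[start:start + records_per_file]
        index = start // records_per_file
        part = out_dir / f"{src.stem}_part{index:04d}.json"
        # Each output file is itself a valid JSON array of records.
        with part.open("w", encoding="utf-8") as fh:
            json.dump(chunk, fh)
        written.append(part)
    return written
```

The resulting directory can then be queried as a whole instead of the single large file, which gives Drill several files to distribute across fragments. A suitable `records_per_file` depends on record size and the memory available, so it would need tuning for a real dataset.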

Re: Drill Capacity

2017-11-02 Thread Prasad Nagaraj Subramanya
Hi Yun, Drill is designed to query large datasets. There is no specific limit on the size; it works well even when the data is in the hundreds of GBs. A DATA_READ ERROR has something to do with the data in your file. The data in some of the columns may not be consistent with the datatype. Please refer to t
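One rough way to look for the kind of inconsistency described above is to scan the records and report fields that appear with more than one JSON type. The sketch below is not from the thread and is not part of Drill; it assumes the records have already been loaded as a list of dictionaries, checks only top-level fields, and treats integers and floats as distinct types for the purpose of the check. The function names and the sample values are hypothetical.

```python
from collections import defaultdict


def json_type(value):
    """Map a decoded JSON value to a coarse type name."""
    if value is None:
        return "null"
    # bool must be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    return "object"


def inconsistent_fields(records):
    """Return {field: sorted type names} for top-level fields seen with
    more than one non-null type across the records."""
    seen = defaultdict(set)
    for record in records:
        for key, value in record.items():
            kind = json_type(value)
            if kind != "null":  # nulls alone are not counted as a type change
                seen[key].add(kind)
    return {key: sorted(kinds) for key, kinds in seen.items() if len(kinds) > 1}


# Hypothetical sample: "key" is a string in one record and a number in another.
sample = [{"key": "1634", "grade": 2}, {"key": 1635, "grade": 3}]
print(inconsistent_fields(sample))
```

Fields reported by such a check are candidates for cleanup in the source data, or for reading everything as text via the `store.json.all_text_mode` option mentioned elsewhere in the thread.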