Hi Yun,
Are you running in Drill embedded mode? If so, the logs will be available in
sqlline.log and drillbit.log will not be populated. You can enable DEBUG
logging in logback.xml, run the query, and share the log file as Paul
suggested. Edit $DRILL_HOME/conf/logback.xml to enable DEBUG-level logging:

  <logger name="org.apache.drill" additivity="false">
    <level value="debug" />
    <appender-ref ref="FILE" />
  </logger>

Thanks,
Arjun

________________________________
From: Paul Rogers <prog...@mapr.com>
Sent: Monday, November 6, 2017 10:56 PM
To: user@drill.apache.org
Subject: Re: Drill Capacity

Hi Yun,

Sorry, it is a bit confusing. The log will contain two kinds of JSON. One is
the query profile, which is what you found. The other is the physical plan
used to run the query. It is the physical plan you want to find; that is the
one that has the max allocation.

If you can post your logs somewhere, I’ll d/l them and take a look.

- Paul

> On Nov 6, 2017, at 7:27 AM, Yun Liu <y....@castsoftware.com> wrote:
>
> Hi Paul,
>
> I am using Drill v1.11.0, so I am only seeing sqlline.log and
> sqlline_queries.log. Hopefully that is the same thing.
>
> I am following your instructions and I am not seeing any maxAllocation
> other than 10,000,000,000; there is no other (smaller) number than this.
> The query profile reads the following:
>
> {"queryId":"25ff81fc-3b7a-a840-b557-d2194cc6819a","schema":"","queryText":"SELECT * FROM `dfs`.`Inputs`.`./ Compliance.json` LIMIT 100","start":1509981699406,"finish":1509981707544,"outcome":"FAILED","username":"","remoteAddress":"localhost"}
>
> Is this what you're looking for?
>
> Thanks,
> Yun
>
> -----Original Message-----
> From: Paul Rogers [mailto:prog...@mapr.com]
> Sent: Friday, November 3, 2017 6:45 PM
> To: user@drill.apache.org
> Subject: Re: Drill Capacity
>
> Thanks for the info. Clearly you are way ahead of me.
>
> In issue 1, although you have only four (top-level) fields, your example
> shows that you have many nested fields. It is the total field count (across
> all maps) that drives total width. And, it is the total amount of data that
> drives memory consumption.
>
> You mentioned each record is 64KB and 3K rows. That suggests a total size
> of around 200MB. But, you mention the total file size is 400MB. So, either
> the rows are twice as large, or there are twice as many. If you have 3K
> rows of 128KB each, then each batch of data is 400MB, which is pretty
> large.
>
> If your records are 64KB in size, and we read 4K records per batch, then
> the total size is 256MB, which is also large.
>
> So, we are dealing with jumbo records, and you really want the “batch size
> control” feature that we are working on but have not yet shipped.
>
> Let’s work out the math. How many sorts are in your query? What other
> operators does the query include? Let’s assume a single sort.
>
> Max query memory is 10 GB. 10 GB / 1 sort / max width of 5 = 2 GB per sort.
> Since your batches are ~400 MB, things should work.
>
> Since things don’t work, I suspect that we’re missing something. (Note that
> the memory size we just calculated does not match the numbers shown in an
> earlier post, in which the sort got just ~40 MB of memory…)
>
> Try this:
>
> * With your current settings, enable debug-level logging. Run your query.
>
> * Open the Drillbit log. Look for the JSON version of the query plan (there
> will be two). One will tell you how much memory is given to the sort:
>
>     maxAllocation: (some number)
>
> * Ignore the one that says 10,000,000,000; find the one with a smaller
> number. What is that number?
>
> * Then, look in the query profile for your query. Look at the peak memory
> for your JSON reader scan operator. The peak memory more-or-less reflects
> the batch size. What is that number?
>
> With those, we can tell if the settings and sizes we think we are using
> are, in fact, correct.
>
> Thanks,
>
> - Paul
>
>> On Nov 3, 2017, at 1:19 PM, Yun Liu <y....@castsoftware.com> wrote:
>>
>> Hi Paul,
>>
>> Thanks for your detailed explanation. First off, I have 2 issues and I
>> wanted to clear them up before continuing.
>>
>> Current settings: planner.memory.max_query_memory_per_node = 10GB,
>> HEAP = 12G, Direct memory = 32G, Perm 1024M, and
>> planner.width.max_per_node = 5
>>
>> Issue #1:
>> When loading a 400MB json file I keep getting a DATA_READ ERROR. Each
>> record in the file is about 64KB. Since it's a json file, there are only
>> 4 fields per record. Not sure how many records this file contains, as
>> it's too large to open with any tools, but I am guessing about 3k rows.
>> With all the recommendations provided by various experts, nothing has
>> worked.
>>
>> Issue #2:
>> While processing a query which is a join of 2 functional .json files, I
>> am getting a RESOURCE ERROR: One or more nodes ran out of memory while
>> executing the query. These 2 json files alone process fine, but when
>> joined together, Drill throws me that error.
>> Json#1 is 11k KB, has 8 fields, and 74091 rows
>> Json#2 is 752 KB, has 8 fields, and 4245 rows
>>
>> Besides breaking them up into smaller files, not sure what else I could
>> do.
>>
>> Thanks for the help so far!
>>
>> Yun
>>
>> -----Original Message-----
>> From: Paul Rogers [mailto:prog...@mapr.com]
>> Sent: Thursday, November 2, 2017 11:06 PM
>> To: user@drill.apache.org
>> Subject: Re: Drill Capacity
>>
>> Hi Yun,
>>
>> I’m going to give you multiple ways to understand the issue based on the
>> information you’ve provided. I generally like to see the full logs to
>> diagnose such problems, but we’ll start with what you’ve provided thus
>> far.
>>
>> How large is each record in your file? How many fields? How many bytes?
>> (Alternatively, how big is a single input file and how many records does
>> it contain?)
>>
>> You mention the limit of 64K columns in CSV. This makes me wonder if you
>> have a “jumbo” record. If each individual record is large, then there
>> won’t be enough space in the sort to take even a single batch of records,
>> and you’ll get the sv2 error that you saw.
>>
>> We can guess the size, however, from the info you provided:
>>
>>     batchGroups.size 1
>>     spilledBatchGroups.size 0
>>     allocated memory 42768000
>>     allocator limit 41943040
>>
>> This says you have a batch in memory and are trying to allocate some
>> memory (the “sv2”). The allocated memory number tells us that each batch
>> is probably ~43 MB. But, the sort only has 42 MB to play with. The sort
>> needs at least two batches in memory to make progress, hence the
>> out-of-memory errors.
>>
>> It would be nice to confirm this from the logs, but unfortunately, Drill
>> does not normally log the size of each batch. As it turns out, however,
>> the “managed” version of the sort that Boaz mentioned added more logging
>> around this problem: it will tell you how large it thinks each batch is,
>> and will warn if you have, say, a 43 MB batch but only 42 MB in which to
>> sort.
>>
>> (If you do want to use the “managed” version of the sort, I suggest you
>> try Drill 1.12 when it is released, as that version contains additional
>> fixes to handle constrained memory.)
>>
>> Also, at present, the JSON record reader loads 4096 records into each
>> batch. If your file has at least that many records, then we can guess
>> each record is about 43 MB / 4096 =~ 10K in size. (You can confirm, as
>> noted above, by dividing total file size by record count.)
>>
>> We are doing work to handle such large batches, but the work is not yet
>> available in a release. Unfortunately, in the meanwhile, we also don’t
>> let you control the batch size. But, we can offer another solution.
>>
>> Let’s explain why the message you provided said that the “allocator
>> limit” was 42 MB. Drill does the following to allocate memory to the
>> sort:
>>
>> * Take the “max query memory per node” (default of 2 GB regardless of
>> actual direct memory),
>> * Divide by the number of sort operators in the plan (as shown in the
>> visualized query profile),
>> * Divide by the “planner width,” which is, by default, 70% of the number
>> of cores on your system.
>>
>> In your case, if you are using the default 2 GB total but getting 41 MB
>> per sort, the divisor is 50. Maybe you have 2 sorts and 32 cores?
>> (2 * 32 * 70% =~ 45.) Or some other combination.
>>
>> We can’t reduce the number of sorts; that’s determined by your query.
>> But, we can play with the other numbers.
>>
>> First, we can increase the memory per query:
>>
>>     ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 4294967296
>>
>> That is, 4 GB. This obviously means you must have at least 6 GB of direct
>> memory; more is better.
>>
>> And/or, we can reduce the number of fragments:
>>
>>     ALTER SESSION SET `planner.width.max_per_node` = <a number>
>>
>> The value is a bit tricky. Drill normally creates a number of fragments
>> equal to 70% of the number of CPUs on your system. Let’s say you have 32
>> cores. If so, change the max_per_node to, say, 10 or even 5. This will
>> mean fewer sort fragments and so more memory per sort, helping compensate
>> for the “jumbo” batches in your query. Pick a number based on your actual
>> number of cores.
>>
>> As an alternative, as Ted suggested, you could create a larger number of
>> smaller files, as this would solve the batch size problem while also
>> getting the parallelization benefits that Kunal mentioned.
>>
>> That is three separate possible solutions. Try them one by one or
>> (carefully) together.
>>
>> - Paul
>>
>>>> On 11/2/17, 12:31 PM, "Yun Liu" <y....@castsoftware.com> wrote:
>>>>
>>>> Hi Kunal and Andries,
>>>>
>>>> Thanks for your reply. We need json in this case because Drill only
>>>> supports up to 65536 columns in a csv file.
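
For reference, a compact recap of the memory math Paul walks through above,
sketched in Drill SQL. The option names are the ones quoted in this thread;
the values (10 GB per node, width 5, a single sort) are the settings Yun
reported plus Paul's single-sort assumption, so the arithmetic in the
comments is illustrative rather than a guarantee of how a given Drill
version divides memory.

    -- Settings reported earlier in the thread (values illustrative).
    ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 10737418240;  -- 10 GB
    ALTER SESSION SET `planner.width.max_per_node` = 5;

    -- Paul's formula:
    --   per-sort memory = max_query_memory_per_node / (number of sorts * width per node)
    -- With a single sort: 10 GB / (1 * 5) = ~2 GB per sort, comfortably above
    -- the ~400 MB batches discussed in this thread.

    -- Inspect the current planner option values:
    SELECT * FROM sys.options WHERE name LIKE 'planner.%';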