By "instance", you mean minor fragments, correct ? And does the *planner.memory.max_query_memory_per_node* limit apply to each *type* of minor fragments individually ? Assuming the memory limit is set to *4 GB*, and the running query involves external sort and hash aggregates, should I expect the query to consume at least *8 GB* ?
Would you please point me to where in the code I can look at the implementation of this 10 GB memory limit?

Thanks,
Gelbana

On Tue, Aug 15, 2017 at 2:40 AM, Boaz Ben-Zvi <bben-...@mapr.com> wrote:

> There is this page:
> https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/
> But it seems out of date (it is correct for 1.10). It does not explain the
> hash operators, except that they run until they cannot allocate any more
> memory. This happens (undocumented) at either 10 GB per instance, or when
> there is no more memory left on the node.
>
> Boaz
>
> On 8/14/17, 1:16 PM, "Muhammad Gelbana" <m.gelb...@gmail.com> wrote:
>
> I'm not sure which version I was using when that happened. But those are
> some precise details you've mentioned! Is this mentioned somewhere in the
> docs?
>
> Thanks a lot.
>
> On Aug 14, 2017 9:00 PM, "Boaz Ben-Zvi" <bben-...@mapr.com> wrote:
>
> > Did your query include a hash join?
> >
> > As of 1.11, only the External Sort and Hash Aggregate operators obey the
> > memory limit (that is, the “max query memory per node” figure is divided
> > among all the instances of these operators).
> > The Hash Join (as before 1.11) still does not take part in this memory
> > allocation scheme, and each instance may use up to 10 GB.
> >
> > Also in 1.11, the Hash Aggregate may “fall back” to the 1.10 behavior
> > (same as the Hash Join, i.e. up to 10 GB) in case there is too little
> > memory per instance (because it cannot perform memory spilling, which
> > requires some minimal memory to hold multiple batches).
> >
> > Thanks,
> >
> > Boaz
> >
> > On 8/11/17, 4:25 PM, "Muhammad Gelbana" <m.gelb...@gmail.com> wrote:
> >
> > Sorry for the long subject!
> >
> > I'm running a single query on a single-node Drill setup.
> >
> > I assumed that setting the *planner.memory.max_query_memory_per_node*
> > property controls the maximum amount of memory (in bytes) for each query
> > running on a single node. Which means that in my setup, the *direct.used*
> > metric in the metrics page should never exceed that value.
> >
> > But it did, and drastically. I assigned *34359738368* (32 GB) to the
> > *planner.memory.max_query_memory_per_node* option, but while monitoring
> > the *direct.used* metric, I found that it reached *51640484458* (~48 GB).
> >
> > What did I mistakenly do or misinterpret?
> >
> > Thanks,
> > Gelbana
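(For anyone reproducing the original experiment: a minimal sketch of setting and inspecting the option over JDBC, assuming a Drillbit reachable on localhost and the sys.options schema of the Drill 1.x line; the num_val column name is an assumption from that era and may differ in later releases.)

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class QueryMemoryOption {
        public static void main(String[] args) throws Exception {
            // Assumes a local Drillbit; adjust the connection URL for your cluster.
            try (Connection conn = DriverManager.getConnection("jdbc:drill:drillbit=localhost");
                 Statement stmt = conn.createStatement()) {

                // Raise the per-node query memory limit to 32 GB for this session only.
                stmt.execute("ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 34359738368");

                // Read the value back from the system options table.
                try (ResultSet rs = stmt.executeQuery(
                        "SELECT name, num_val FROM sys.options "
                        + "WHERE name = 'planner.memory.max_query_memory_per_node'")) {
                    while (rs.next()) {
                        System.out.println(rs.getString("name") + " = " + rs.getLong("num_val"));
                    }
                }
            }
        }
    }

Note that, per the thread above, this option only bounds the External Sort and Hash Aggregate instances (as of 1.11); Hash Join instances may each take up to 10 GB on top of it, which is one way *direct.used* can climb well past the configured figure.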