Re: Suspicious direct memory consumption when running queries concurrently

Abdel Hakim Deneche Mon, 27 Jul 2015 12:40:02 -0700

Yes, and also: why do we keep creating new chunks? How come the first
iteration was able to run with less than half the direct memory used by the
later iterations? I thought maybe the chunks weren't distributed evenly
between Netty's memory arenas, so new iterations may actually hit other
direct arenas, but I'm not sure this is the case.
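
To put the figures from the original report (quoted below) in terms of
chunks, here is a minimal sketch of the arithmetic. The class and method
names are hypothetical, and it assumes "GB" and "MB" mean binary units; only
the 16MB chunk size and the 288, ~2GB, ~6GB and 8GB figures come from this
thread.

```java
public final class ChunkMath {
    // Netty chunk size as stated in the original report (assumed to be 16 MiB).
    public static final long CHUNK_SIZE_BYTES = 16L * 1024 * 1024;

    private static final long MB = 1024L * 1024;
    private static final long GB = 1024L * MB;

    // Direct memory held by the given number of allocated chunks.
    public static long chunksToBytes(long chunks) {
        return chunks * CHUNK_SIZE_BYTES;
    }

    // Number of chunks needed to hold the given number of bytes (rounded up).
    public static long bytesToChunks(long bytes) {
        return (bytes + CHUNK_SIZE_BYTES - 1) / CHUNK_SIZE_BYTES;
    }

    public static void main(String[] args) {
        // The 288 nearly empty chunks seen between iterations.
        System.out.println("288 chunks = " + chunksToBytes(288) / MB + " MB");
        // Footprint after the first iteration and after the later ones.
        System.out.println("~2GB = " + bytesToChunks(2 * GB) + " chunks");
        System.out.println("~6GB = " + bytesToChunks(6 * GB) + " chunks");
        // Upper bound set by the 8GB max direct memory.
        System.out.println("8GB  = " + bytesToChunks(8 * GB) + " chunks");
    }
}
```

Under these assumptions, the 288 idle chunks alone pin about 4.5GB, more than
twice what the first iteration needed in total.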


We are trying to figure out a small set of queries to help reproduce this
issue more easily, and will take a jmap dump to get more information.
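
As a lightweight complement to a jmap dump, the JVM's own view of direct
buffer usage can be sampled between iterations. The sketch below is not what
we ran (our numbers came from the "drill.allocator" logger); the class name
is hypothetical, and it assumes Netty's chunks are backed by regular NIO
direct buffers, which may not hold for every Netty configuration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public final class DirectMemorySnapshot {

    // Bytes currently used by the JVM's "direct" buffer pool, or -1 if the
    // pool is not exposed by this JVM.
    public static long directBytesUsed() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate one 16MB direct buffer to stand in for a Netty chunk.
        ByteBuffer chunk = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directBytesUsed();
        System.out.println("direct bytes before: " + before);
        System.out.println("direct bytes after:  " + after);
        // Reference the buffer so it stays reachable until after the second sample.
        System.out.println("buffer capacity:     " + chunk.capacity());
    }
}
```

Sampling `directBytesUsed()` after each iteration would only show the total,
not which thread or arena owns the chunks, so it does not replace the jmap
analysis.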

On Mon, Jul 27, 2015 at 11:04 AM, Jacques Nadeau <[email protected]> wrote:

> It sounds like your statement is that we're caching too many unused
> chunks.  Hanifi and I previously discussed implementing a separate flushing
> mechanism to release unallocated chunks that are hanging around.  The main
> question is: why are so many chunks hanging around, and what threads are
> they associated with?  A jmap dump and analysis should allow you to
> determine which thread owns the excess chunks.  My guess would be the RPC
> pool since those are long lasting (as opposed to the WorkManager pool,
> which is contracting).
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Jul 27, 2015 at 9:53 AM, Abdel Hakim Deneche <
> [email protected]>
> wrote:
>
> > When running a set of queries (mostly window functions) concurrently on a
> > single drillbit with 8GB max direct memory, we are seeing a continuous
> > increase in direct memory allocation.
> >
> > We repeat the following steps multiple times:
> > - we launch an "iteration" of tests that will run all queries in a random
> > order, 10 queries at a time
> > - after the iteration finishes, we wait for a couple of minutes to give
> > Drill time to release the memory being held by the finishing fragments
> >
> > Using Drill's memory logger ("drill.allocator"), we were able to get
> > snapshots of how memory was used internally by Netty. We only focused on
> > the number of allocated chunks: if we take this number and multiply it by
> > 16MB (Netty's chunk size), we get approximately the same value as Drill's
> > reported direct memory allocation.
> > Here is a graph that shows the evolution of the number of allocated chunks
> > over a 500-iteration run (I'm working on improving the plots):
> >
> > http://bit.ly/1JL6Kp3
> >
> > In this specific case, after the first iteration Drill was allocating ~2GB
> > of direct memory; this number kept rising after each iteration to ~6GB.
> > We suspect this caused one of our previous runs to crash the JVM.
> >
> > If we only focus on the log lines between iterations (when Drill's memory
> > usage is below 10MB), then every allocated chunk is at most 2% used. At
> > some point we end up with 288 nearly empty chunks, yet the next iteration
> > still causes more chunks to be allocated!
> >
> > Is this expected?
> >
> > PS: I am running more tests and will update this thread with more
> > information.
> >
> > --
> >
> > Abdelhakim Deneche
> >
> > Software Engineer
> >
> >   <http://www.mapr.com/>
> >
> >
>



-- 

Abdelhakim Deneche

Software Engineer

  <http://www.mapr.com/>

