Can you give me a single node repro?
On Jul 31, 2015 9:20 PM, "Abdel Hakim Deneche" <adene...@maprtech.com>
wrote:

> I tried getting a jmap dump multiple times without success; each time it
> crashes the JVM with the following exception:
>
> > Dumping heap to /home/mapr/private-sql-hadoop-test/framework/myfile.hprof ...
> > Exception in thread "main" java.io.IOException: Premature EOF
> >         at sun.tools.attach.HotSpotVirtualMachine.readInt(HotSpotVirtualMachine.java:248)
> >         at sun.tools.attach.LinuxVirtualMachine.execute(LinuxVirtualMachine.java:199)
> >         at sun.tools.attach.HotSpotVirtualMachine.executeCommand(HotSpotVirtualMachine.java:217)
> >         at sun.tools.attach.HotSpotVirtualMachine.dumpHeap(HotSpotVirtualMachine.java:180)
> >         at sun.tools.jmap.JMap.dump(JMap.java:242)
> >         at sun.tools.jmap.JMap.main(JMap.java:140)
>
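In case it is useful while we dig into this: when the attach-based jmap keeps
failing like that, one workaround is to trigger the dump from inside the target
JVM via the HotSpotDiagnosticMXBean instead of the external attach mechanism.
A minimal sketch (the output path is just an example, and it assumes you can
run a small snippet inside the drillbit process):

    import java.lang.management.ManagementFactory;
    import com.sun.management.HotSpotDiagnosticMXBean;

    public class HeapDumpSketch {
      public static void main(String[] args) throws Exception {
        // Diagnostic bean of the JVM this code runs in.
        HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
            ManagementFactory.getPlatformMBeanServer(),
            "com.sun.management:type=HotSpotDiagnostic",
            HotSpotDiagnosticMXBean.class);
        // live = true dumps only reachable objects, which keeps the file smaller.
        diag.dumpHeap("/tmp/drillbit.hprof", true);
      }
    }

jmap -F (which goes through the serviceability agent rather than the attach
API) is another option, though it is much slower.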
>
> On Mon, Jul 27, 2015 at 3:45 PM, Jacques Nadeau <jacq...@dremio.com>
> wrote:
>
> > An allocate -> release cycle all on the same thread goes into a per-thread
> > cache.
> >
> > A bunch of Netty arena settings are configurable.  The big issue, I believe,
> > is that the limits are soft limits implemented by the allocation-time
> > release mechanism.  As such, if you allocate a bunch of memory, then
> > release it all, that won't necessarily trigger any actual chunk releases.
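To make that concrete, here is a minimal standalone sketch of the behavior
described above (assuming Netty 4.x's pooled allocator; the buffer size and
count are arbitrary). Freed buffers are parked in the per-thread cache or
handed back to their parent chunk, but the 16 MB chunks themselves are not
necessarily returned to the OS:

    import io.netty.buffer.ByteBuf;
    import io.netty.buffer.PooledByteBufAllocator;

    public class ChunkRetentionSketch {
      public static void main(String[] args) {
        // Shared pooled allocator: a fixed set of arenas plus a per-thread cache.
        PooledByteBufAllocator alloc = PooledByteBufAllocator.DEFAULT;

        // Allocate ~16 MB of direct buffers on this thread, which forces the
        // arena to grab at least one 16 MB chunk from the OS.
        ByteBuf[] bufs = new ByteBuf[1024];
        for (int i = 0; i < bufs.length; i++) {
          bufs[i] = alloc.directBuffer(16 * 1024);
        }

        // Release everything on the same thread. The buffers land in the
        // PoolThreadCache or go back to their chunk, but nothing here forces
        // the allocator to shrink its direct memory footprint.
        for (ByteBuf b : bufs) {
          b.release();
        }

        // The cache is only trimmed as a side effect of later allocations
        // (the allocation-time "soft limit" mentioned above), so a process
        // that goes idle after a burst can keep nearly empty chunks around.
      }
    }

The arena and cache sizes are tunable via system properties (for example
io.netty.allocator.numDirectArenas and io.netty.allocator.normalCacheSize),
but as far as I can tell none of them cause chunks to be released eagerly
when buffers are freed.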
> >
> > --
> > Jacques Nadeau
> > CTO and Co-Founder, Dremio
> >
> > On Mon, Jul 27, 2015 at 12:47 PM, Abdel Hakim Deneche <
> > adene...@maprtech.com> wrote:
> >
> > > @Jacques, my understanding is that chunks are not owned by a specific
> > > thread; they are part of a specific memory arena, which is in turn only
> > > accessed by specific threads. Do you want me to find which threads are
> > > associated with the arenas where we have hanging chunks?
> > >
> > >
> > > On Mon, Jul 27, 2015 at 11:04 AM, Jacques Nadeau <jacq...@dremio.com>
> > > wrote:
> > >
> > > > It sounds like your statement is that we're caching too many unused
> > > > chunks.  Hanifi and I previously discussed implementing a separate
> > > > flushing mechanism to release unallocated chunks that are hanging
> > > > around.  The main question is: why are so many chunks hanging around,
> > > > and which threads are they associated with?  A jmap dump and analysis
> > > > should allow you to determine which threads own the excess chunks.  My
> > > > guess would be the RPC pool, since those threads are long lasting (as
> > > > opposed to the WorkManager pool, which is contracting).
> > > >
> > > > --
> > > > Jacques Nadeau
> > > > CTO and Co-Founder, Dremio
> > > >
> > > > On Mon, Jul 27, 2015 at 9:53 AM, Abdel Hakim Deneche <
> > > > adene...@maprtech.com>
> > > > wrote:
> > > >
> > > > > When running a set of (mostly window function) queries concurrently
> > > > > on a single drillbit with an 8GB max direct memory, we are seeing a
> > > > > continuous increase in direct memory allocation.
> > > > >
> > > > > We repeat the following steps multiple times:
> > > > > - we launch an "iteration" of tests that runs all queries in a random
> > > > > order, 10 queries at a time
> > > > > - after the iteration finishes, we wait a couple of minutes to give
> > > > > Drill time to release the memory held by the finishing fragments
> > > > >
> > > > > Using Drill's memory logger ("drill.allocator") we were able to get
> > > > > snapshots of how memory is used internally by Netty. We only focused
> > > > > on the number of allocated chunks: if we take this number and
> > > > > multiply it by 16MB (Netty's chunk size), we get approximately the
> > > > > same value reported by Drill's direct memory allocation.
> > > > > Here is a graph that shows the evolution of the number of allocated
> > > > > chunks over a 500-iteration run (I'm working on improving the plots):
> > > > >
> > > > > http://bit.ly/1JL6Kp3
> > > > >
> > > > > In this specific case, Drill was allocating ~2GB of direct memory
> > > > > after the first iteration, and this number kept rising with each
> > > > > iteration, up to ~6GB. We suspect this caused one of our previous
> > > > > runs to crash the JVM.
> > > > >
> > > > > If we only focus on the log lines between iterations (when Drill's
> > > > > memory usage is below 10MB), all allocated chunks are at most 2%
> > > > > used. At some point we end up with 288 nearly empty chunks, yet the
> > > > > next iteration still causes more chunks to be allocated!
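For scale: 288 chunks x 16 MB = 4,608 MB, i.e. roughly 4.5 GB of direct memory
held by chunks that are almost entirely empty.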
> > > > >
> > > > > Is this expected?
> > > > >
> > > > > PS: I am running more tests and will update this thread with more
> > > > > information.
> > > > >