On Fri, 1 Mar 2013, Wido den Hollander wrote:
> On 02/23/2013 01:44 AM, Sage Weil wrote:
> > On Fri, 22 Feb 2013, Sébastien Han wrote:
> > > Hi all,
> > > 
> > > I finally got a core dump.
> > > 
> > > I did it with a kill -SEGV on the OSD process.
> > > 
> > > https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008
> > > 
> > > Hope we will get something out of it :-).
> > 
> > AHA!  We have a theory.  The pg log isn't trimmed during scrub (because the
> > old scrub code required that), but the new (deep) scrub can take a very
> > long time, which means the pg log will eat RAM in the meantime,
> > especially under high IOPS.
> > 
> 
> Does the number of PGs influence the memory leak? So my theory is that when
> you have a high number of PGs with a low number of objects per PG you don't
> see the memory leak.
> 
> I saw the memory leak on an RBD system where a pool had just 8 PGs, but after
> going to 1024 PGs in a new pool it seemed to be resolved.
> 
> I've asked somebody else to try your patch since he's still seeing it on his
> systems. Hopefully that gives us some results.

The PGs were active+clean when you saw the leak?  There is a problem (that 
we just fixed in master) where pg logs aren't trimmed for degraded PGs.

sage

> 
> Wido
> 
> > Can you try wip-osd-log-trim (which is bobtail + a simple patch) and see
> > if that seems to work?  Note that that patch shouldn't be run in a mixed
> > argonaut+bobtail cluster, since it isn't properly checking if the scrub is
> > classic or chunky/deep.
> > 
> > Thanks!
> > sage
> > 
> > 
> > > --
> > > Regards,
> > > Sébastien Han.
> > > 
> > > 
> > > On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum <g...@inktank.com> wrote:
> > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebast...@gmail.com>
> > > > wrote:
> > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active
> > > > > > > use of the memory profiler will itself cause memory usage to
> > > > > > > increase; this sounds a bit like that to me since it's staying
> > > > > > > stable at a large but finite portion of total memory.
> > > > > 
> > > > > Well, the memory consumption was already high before the profiler was
> > > > > > started. So yes, with the memory profiler enabled an OSD might consume
> > > > > > more memory, but this doesn't cause the memory leaks.
> > > > 
> > > > My concern is that maybe you saw a leak but when you restarted with
> > > > the memory profiling you lost whatever conditions caused it.
> > > > 
> > > > > Any ideas? Nothing to say about my scrubbing theory?
> > > > I like it, but Sam indicates that without some heap dumps which
> > > > capture the actual leak then scrub is too large to effectively code
> > > > review for leaks. :(
> > > > -Greg
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> > > the body of a message to majord...@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > 
> > > 
> > 
> 
> 
> -- 
> Wido den Hollander
> 42on B.V.
> 
> Phone: +31 (0)20 700 9902
> Skype: contact42on
> 
> 
