I renamed the old one from images to images-old, and the new one from 
images-new to images. 
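
For anyone repeating this, the copy-and-swap can be sketched roughly as below. It is a minimal sketch, not a tested procedure: the helper names and the pg_num of 128 are assumptions for illustration, and the commands are only printed unless dry_run is turned off. Client I/O to the pool should be stopped while the copy runs, since cppool does not pick up changes made during the copy.

```python
import subprocess


def pool_swap_commands(pool, pg_num):
    """Build the commands to copy `pool` into a new pool and swap the names.

    `pg_num` is whatever value suits the cluster; nothing here picks it for you.
    """
    new, old = f"{pool}-new", f"{pool}-old"
    return [
        # Create the replacement pool with the desired number of PGs.
        ["ceph", "osd", "pool", "create", new, str(pg_num)],
        # Copy every object from the old pool into the new one.
        ["rados", "cppool", pool, new],
        # Move the old pool out of the way, then give the new pool its name.
        ["ceph", "osd", "pool", "rename", pool, old],
        ["ceph", "osd", "pool", "rename", new, pool],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # 128 is a placeholder value (assumption), not a recommendation.
    run(pool_swap_commands("images", 128))
```

Keeping the old pool around as images-old means the swap can be undone with two more renames if something misbehaves.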

Dave Spano
Optogenics
Systems Administrator



----- Original Message -----
From: Greg Farnum <g...@inktank.com>
To: Dave Spano <dsp...@optogenics.com>
Cc: Sébastien Han <han.sebast...@gmail.com>, ceph-devel 
<ceph-devel@vger.kernel.org>, Sage Weil <s...@inktank.com>, Wido 
den Hollander <w...@42on.com>, Sylvain Munaut 
<s.mun...@whatever-company.com>, Samuel Just 
<sam.j...@inktank.com>, Vladislav Gorbunov <vadi...@gmail.com>
Sent: Wed, 13 Mar 2013 18:52:29 -0400 (EDT)
Subject: Re: OSD memory leaks?

It sounds like maybe you didn't rename the new pool to use the old pool's name? 
Glance is looking for a specific pool to store its data in; I believe it's 
configurable but you'll need to do one or the other.
-Greg
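
One way to check which pool Glance is pointed at is to read the rbd_store_pool option from glance-api.conf. The snippet below is only a sketch: the sample config text and the rbd_store_user value are made up for illustration, and the real file location depends on the install.

```python
import configparser


def glance_rbd_pool(conf_text):
    """Return the rbd_store_pool value from glance-api.conf text, or None if unset."""
    # Interpolation is disabled so '%' characters elsewhere in the file are harmless.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("DEFAULT", "rbd_store_pool", fallback=None)


# Hypothetical excerpt of a glance-api.conf (values are assumptions).
sample = """
[DEFAULT]
default_store = rbd
rbd_store_user = glance
rbd_store_pool = images
"""

print(glance_rbd_pool(sample))
```

If the value printed does not match the name of the pool that now holds the copied images, either rename the pool or change the option, as described above.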

On Wednesday, March 13, 2013 at 3:38 PM, Dave Spano wrote:

> Sebastien,
> 
> I'm not totally sure yet, but everything is still working. 
> 
> 
> Sage and Greg, 
> I copied my glance image pool per the posting I mentioned previously, and 
everything works when I use the ceph tools. I can export rbds from the new pool 
and delete them as well.
> 
> I noticed that the copied images pool does not work with glance. 
> 
> I get this error when I try to create images in the new pool. If I put the 
old pool back, I can create images no problem. 
> 
> Is there something I'm missing in glance that I need to work with a pool 
created in bobtail? I'm using Openstack Folsom. 
> 
> File "/usr/lib/python2.7/dist-packages/glance/api/v1/images.py", line 437, in _upload
>   image_meta['size'])
> File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 244, in add
>   image_size, order)
> File "/usr/lib/python2.7/dist-packages/glance/store/rbd.py", line 207, in _create_image
>   features=rbd.RBD_FEATURE_LAYERING)
> File "/usr/lib/python2.7/dist-packages/rbd.py", line 194, in create
>   raise make_ex(ret, 'error creating image')
> PermissionError: error creating image
> 
> 
> Dave Spano 
> 
> 
> 
> 
> ----- Original Message ----- 
> 
> From: "Sébastien Han" <han.sebast...@gmail.com 
(mailto:han.sebast...@gmail.com)> 
> To: "Dave Spano" <dsp...@optogenics.com 
(mailto:dsp...@optogenics.com)> 
> Cc: "Greg Farnum" <g...@inktank.com (mailto:g...@inktank.com)>, 
"ceph-devel" <ceph-devel@vger.kernel.org 
(mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <s...@inktank.com 
(mailto:s...@inktank.com)>, "Wido den Hollander" <w...@42on.com 
(mailto:w...@42on.com)>, "Sylvain Munaut" <s.mun...@whatever-company.com 
(mailto:s.mun...@whatever-company.com)>, "Samuel Just" 
<sam.j...@inktank.com (mailto:sam.j...@inktank.com)>, "Vladislav 
Gorbunov" <vadi...@gmail.com (mailto:vadi...@gmail.com)> 
> Sent: Wednesday, March 13, 2013 3:59:03 PM 
> Subject: Re: OSD memory leaks? 
> 
> Dave, 
> 
> Just to be sure, did the log max recent=10000 _completely_ stop the 
> memory leak or did it slow it down? 
> 
> Thanks! 
> -- 
> Regards, 
> Sébastien Han. 
> 
> 
> On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano <dsp...@optogenics.com 
(mailto:dsp...@optogenics.com)> wrote: 
> > Lol. I'm totally fine with that. My glance images pool isn't used too 
often. I'm going to give that a try today and see what happens. 
> > 
> > I'm still crossing my fingers, but since I added log max recent=10000 
to ceph.conf, I've been okay despite the improper pg_num, and a lot of 
scrubbing/deep scrubbing yesterday. 
> > 
> > Dave Spano 
> > 
> > 
> > 
> > 
> > ----- Original Message ----- 
> > 
> > From: "Greg Farnum" <g...@inktank.com 
(mailto:g...@inktank.com)> 
> > To: "Dave Spano" <dsp...@optogenics.com 
(mailto:dsp...@optogenics.com)> 
> > Cc: "ceph-devel" <ceph-devel@vger.kernel.org 
(mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <s...@inktank.com 
(mailto:s...@inktank.com)>, "Wido den Hollander" <w...@42on.com 
(mailto:w...@42on.com)>, "Sylvain Munaut" <s.mun...@whatever-company.com 
(mailto:s.mun...@whatever-company.com)>, "Samuel Just" 
<sam.j...@inktank.com (mailto:sam.j...@inktank.com)>, "Vladislav 
Gorbunov" <vadi...@gmail.com (mailto:vadi...@gmail.com)>, "Sébastien Han" 
<han.sebast...@gmail.com (mailto:han.sebast...@gmail.com)> 
> > Sent: Tuesday, March 12, 2013 5:37:37 PM 
> > Subject: Re: OSD memory leaks? 
> > 
> > Yeah. There's not anything intelligent about that cppool mechanism. 
:) 
> > -Greg 
> > 
> > On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote: 
> > 
> > > I'd rather shut the cloud down and copy the pool to a new one 
than take any chances of corruption by using an experimental feature. My guess 
is that there cannot be any i/o to the pool while copying, otherwise you'll 
lose the changes that are happening during the copy, correct? 
> > > 
> > > Dave Spano 
> > > Optogenics 
> > > Systems Administrator 
> > > 
> > > 
> > > 
> > > ----- Original Message ----- 
> > > 
> > > From: "Greg Farnum" <g...@inktank.com 
(mailto:g...@inktank.com)> 
> > > To: "Sébastien Han" <han.sebast...@gmail.com 
(mailto:han.sebast...@gmail.com)> 
> > > Cc: "Dave Spano" <dsp...@optogenics.com 
(mailto:dsp...@optogenics.com)>, "ceph-devel" <ceph-devel@vger.kernel.org 
(mailto:ceph-devel@vger.kernel.org)>, "Sage Weil" <s...@inktank.com 
(mailto:s...@inktank.com)>, "Wido den Hollander" <w...@42on.com 
(mailto:w...@42on.com)>, "Sylvain Munaut" <s.mun...@whatever-company.com 
(mailto:s.mun...@whatever-company.com)>, "Samuel Just" 
<sam.j...@inktank.com (mailto:sam.j...@inktank.com)>, "Vladislav 
Gorbunov" <vadi...@gmail.com (mailto:vadi...@gmail.com)> 
> > > Sent: Tuesday, March 12, 2013 4:20:13 PM 
> > > Subject: Re: OSD memory leaks? 
> > > 
> > > On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: 
> > > > Well, to avoid unnecessary data movement, there is also an 
> > > > _experimental_ feature to change the number of PGs in a pool on the fly. 
> > > > 
> > > > ceph osd pool set <poolname> pg_num <numpgs> --allow-experimental-feature 
> > > Don't do that. We've got a set of 3 patches which fix bugs we 
know about that aren't in bobtail yet, and I'm sure there's more we aren't 
aware of… 
> > > -Greg 
> > > 
> > > Software Engineer #42 @ http://inktank.com | http://ceph.com 
> > > 
> > > > 
> > > > Cheers! 
> > > > -- 
> > > > Regards, 
> > > > Sébastien Han. 
> > > > 
> > > > 
> > > > On Tue, Mar 12, 2013 at 7:09 PM, Dave Spano 
<dsp...@optogenics.com (mailto:dsp...@optogenics.com)> wrote: 
> > > > > Disregard my previous question. I found my answer in 
the post below. Absolutely brilliant! I thought I was screwed! 
> > > > > 
> > > > > 
http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/8924 
> > > > > 
> > > > > Dave Spano 
> > > > > Optogenics 
> > > > > Systems Administrator 
> > > > > 
> > > > > 
> > > > > 
> > > > > ----- Original Message ----- 
> > > > > 
> > > > > From: "Dave Spano" <dsp...@optogenics.com 
(mailto:dsp...@optogenics.com)> 
> > > > > To: "Sébastien Han" <han.sebast...@gmail.com 
(mailto:han.sebast...@gmail.com)> 
> > > > > Cc: "Sage Weil" <s...@inktank.com 
(mailto:s...@inktank.com)>, "Wido den Hollander" <w...@42on.com 
(mailto:w...@42on.com)>, "Gregory Farnum" <g...@inktank.com 
(mailto:g...@inktank.com)>, "Sylvain Munaut" 
<s.mun...@whatever-company.com (mailto:s.mun...@whatever-company.com)>, 
"ceph-devel" <ceph-devel@vger.kernel.org 
(mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.j...@inktank.com 
(mailto:sam.j...@inktank.com)>, "Vladislav Gorbunov" <vadi...@gmail.com 
(mailto:vadi...@gmail.com)> 
> > > > > Sent: Tuesday, March 12, 2013 1:41:21 PM 
> > > > > Subject: Re: OSD memory leaks? 
> > > > > 
> > > > > 
> > > > > If one were stupid enough to have their pg_num and 
pgp_num set to 8 on two of their pools, how could you fix that? 
> > > > > 
> > > > > 
> > > > > Dave Spano 
> > > > > 
> > > > > 
> > > > > 
> > > > > ----- Original Message ----- 
> > > > > 
> > > > > From: "Sébastien Han" <han.sebast...@gmail.com 
(mailto:han.sebast...@gmail.com)> 
> > > > > To: "Vladislav Gorbunov" <vadi...@gmail.com 
(mailto:vadi...@gmail.com)> 
> > > > > Cc: "Sage Weil" <s...@inktank.com 
(mailto:s...@inktank.com)>, "Wido den Hollander" <w...@42on.com 
(mailto:w...@42on.com)>, "Gregory Farnum" <g...@inktank.com 
(mailto:g...@inktank.com)>, "Sylvain Munaut" 
<s.mun...@whatever-company.com (mailto:s.mun...@whatever-company.com)>, 
"Dave Spano" <dsp...@optogenics.com (mailto:dsp...@optogenics.com)>, 
"ceph-devel" <ceph-devel@vger.kernel.org 
(mailto:ceph-devel@vger.kernel.org)>, "Samuel Just" <sam.j...@inktank.com 
(mailto:sam.j...@inktank.com)> 
> > > > > Sent: Tuesday, March 12, 2013 9:43:44 AM 
> > > > > Subject: Re: OSD memory leaks? 
> > > > > 
> > > > > > Sorry, i mean pg_num and pgp_num on all pools. 
Shown by the "ceph osd 
> > > > > > dump | grep 'rep size'" 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Well it's still 450 each... 
> > > > > 
> > > > > > The default pg_num value 8 is NOT suitable for 
big cluster. 
> > > > > 
> > > > > Thanks I know, I'm not new with Ceph. What's your 
point here? I 
> > > > > already said that pg_num was 450... 
> > > > > -- 
> > > > > Regards, 
> > > > > Sébastien Han. 
> > > > > 
> > > > > 
> > > > > On Tue, Mar 12, 2013 at 2:00 PM, Vladislav Gorbunov 
<vadi...@gmail.com (mailto:vadi...@gmail.com)> wrote: 
> > > > > > Sorry, i mean pg_num and pgp_num on all pools. 
Shown by the "ceph osd 
> > > > > > dump | grep 'rep size'" 
> > > > > > The default pg_num value 8 is NOT suitable for 
big cluster. 
> > > > > > 
> > > > > > 2013/3/13 Sébastien Han 
<han.sebast...@gmail.com (mailto:han.sebast...@gmail.com)>: 
> > > > > > > Replica count has been set to 2. 
> > > > > > > 
> > > > > > > Why? 
> > > > > > > -- 
> > > > > > > Regards, 
> > > > > > > Sébastien Han. 
> > > > > > > 
> > > > > > > 
> > > > > > > On Tue, Mar 12, 2013 at 12:45 PM, Vladislav 
Gorbunov <vadi...@gmail.com (mailto:vadi...@gmail.com)> wrote: 
> > > > > > > > > FYI I'm using 450 pgs for my 
pools. 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Please, can you show the number of 
object replicas? 
> > > > > > > > 
> > > > > > > > ceph osd dump | grep 'rep size' 
> > > > > > > > 
> > > > > > > > Vlad Gorbunov 
> > > > > > > > 
> > > > > > > > 2013/3/5 Sébastien Han 
<han.sebast...@gmail.com (mailto:han.sebast...@gmail.com)>: 
> > > > > > > > > FYI I'm using 450 pgs for my 
pools. 
> > > > > > > > > 
> > > > > > > > > -- 
> > > > > > > > > Regards, 
> > > > > > > > > Sébastien Han. 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Fri, Mar 1, 2013 at 8:10 PM, 
Sage Weil <s...@inktank.com (mailto:s...@inktank.com)> wrote: 
> > > > > > > > > > 
> > > > > > > > > > On Fri, 1 Mar 2013, Wido den 
Hollander wrote: 
> > > > > > > > > > > On 02/23/2013 01:44 AM, 
Sage Weil wrote: 
> > > > > > > > > > > > On Fri, 22 Feb 2013, Sébastien Han wrote: 
> > > > > > > > > > > > > Hi all, 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I finally got 
a core dump. 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I did it with 
a kill -SEGV on the OSD process. 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Hope we will 
get something out of it :-). 
> > > > > > > > > > > > 
> > > > > > > > > > > > AHA! We have a theory. The pg log isn't trimmed during scrub (because the 
> > > > > > > > > > > > old scrub code required that), but the new (deep) scrub can take a very 
> > > > > > > > > > > > long time, which means the pg log will eat RAM in the meantime, 
> > > > > > > > > > > > especially under high IOPS. 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Does the number of PGs 
influence the memory leak? So my theory is that when 
> > > > > > > > > > > you have a high number 
of PGs with a low number of objects per PG you don't 
> > > > > > > > > > > see the memory leak. 
> > > > > > > > > > > 
> > > > > > > > > > > I saw the memory leak on 
a RBD system where a pool had just 8 PGs, but after 
> > > > > > > > > > > going to 1024 PGs in a 
new pool it seemed to be resolved. 
> > > > > > > > > > > 
> > > > > > > > > > > I've asked somebody else 
to try your patch since he's still seeing it on his 
> > > > > > > > > > > systems. Hopefully that 
gives us some results. 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > The PGs were active+clean 
when you saw the leak? There is a problem (that 
> > > > > > > > > > we just fixed in master) 
where pg logs aren't trimmed for degraded PGs. 
> > > > > > > > > > 
> > > > > > > > > > sage 
> > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > Wido 
> > > > > > > > > > > 
> > > > > > > > > > > > Can you try 
wip-osd-log-trim (which is bobtail + a simple patch) and see 
> > > > > > > > > > > > if that seems to 
work? Note that that patch shouldn't be run in a mixed 
> > > > > > > > > > > > argonaut+bobtail 
cluster, since it isn't properly checking if the scrub is 
> > > > > > > > > > > > classic or chunky/deep. 
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks! 
> > > > > > > > > > > > sage 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > > -- 
> > > > > > > > > > > > > Regards, 
> > > > > > > > > > > > > Sébastien Han. 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Fri, Jan 
11, 2013 at 7:13 PM, Gregory Farnum <g...@inktank.com 
(mailto:g...@inktank.com)> wrote: 
> > > > > > > > > > > > > > On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han <han.sebast...@gmail.com 
(mailto:han.sebast...@gmail.com)> 
> > > > > > > > > > > > > > wrote: 
> > > > > > > > > > > > > > > > Is osd.1 using the heap profiler as well? Keep in mind that active use 
> > > > > > > > > > > > > > > > of the memory profiler will itself cause memory usage to increase; this 
> > > > > > > > > > > > > > > > sounds a bit like that to me since it's staying stable at a large but 
> > > > > > > > > > > > > > > > finite portion of total memory. 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Well, the memory consumption was already high before the profiler was 
> > > > > > > > > > > > > > > started. So yes, with the memory profiler enabled an OSD might consume more 
> > > > > > > > > > > > > > > memory, but this doesn't cause the memory leaks. 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > My 
concern is that maybe you saw a leak but when you restarted with 
> > > > > > > > > > > > > > the 
memory profiling you lost whatever conditions caused it. 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Any ideas? Nothing to say about my scrubbing theory? 
> > > > > > > > > > > > > > I like 
it, but Sam indicates that without some heap dumps which 
> > > > > > > > > > > > > > capture 
the actual leak then scrub is too large to effectively code 
> > > > > > > > > > > > > > review 
for leaks. :( 
> > > > > > > > > > > > > > -Greg 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > -- 
> > > > > > > > > > > > > To unsubscribe 
from this list: send the line "unsubscribe ceph-devel" in 
> > > > > > > > > > > > > the body of a 
message to majord...@vger.kernel.org (mailto:majord...@vger.kernel.org) 
> > > > > > > > > > > > > More majordomo 
info at http://vger.kernel.org/majordomo-info.html 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > -- 
> > > > > > > > > > > Wido den Hollander 
> > > > > > > > > > > 42on B.V. 
> > > > > > > > > > > 
> > > > > > > > > > > Phone: +31 (0)20 700 
9902 
> > > > > > > > > > > Skype: contact42on 
> > > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 




