Re: OSD memory leaks?

2013-03-14 Thread Dave Spano
ceph-devel@vger.kernel.org, Sage Weil s...@inktank.com, Wido den Hollander w...@42on.com, Sylvain Munaut s.mun...@whatever-company.com, Samuel Just sam.j...@inktank.com, Vladislav Gorbunov vadi...@gmail.com Sent: Wednesday, March 13, 2013 3:59:03 PM Subject: Re: OSD memory leaks? Dave, Just

Re: OSD memory leaks?

2013-03-13 Thread Dave Spano
Just sam.j...@inktank.com, Vladislav Gorbunov vadi...@gmail.com, Sébastien Han han.sebast...@gmail.com Sent: Tuesday, March 12, 2013 5:37:37 PM Subject: Re: OSD memory leaks? Yeah. There's not anything intelligent about that cppool mechanism. :) -Greg On Tuesday, March 12, 2013 at 2:15 PM

Re: OSD memory leaks?

2013-03-13 Thread Sébastien Han
, Sébastien Han han.sebast...@gmail.com Sent: Tuesday, March 12, 2013 5:37:37 PM Subject: Re: OSD memory leaks? Yeah. There's not anything intelligent about that cppool mechanism. :) -Greg On Tuesday, March 12, 2013 at 2:15 PM, Dave Spano wrote: I'd rather shut the cloud down and copy the pool

Re: OSD memory leaks?

2013-03-13 Thread Dave Spano
Just sam.j...@inktank.com, Vladislav Gorbunov vadi...@gmail.com Sent: Wednesday, March 13, 2013 3:59:03 PM Subject: Re: OSD memory leaks? Dave, Just to be sure, did the log max recent=1 _completely_ stop the memory leak or did it slow it down? Thanks! -- Regards, Sébastien Han

Re: OSD memory leaks?

2013-03-13 Thread Greg Farnum
: Wednesday, March 13, 2013 3:59:03 PM Subject: Re: OSD memory leaks? Dave, Just to be sure, did the log max recent=1 _completely_ stop the memory leak or did it slow it down? Thanks! -- Regards, Sébastien Han. On Wed, Mar 13, 2013 at 2:12 PM, Dave Spano dsp

Re: OSD memory leaks?

2013-03-13 Thread Dave Spano
-0400 (EDT) Subject: Re: OSD memory leaks? It sounds like maybe you didn't rename the new pool to use the old pool's name? Glance is looking for a specific pool to store its data in; I believe it's configurable but you'll need to do one or the other. -Greg On Wednesday, March 13, 2013 at 3:38 PM

Re: OSD memory leaks?

2013-03-13 Thread Josh Durgin
On 03/13/2013 05:05 PM, Dave Spano wrote: I renamed the old one from images to images-old, and the new one from images-new to images. This reminds me of a problem you might hit with this: RBD clones track the parent image pool by id, so they'll continue working after the pool is renamed. If
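The copy-and-rename approach discussed in this thread can be sketched as a short shell script. This is a minimal sketch, not the exact procedure anyone in the thread ran: the `run` wrapper, the `DRY_RUN` switch, the `recreate_pool` function name and the `-new`/`-old` suffixes are illustrative assumptions, and the target PG count is a placeholder to choose for your own cluster. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it. DRY_RUN is an assumption of this sketch, not a Ceph option.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Recreate a pool with a new PG count by copying it and swapping the names.
#   $1 = existing pool name, $2 = PG count for the new pool (placeholder value)
recreate_pool() {
  pool=$1
  pg_num=$2
  run ceph osd pool create "${pool}-new" "$pg_num" &&
  run rados cppool "$pool" "${pool}-new" &&
  run ceph osd pool rename "$pool" "${pool}-old" &&
  run ceph osd pool rename "${pool}-new" "$pool"
}
```

Calling `recreate_pool images 450` prints the four commands in order; setting `DRY_RUN=0` would execute them. As noted in the thread, the pool copy is not intelligent, so stopping clients first (as Dave preferred) seems the safer choice, and RBD clones track their parent pool by id, so they keep pointing at the original pool after the rename.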

Re: OSD memory leaks?

2013-03-12 Thread Vladislav Gorbunov
FYI I'm using 450 pgs for my pools. Please, can you show the number of object replicas? ceph osd dump | grep 'rep size' Vlad Gorbunov 2013/3/5 Sébastien Han han.sebast...@gmail.com: FYI I'm using 450 pgs for my pools. -- Regards, Sébastien Han. On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil

Re: OSD memory leaks?

2013-03-12 Thread Sébastien Han
Replica count has been set to 2. Why? -- Regards, Sébastien Han. On Tue, Mar 12, 2013 at 12:45 PM, Vladislav Gorbunov vadi...@gmail.com wrote: FYI I'm using 450 pgs for my pools. Please, can you show the number of object replicas? ceph osd dump | grep 'rep size' Vlad Gorbunov 2013/3/5

Re: OSD memory leaks?

2013-03-12 Thread Vladislav Gorbunov
Sorry, I mean pg_num and pgp_num on all pools. Shown by the ceph osd dump | grep 'rep size' The default pg_num value 8 is NOT suitable for a big cluster. 2013/3/13 Sébastien Han han.sebast...@gmail.com: Replica count has been set to 2. Why? -- Regards, Sébastien Han. On Tue, Mar 12, 2013
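To pull just the pool name, pg_num and pgp_num out of the `ceph osd dump` output mentioned above, a small awk filter could be used. This is a sketch under assumptions: the function name `pool_pg_counts` is made up for illustration, and it assumes pool lines start with `pool <id> '<name>'` and contain `pg_num <n>` and `pgp_num <n>` fields, which may differ between Ceph releases.

```shell
#!/bin/sh
# Hypothetical filter: reads `ceph osd dump` output on stdin and prints
# "<pool name> <pg_num> <pgp_num>" for every line that starts with "pool ".
# Assumed line shape (may vary by release):
#   pool 3 'images' rep size 2 ... pg_num 8 pgp_num 8 ...
pool_pg_counts() {
  awk -v q="'" '/^pool / {
    name = $3; pg = ""; pgp = ""
    gsub(q, "", name)                      # strip the quotes around the pool name
    for (i = 1; i < NF; i++) {             # find each keyword and take the next field
      if ($i == "pg_num")  pg  = $(i + 1)
      if ($i == "pgp_num") pgp = $(i + 1)
    }
    print name, pg, pgp
  }'
}
```

Usage would be `ceph osd dump | pool_pg_counts`; pools still showing the default of 8 stand out immediately.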

Re: OSD memory leaks?

2013-03-12 Thread Sébastien Han
Sorry, I mean pg_num and pgp_num on all pools. Shown by the ceph osd dump | grep 'rep size' Well it's still 450 each... The default pg_num value 8 is NOT suitable for a big cluster. Thanks I know, I'm not new to Ceph. What's your point here? I already said that pg_num was 450... -- Regards,

Re: OSD memory leaks?

2013-03-12 Thread Dave Spano
...@gmail.com Sent: Tuesday, March 12, 2013 1:41:21 PM Subject: Re: OSD memory leaks? If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? Dave Spano - Original Message - From: Sébastien Han han.sebast...@gmail.com

Re: OSD memory leaks?

2013-03-12 Thread Sébastien Han
Gorbunov vadi...@gmail.com Sent: Tuesday, March 12, 2013 1:41:21 PM Subject: Re: OSD memory leaks? If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? Dave Spano - Original Message - From: Sébastien Han han.sebast

Re: OSD memory leaks?

2013-03-12 Thread Greg Farnum
...@inktank.com), Vladislav Gorbunov vadi...@gmail.com Sent: Tuesday, March 12, 2013 1:41:21 PM Subject: Re: OSD memory leaks? If one were stupid enough to have their pg_num and pgp_num set to 8 on two of their pools, how could you fix that? Dave Spano

Re: OSD memory leaks?

2013-03-12 Thread Bryan K. Wright
han.sebast...@gmail.com said: Well to avoid unnecessary data movement, there is also an _experimental_ feature to change the number of PGs in a pool on the fly. ceph osd pool set poolname pg_num numpgs --allow-experimental-feature I've been following the instructions here:

Re: OSD memory leaks?

2013-03-12 Thread Dave Spano
Munaut s.mun...@whatever-company.com, Samuel Just sam.j...@inktank.com, Vladislav Gorbunov vadi...@gmail.com Sent: Tuesday, March 12, 2013 4:20:13 PM Subject: Re: OSD memory leaks? On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: Well to avoid unnecessary data movement, there is also

Re: OSD memory leaks?

2013-03-12 Thread Greg Farnum
Subject: Re: OSD memory leaks? On Tuesday, March 12, 2013 at 1:10 PM, Sébastien Han wrote: Well to avoid unnecessary data movement, there is also an _experimental_ feature to change the number of PGs in a pool on the fly. ceph osd pool set poolname pg_num numpgs --allow

Re: OSD memory leaks?

2013-03-11 Thread Sébastien Han
...@inktank.com Sent: Monday, March 4, 2013 12:11:22 PM Subject: Re: OSD memory leaks? FYI I'm using 450 pgs for my pools. -- Regards, Sébastien Han. On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil s...@inktank.com wrote: On Fri, 1 Mar 2013, Wido den Hollander wrote: On 02/23/2013 01:44 AM, Sage

Re: OSD memory leaks?

2013-03-04 Thread Sébastien Han
FYI I'm using 450 pgs for my pools. -- Regards, Sébastien Han. On Fri, Mar 1, 2013 at 8:10 PM, Sage Weil s...@inktank.com wrote: On Fri, 1 Mar 2013, Wido den Hollander wrote: On 02/23/2013 01:44 AM, Sage Weil wrote: On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally

Re: OSD memory leaks?

2013-03-01 Thread Wido den Hollander
On 02/23/2013 01:44 AM, Sage Weil wrote: On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally got a core dump. I did it with a kill -SEGV on the OSD process. https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 Hope we will get something out of it :-).

Re: OSD memory leaks?

2013-03-01 Thread Samuel Just
That pattern would seem to support the log trimming theory of the leak. -Sam On Fri, Mar 1, 2013 at 7:51 AM, Wido den Hollander w...@42on.com wrote: On 02/23/2013 01:44 AM, Sage Weil wrote: On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally got a core dump. I did it with a kill

Re: OSD memory leaks?

2013-03-01 Thread Sage Weil
On Fri, 1 Mar 2013, Wido den Hollander wrote: On 02/23/2013 01:44 AM, Sage Weil wrote: On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally got a core dump. I did it with a kill -SEGV on the OSD process.

Re: OSD memory leaks?

2013-02-25 Thread Sébastien Han
Ok thanks guys. Hope we will find something :-). -- Regards, Sébastien Han. On Mon, Feb 25, 2013 at 8:51 AM, Wido den Hollander w...@42on.com wrote: On 02/25/2013 01:21 AM, Sage Weil wrote: On Mon, 25 Feb 2013, Sébastien Han wrote: Hi Sage, Sorry it's a production system, so I can't test

Re: OSD memory leaks?

2013-02-24 Thread Sébastien Han
Hi Sage, Sorry it's a production system, so I can't test it. So at the end, you can't get anything out of the core dump? -- Regards, Sébastien Han. On Sat, Feb 23, 2013 at 1:44 AM, Sage Weil s...@inktank.com wrote: On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally got a core

Re: OSD memory leaks?

2013-02-24 Thread Sage Weil
On Mon, 25 Feb 2013, Sébastien Han wrote: Hi Sage, Sorry it's a production system, so I can't test it. So at the end, you can't get anything out of the core dump? I saw a bunch of dup object names, which is what led us to the pg log theory. I can look a bit more carefully to confirm, but

Re: OSD memory leaks?

2013-02-24 Thread Wido den Hollander
On 02/25/2013 01:21 AM, Sage Weil wrote: On Mon, 25 Feb 2013, Sébastien Han wrote: Hi Sage, Sorry it's a production system, so I can't test it. So at the end, you can't get anything out of the core dump? I saw a bunch of dup object names, which is what led us to the pg log theory. I can

Re: OSD memory leaks?

2013-02-22 Thread Sébastien Han
Hi all, I finally got a core dump. I did it with a kill -SEGV on the OSD process. https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 Hope we will get something out of it :-). -- Regards, Sébastien Han. On Fri, Jan 11, 2013 at 7:13 PM, Gregory Farnum

Re: OSD memory leaks?

2013-02-22 Thread Sage Weil
On Fri, 22 Feb 2013, Sébastien Han wrote: Hi all, I finally got a core dump. I did it with a kill -SEGV on the OSD process. https://www.dropbox.com/s/ahv6hm0ipnak5rf/core-ceph-osd-11-0-0-20100-1361539008 Hope we will get something out of it :-). AHA! We have a theory. The pg log

Re: OSD memory leaks?

2013-01-11 Thread Sébastien Han
Is osd.1 using the heap profiler as well? Keep in mind that active use of the memory profiler will itself cause memory usage to increase — this sounds a bit like that to me since it's staying stable at a large but finite portion of total memory. Well, the memory consumption was already high

Re: OSD memory leaks?

2013-01-11 Thread Gregory Farnum
On Fri, Jan 11, 2013 at 6:57 AM, Sébastien Han han.sebast...@gmail.com wrote: Is osd.1 using the heap profiler as well? Keep in mind that active use of the memory profiler will itself cause memory usage to increase — this sounds a bit like that to me since it's staying stable at a large but

Re: OSD memory leaks?

2013-01-10 Thread Gregory Farnum
On Wed, Jan 9, 2013 at 10:09 AM, Sylvain Munaut s.mun...@whatever-company.com wrote: Just fyi, I also have growing memory on OSD, and I have the same logs: libceph: osd4 172.20.11.32:6801 socket closed in the RBD clients That message is not an error; it just happens if the RBD client doesn't

Re: OSD memory leaks?

2013-01-09 Thread Dave Spano
, 2013 5:12:12 PM Subject: Re: OSD memory leaks? Dave, I'll share my little script with you for now if you want it: #!/bin/bash for i in $(ps aux | grep [c]eph-osd | awk '{print $4}') do MEM_INTEGER=$(echo $i | cut -d '.' -f1) OSD=$(ps aux | grep [c]eph-osd | grep $i | awk '{print $13

Re: OSD memory leaks?

2013-01-07 Thread Samuel Just
- From: Dave Spano dsp...@optogenics.com To: Sébastien Han han.sebast...@gmail.com Cc: ceph-devel ceph-devel@vger.kernel.org, Samuel Just sam.j...@inktank.com Sent: Monday, January 7, 2013 12:40:06 PM Subject: Re: OSD memory leaks? Sam, Attached are some heaps that I collected today

Re: OSD memory leaks?

2013-01-04 Thread Sébastien Han
Hi Sam, Thanks for your answer and sorry for the late reply. Unfortunately I can't get anything out of the profiler; actually I do, but I guess it doesn't show what it is supposed to show... I will keep on trying this. Anyway yesterday I just thought that the problem might be due to some over usage

Re: OSD memory leaks?

2012-12-19 Thread Sébastien Han
No more suggestions? :( -- Regards, Sébastien Han. On Tue, Dec 18, 2012 at 6:21 PM, Sébastien Han han.sebast...@gmail.com wrote: Nothing terrific... Kernel logs from my clients are full of libceph: osd4 172.20.11.32:6801 socket closed I saw this somewhere on the tracker. Is this harmful?

Re: OSD memory leaks?

2012-12-19 Thread Samuel Just
Sorry, it's been very busy. The next step would be to try to get a heap dump. You can start a heap profile on osd N by: ceph osd tell N heap start_profiler and you can get it to dump the collected profile using ceph osd tell N heap dump. The dumps should show up in the osd log directory.
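The two heap profiler commands above can be wrapped in small shell functions. This is only a sketch: the function names, the `run` wrapper and the `DRY_RUN` switch are illustrative assumptions, while the `ceph osd tell N heap ...` commands themselves are the ones quoted in this message. By default the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it. DRY_RUN is an assumption of this sketch, not a Ceph option.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Start the heap profiler on OSD number $1.
start_heap_profile() {
  run ceph osd tell "$1" heap start_profiler
}

# Ask OSD number $1 to dump the profile collected so far.
dump_heap_profile() {
  run ceph osd tell "$1" heap dump
}
```

For example, `start_heap_profile 1` followed some time later by `dump_heap_profile 1` (with `DRY_RUN=0` to really run them) should leave the dump in the OSD log directory, per the message above. As mentioned elsewhere in the thread, active use of the profiler itself increases memory usage.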

Re: OSD memory leaks?

2012-12-18 Thread Sébastien Han
Nothing terrific... Kernel logs from my clients are full of libceph: osd4 172.20.11.32:6801 socket closed I saw this somewhere on the tracker. Is this harmful? Thanks. -- Regards, Sébastien Han. On Mon, Dec 17, 2012 at 11:55 PM, Samuel Just sam.j...@inktank.com wrote: What is the workload

Fwd: OSD memory leaks?

2012-12-17 Thread Sébastien Han
Hi guys, Today looking at my graphs I noticed that one of my 4 ceph nodes used a lot of memory. It keeps growing and growing. See the graph attached to this mail. I run 0.48.2 on Ubuntu 12.04. The other nodes also grow, but more slowly than the first one. I'm not quite sure about the information that

Re: OSD memory leaks?

2012-12-17 Thread Sébastien Han
Hi, No, I don't see anything abnormal in the network stats. I don't see anything in the logs... :( The weird thing is that one node out of 4 seems to take way more memory than the others... -- Regards, Sébastien Han. On Mon, Dec 17, 2012 at 11:31 PM, Sébastien Han han.sebast...@gmail.com wrote: