Re: OSD memory leaks?

Dave Spano Wed, 09 Jan 2013 08:11:32 -0800

Yes, I'm using argonaut. 

I've got 38 heap files from yesterday. Currently, the OSD in question is using 
91.2% of memory according to top, and staying there. I initially thought it 
would keep growing until the OOM killer started killing processes, but I don't 
see anything funny in the system logs that indicates that.


On the other hand, the ceph-osd process on osd.1 is using far less memory. 

osd.0
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 9151 root      20   0 20.4g  14g 2548 S    1 91.2 517:58.71 ceph-osd 

osd.1

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
10785 root      20   0  673m 310m 5164 S    3  1.9 107:04.39 ceph-osd  

Here's what tcmalloc says when I run ceph osd tell 0 heap stats:
2013-01-09 11:09:36.778675 7f62aae23700  0 log [INF] : osd.0tcmalloc heap stats:------------------------------------------------
2013-01-09 11:09:36.779113 7f62aae23700  0 log [INF] : MALLOC:      210884768 (  201.1 MB) Bytes in use by application
2013-01-09 11:09:36.779348 7f62aae23700  0 log [INF] : MALLOC: +     89026560 (   84.9 MB) Bytes in page heap freelist
2013-01-09 11:09:36.779928 7f62aae23700  0 log [INF] : MALLOC: +      7926512 (    7.6 MB) Bytes in central cache freelist
2013-01-09 11:09:36.779951 7f62aae23700  0 log [INF] : MALLOC: +       144896 (    0.1 MB) Bytes in transfer cache freelist
2013-01-09 11:09:36.779972 7f62aae23700  0 log [INF] : MALLOC: +     11046512 (   10.5 MB) Bytes in thread cache freelists
2013-01-09 11:09:36.780013 7f62aae23700  0 log [INF] : MALLOC: +      5177344 (    4.9 MB) Bytes in malloc metadata
2013-01-09 11:09:36.780030 7f62aae23700  0 log [INF] : MALLOC:   ------------
2013-01-09 11:09:36.780056 7f62aae23700  0 log [INF] : MALLOC: =    324206592 (  309.2 MB) Actual memory used (physical + swap)
2013-01-09 11:09:36.780081 7f62aae23700  0 log [INF] : MALLOC: +    126177280 (  120.3 MB) Bytes released to OS (aka unmapped)
2013-01-09 11:09:36.780112 7f62aae23700  0 log [INF] : MALLOC:   ------------
2013-01-09 11:09:36.780127 7f62aae23700  0 log [INF] : MALLOC: =    450383872 (  429.5 MB) Virtual address space used
2013-01-09 11:09:36.780152 7f62aae23700  0 log [INF] : MALLOC:
2013-01-09 11:09:36.780168 7f62aae23700  0 log [INF] : MALLOC:          37492              Spans in use
2013-01-09 11:09:36.780330 7f62aae23700  0 log [INF] : MALLOC:             51              Thread heaps in use
2013-01-09 11:09:36.780359 7f62aae23700  0 log [INF] : MALLOC:           4096              Tcmalloc page size
2013-01-09 11:09:36.780384 7f62aae23700  0 log [INF] : ------------------------------------------------
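
As a quick cross-check of the heap stats above, here is a minimal sketch that 
re-adds the tcmalloc components. The byte values are copied from the log lines; 
the variable names are made up for illustration. The sum comes to about 309 MB 
of memory that tcmalloc accounts for, which is far below the 14g RES that top 
reports for osd.0.

```python
# Byte counts copied from the "heap stats" log lines above.
# Variable names are illustrative only.
in_use_by_application = 210884768
page_heap_freelist = 89026560
central_cache_freelist = 7926512
transfer_cache_freelist = 144896
thread_cache_freelists = 11046512
malloc_metadata = 5177344
released_to_os = 126177280

# "Actual memory used" is the sum of the first six components.
actual_memory_used = (
    in_use_by_application
    + page_heap_freelist
    + central_cache_freelist
    + transfer_cache_freelist
    + thread_cache_freelists
    + malloc_metadata
)

# "Virtual address space used" adds the bytes already released to the OS.
virtual_address_space = actual_memory_used + released_to_os


def to_mib(num_bytes):
    """Convert bytes to MiB (tcmalloc labels these as MB)."""
    return num_bytes / (1024 * 1024)


print(f"actual memory used: {actual_memory_used} bytes "
      f"({to_mib(actual_memory_used):.1f} MB)")
print(f"virtual address space: {virtual_address_space} bytes "
      f"({to_mib(virtual_address_space):.1f} MB)")
```

Both totals match the figures tcmalloc prints, so the numbers in the log are 
internally consistent; the gap to what top shows is not explained by them.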


Dave Spano 
Optogenics 
Systems Administrator 



----- Original Message ----- 

From: "Sébastien Han" <han.sebast...@gmail.com> 
To: "Samuel Just" <sam.j...@inktank.com> 
Cc: "Dave Spano" <dsp...@optogenics.com>, "ceph-devel" 
<ceph-devel@vger.kernel.org> 
Sent: Wednesday, January 9, 2013 10:20:43 AM 
Subject: Re: OSD memory leaks? 

I guess he runs Argonaut as well. 

More suggestions about this problem? 

Thanks! 

-- 
Regards, 
Sébastien Han. 


On Mon, Jan 7, 2013 at 8:09 PM, Samuel Just <sam.j...@inktank.com> wrote: 
> 
> Awesome! What version are you running (ceph-osd -v, include the hash)? 
> -Sam 
> 
> On Mon, Jan 7, 2013 at 11:03 AM, Dave Spano <dsp...@optogenics.com> wrote: 
> > This failed the first time I sent it, so I'm resending in plain text. 
> > 
> > Dave Spano 
> > Optogenics 
> > Systems Administrator 
> > 
> > 
> > 
> > ----- Original Message ----- 
> > 
> > From: "Dave Spano" <dsp...@optogenics.com> 
> > To: "Sébastien Han" <han.sebast...@gmail.com> 
> > Cc: "ceph-devel" <ceph-devel@vger.kernel.org>, "Samuel Just" 
> > <sam.j...@inktank.com> 
> > Sent: Monday, January 7, 2013 12:40:06 PM 
> > Subject: Re: OSD memory leaks? 
> > 
> > 
> > Sam, 
> > 
> > Attached are some heaps that I collected today. 001 and 003 are just after 
> > I started the profiler; 011 is the most recent. If you need more, or 
> > anything different let me know. Already the OSD in question is at 38% 
> > memory usage. As mentioned by Sébastien, restarting ceph-osd keeps things 
> > going. 
> > 
> > Not sure if this is helpful information, but out of the two OSDs that I 
> > have running, the first one (osd.0) is the one that develops this problem 
> > the quickest. osd.1 does have the same issue, it just takes much longer. Do 
> > the monitors hit the first osd in the list first, when there's activity? 
> > 
> > 
> > Dave Spano 
> > Optogenics 
> > Systems Administrator 
> > 
> > 
> > ----- Original Message ----- 
> > 
> > From: "Sébastien Han" <han.sebast...@gmail.com> 
> > To: "Samuel Just" <sam.j...@inktank.com> 
> > Cc: "ceph-devel" <ceph-devel@vger.kernel.org> 
> > Sent: Friday, January 4, 2013 10:20:58 AM 
> > Subject: Re: OSD memory leaks? 
> > 
> > Hi Sam, 
> > 
> > Thanks for your answer and sorry for the late reply. 
> > 
> > Unfortunately I can't get anything useful out of the profiler. Actually I 
> > do get output, but I guess it doesn't show what it is supposed to show... 
> > I will keep on trying. Anyway, yesterday I thought that the problem might 
> > be due to overuse of some OSDs. I was thinking that the distribution of 
> > primary OSDs might be uneven, which could have explained why the memory 
> > leaks are larger on some servers. In the end, the distribution seems even, 
> > but while looking at the pg dump I found something interesting in the 
> > scrub column: timestamps from the last scrubbing operation matched the 
> > times shown on the graph. 
> > 
> > After this, I did some calculations: I compared the total number of 
> > scrubbing operations with the number performed during the time range 
> > where the memory leaks occurred. First of all, here is my setup: 
> > 
> > root@c2-ceph-01 ~ # ceph osd tree 
> > dumped osdmap tree epoch 859 
> > # id    weight  type name               up/down reweight 
> > -1      12      pool default 
> > -3      12        rack lc2_rack33 
> > -2      3           host c2-ceph-01 
> > 0       1             osd.0             up      1 
> > 1       1             osd.1             up      1 
> > 2       1             osd.2             up      1 
> > -4      3           host c2-ceph-04 
> > 10      1             osd.10            up      1 
> > 11      1             osd.11            up      1 
> > 9       1             osd.9             up      1 
> > -5      3           host c2-ceph-02 
> > 3       1             osd.3             up      1 
> > 4       1             osd.4             up      1 
> > 5       1             osd.5             up      1 
> > -6      3           host c2-ceph-03 
> > 6       1             osd.6             up      1 
> > 7       1             osd.7             up      1 
> > 8       1             osd.8             up      1 
> > 
> > 
> > And here are the results: 
> > 
> > * Ceph node 1, which has the largest memory leak, performed 1608 scrubs 
> > in total and 1059 during the time range where memory leaks occurred 
> > * Ceph node 2, 1168 in total and 776 during the time range where 
> > memory leaks occurred 
> > * Ceph node 3, 940 in total and 94 during the time range where memory 
> > leaks occurred 
> > * Ceph node 4, 899 in total and 191 during the time range where 
> > memory leaks occurred 
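
To make the comparison between nodes easier to read, here is a minimal sketch 
that turns the counts above into the share of each node's scrubs that fell 
inside the leak time range. The counts are taken from the list; the dictionary 
layout and function name are only illustrative.

```python
# (total scrubs, scrubs during the leak time range) per node,
# copied from the list above.
scrub_counts = {
    "ceph node 1": (1608, 1059),
    "ceph node 2": (1168, 776),
    "ceph node 3": (940, 94),
    "ceph node 4": (899, 191),
}


def share_in_leak_window(total, during):
    """Fraction of a node's scrubs that happened during the leak time range."""
    return during / total


for node, (total, during) in scrub_counts.items():
    share = share_in_leak_window(total, during)
    print(f"{node}: {during}/{total} = {share:.1%}")
```

Nodes 1 and 2 have roughly two thirds of their scrubs inside the leak time 
range, against about 10% and 21% for nodes 3 and 4. That is a correlation 
only; it does not show that scrubbing causes the leak.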
> > 
> > I'm still not entirely sure that the scrub operation causes the leak, 
> > but it is the only relevant correlation that I found... 
> > 
> > Could it be that the scrubbing process doesn't release memory? Btw, I 
> > was wondering: how does ceph decide at what time it should run the 
> > scrubbing operation? I know that it's once a day and controlled by the 
> > following options: 
> > 
> > OPTION(osd_scrub_min_interval, OPT_FLOAT, 300) 
> > OPTION(osd_scrub_max_interval, OPT_FLOAT, 60*60*24) 
> > 
> > But how does ceph determine the time at which the operation starts? 
> > During cluster creation, probably? 
> > 
> > I just checked the options that control OSD scrubbing and found that by 
> > default: 
> > 
> > OPTION(osd_max_scrubs, OPT_INT, 1) 
> > 
> > So that might explain why only one OSD uses a lot of memory. 
> > 
> > My dirty workaround at the moment is to check the memory use of every 
> > OSD and restart it if it uses more than 25% of the total memory. Also 
> > note that on ceph 1, 3 and 4 it's always one OSD that uses a lot of 
> > memory; on ceph 2 alone, the memory usage is high but almost the same 
> > for all the OSD processes. 
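
A workaround like the one described above could be sketched roughly as 
follows. This is an assumption-laden illustration, not the script actually in 
use: the function names are hypothetical, the memory figures are read from 
`ps -C ceph-osd -o pid=,pmem=,args=`, and the restart itself is left out 
because the right command depends on how the OSDs are started on each host.

```python
import subprocess

# Threshold from the message above: restart an OSD above 25% of total memory.
THRESHOLD_PERCENT = 25.0


def osds_over_threshold(ps_output, threshold=THRESHOLD_PERCENT):
    """Return (pid, %mem) for each `ps` line whose %mem exceeds the threshold.

    Expects lines of the form "<pid> <pmem> <args...>".
    """
    offenders = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        pid, pmem = int(fields[0]), float(fields[1])
        if pmem > threshold:
            offenders.append((pid, pmem))
    return offenders


def current_ceph_osd_usage():
    """Ask ps for pid, %mem and command line of every ceph-osd process."""
    result = subprocess.run(
        ["ps", "-C", "ceph-osd", "-o", "pid=,pmem=,args="],
        capture_output=True,
        text=True,
        check=False,  # ps exits non-zero when no process matches
    )
    return result.stdout


# Illustrative input only; a real run would pass current_ceph_osd_usage().
sample = "9151 91.2 ceph-osd\n10785  1.9 ceph-osd\n"
for pid, pmem in osds_over_threshold(sample):
    # The restart step is intentionally not implemented here.
    print(f"ceph-osd pid {pid} uses {pmem}% of memory and should be restarted")
```

Keeping the parsing separate from the `ps` call makes the threshold logic easy 
to check against saved output before wiring it to anything that restarts a 
daemon.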
> > 
> > Thank you in advance. 
> > 
> > -- 
> > Regards, 
> > Sébastien Han. 
> > 
> > 
> > On Wed, Dec 19, 2012 at 10:43 PM, Samuel Just <sam.j...@inktank.com> wrote: 
> >> 
> >> Sorry, it's been very busy. The next step would be to try to get a heap 
> >> dump. You can start a heap profile on osd N by: 
> >> 
> >> ceph osd tell N heap start_profiler 
> >> 
> >> and you can get it to dump the collected profile using 
> >> 
> >> ceph osd tell N heap dump 
> >> 
> >> The dumps should show up in the osd log directory. 
> >> 
> >> Assuming the heap profiler is working correctly, you can look at the 
> >> dump using pprof in google-perftools. 
> >> 
> >> On Wed, Dec 19, 2012 at 8:37 AM, Sébastien Han <han.sebast...@gmail.com> 
> >> wrote: 
> >> > No more suggestions? :( 
> >> > -- 
> >> > Regards, 
> >> > Sébastien Han. 
> >> > 
> >> > 
> >> > On Tue, Dec 18, 2012 at 6:21 PM, Sébastien Han <han.sebast...@gmail.com> 
> >> > wrote: 
> >> >> Nothing terrific... 
> >> >> 
> >> >> Kernel logs from my clients are full of "libceph: osd4 
> >> >> 172.20.11.32:6801 socket closed" 
> >> >> 
> >> >> I saw this somewhere on the tracker. 
> >> >> 
> >> >> Is this harmful? 
> >> >> 
> >> >> Thanks. 
> >> >> 
> >> >> -- 
> >> >> Regards, 
> >> >> Sébastien Han. 
> >> >> 
> >> >> 
> >> >> 
> >> >> On Mon, Dec 17, 2012 at 11:55 PM, Samuel Just <sam.j...@inktank.com> 
> >> >> wrote: 
> >> >>> 
> >> >>> What is the workload like? 
> >> >>> -Sam 
> >> >>> 
> >> >>> On Mon, Dec 17, 2012 at 2:41 PM, Sébastien Han 
> >> >>> <han.sebast...@gmail.com> wrote: 
> >> >>> > Hi, 
> >> >>> > 
> >> >>> > No, I don't see anything abnormal in the network stats. I don't see 
> >> >>> > anything in the logs... :( 
> >> >>> > The weird thing is that one node out of 4 seems to take way more 
> >> >>> > memory than the others... 
> >> >>> > 
> >> >>> > -- 
> >> >>> > Regards, 
> >> >>> > Sébastien Han. 
> >> >>> > 
> >> >>> > 
> >> >>> > On Mon, Dec 17, 2012 at 11:31 PM, Sébastien Han 
> >> >>> > <han.sebast...@gmail.com> wrote: 
> >> >>> >> 
> >> >>> >> Hi, 
> >> >>> >> 
> >> >>> >> No, I don't see anything abnormal in the network stats. I don't see 
> >> >>> >> anything in the logs... :( 
> >> >>> >> The weird thing is that one node out of 4 seems to take way more 
> >> >>> >> memory than the others... 
> >> >>> >> 
> >> >>> >> -- 
> >> >>> >> Regards, 
> >> >>> >> Sébastien Han. 
> >> >>> >> 
> >> >>> >> 
> >> >>> >> 
> >> >>> >> On Mon, Dec 17, 2012 at 7:12 PM, Samuel Just <sam.j...@inktank.com> 
> >> >>> >> wrote: 
> >> >>> >>> 
> >> >>> >>> Are you having network hiccups? There was a bug noticed recently 
> >> >>> >>> that could cause a memory leak if nodes are being marked up and down. 
> >> >>> >>> -Sam 
> >> >>> >>> 
> >> >>> >>> On Mon, Dec 17, 2012 at 12:28 AM, Sébastien Han 
> >> >>> >>> <han.sebast...@gmail.com> wrote: 
> >> >>> >>> > Hi guys, 
> >> >>> >>> > 
> >> >>> >>> > Today, looking at my graphs, I noticed that one out of 4 ceph 
> >> >>> >>> > nodes uses a lot of memory. It keeps growing and growing. 
> >> >>> >>> > See the graph attached to this mail. 
> >> >>> >>> > I run 0.48.2 on Ubuntu 12.04. 
> >> >>> >>> > 
> >> >>> >>> > The other nodes also grow, but more slowly than the first one. 
> >> >>> >>> > 
> >> >>> >>> > I'm not quite sure what information I have to provide, so let 
> >> >>> >>> > me know. The only thing I can say is that the load hasn't 
> >> >>> >>> > increased that much this week. It seems to be consuming memory 
> >> >>> >>> > and not giving it back. 
> >> >>> >>> > 
> >> >>> >>> > Thank you in advance. 
> >> >>> >>> > 
> >> >>> >>> > -- 
> >> >>> >>> > Regards, 
> >> >>> >>> > Sébastien Han. 
> >> >>> >> 
> >> >>> >> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
> the body of a message to majord...@vger.kernel.org 
> More majordomo info at http://vger.kernel.org/majordomo-info.html