Hi Dan,

No worries. I checked, and osd_map_dedup is set to true, the default value.
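
For reference, this is roughly how I checked it (osd.0 being an arbitrary
example id; the admin socket reports the running value):

# ceph daemon osd.0 config get osd_map_dedup
{
    "osd_map_dedup": "true"
}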

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Dan van der Ster <d...@vanderster.com>
Sent: 20 August 2020 09:41
To: Frank Schilder
Cc: Mark Nelson; ceph-users
Subject: Re: [ceph-users] Re: OSD memory leak?

Hi Frank,

I haven't had time yet. On our side, I was planning to see if the issue
persists after upgrading to v14.2.11 -- it includes some updates to
how the osdmap is referenced across OSD.cc.

BTW, do you happen to have osd_map_dedup set to false? We do, and that
surely increases the osdmap memory usage somewhat.

-- Dan

On Thu, Aug 20, 2020 at 9:33 AM Frank Schilder <fr...@dtu.dk> wrote:
>
> Hi Dan and Mark,
>
> could you please let me know if you can read the files with the version info 
> I provided in my previous e-mail? I'm in the process of collecting data with 
> more FS activity and would like to send it in a format that is useful for 
> investigation.
>
> Right now I'm observing a daily growth of swap of approx. 100-200 MB on servers 
> with 16 OSDs each, 1 SSD and 15 HDDs. The OS and daemons operate fine; the OS 
> manages to keep enough RAM available. Also, the mempool dump still shows onode 
> and data cached at a seemingly reasonable level. Users report more stable 
> performance of the FS after I increased the cache min sizes on all OSDs.
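>
> (In case it helps, this is roughly how I look at the mempool numbers; osd.0 
> is just an example id:
>
> # ceph daemon osd.0 dump_mempools | grep -A 2 -E 'bluestore_cache_(onode|data)'
>
> The "bytes" fields for bluestore_cache_onode and bluestore_cache_data are 
> what I watch.)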
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <fr...@dtu.dk>
> Sent: 17 August 2020 09:37
> To: Dan van der Ster
> Cc: ceph-users
> Subject: [ceph-users] Re: OSD memory leak?
>
> Hi Dan,
>
> I use the container 
> docker.io/ceph/daemon:v3.2.10-stable-3.2-mimic-centos-7-x86_64. As far as I 
> can see, it uses the packages from http://download.ceph.com/rpm-mimic/el7; 
> it's a CentOS 7 build. The version is:
>
> # ceph -v
> ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
>
> On CentOS, the profiler binaries are named differently, without the "google-" 
> prefix. The version I have installed is
>
> # pprof --version
> pprof (part of gperftools 2.0)
>
> Copyright 1998-2007 Google Inc.
>
> This is BSD licensed software; see the source for copying conditions
> and license information.
> There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
> PARTICULAR PURPOSE.
>
> It is possible to install pprof inside this container and analyse the 
> *.heap files I provided.
>
> If this doesn't work for you and you want me to generate the text output for 
> the heap files, I can do that. Please let me know if I should process all 
> files and with which options (e.g. against a base, etc.).
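>
> I would run something along the lines of what the memory-profiling docs 
> describe (the binary path is from the container; the heap file names are 
> just placeholders for the dumps I sent):
>
> # pprof --text /usr/bin/ceph-osd osd.N.profile.0001.heap
> # pprof --text --base osd.N.profile.0001.heap /usr/bin/ceph-osd osd.N.profile.0010.heap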
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <d...@vanderster.com>
> Sent: 14 August 2020 10:38:57
> To: Frank Schilder
> Cc: Mark Nelson; ceph-users
> Subject: Re: [ceph-users] Re: OSD memory leak?
>
> Hi Frank,
>
> I'm having trouble getting the exact version of ceph you used to
> create this heap profile.
> Could you run the google-pprof --text steps at [1] and share the output?
>
> Thanks, Dan
>
> [1] https://docs.ceph.com/docs/master/rados/troubleshooting/memory-profiling/
>
>
> On Tue, Aug 11, 2020 at 2:37 PM Frank Schilder <fr...@dtu.dk> wrote:
> >
> > Hi Mark,
> >
> > here is a first collection of heap profiling data (valid 30 days):
> >
> > https://files.dtu.dk/u/53HHic_xx5P1cceJ/heap_profiling-2020-08-03.tgz?l
> >
> > This was collected with the following config settings:
> >
> >   osd    dev      osd_memory_cache_min    805306368
> >   osd    basic    osd_memory_target       2147483648
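> >
> > For reference, these were set via the centralized config, along the lines 
> > of:
> >
> > # ceph config set osd osd_memory_cache_min 805306368
> > # ceph config set osd osd_memory_target 2147483648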
> >
> > Setting the cache_min value seems to help keep cache space available. 
> > Unfortunately, the above collection covers only 12 days. I needed to 
> > restart the OSD and will need to restart it again soon. I hope I can then 
> > run a longer sample. The profiling does cause slow ops, though.
> >
> > Maybe you can see something already? It seems to have collected some leaked 
> > memory. Unfortunately, it was a period of extremely low load. Basically, 
> > utilization dropped to almost zero on the very day the recording started.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Frank Schilder <fr...@dtu.dk>
> > Sent: 21 July 2020 12:57:32
> > To: Mark Nelson; Dan van der Ster
> > Cc: ceph-users
> > Subject: [ceph-users] Re: OSD memory leak?
> >
> > Quick question: is there a way to change the frequency of heap dumps? On 
> > this page http://goog-perftools.sourceforge.net/doc/heap_profiler.html a 
> > function HeapProfilerSetAllocationInterval() is mentioned, but no other way 
> > of configuring this is described. Is there a config parameter or a ceph 
> > daemon call to adjust it?
> >
> > If not, can I change the dump path?
> >
> > It's likely to overrun my log partition quickly if I cannot adjust either 
> > of the two.
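> >
> > One untested idea: gperftools itself reads HEAP_PROFILE_ALLOCATION_INTERVAL 
> > from the environment of the process, so exporting something like
> >
> > export HEAP_PROFILE_ALLOCATION_INTERVAL=4294967296   # ~4 GB instead of the 1 GB default
> >
> > in the OSD's container/unit environment before starting the daemon might 
> > reduce the dump frequency. I have not verified whether it is honoured when 
> > profiling is started via "ceph tell osd.N heap start_profiler".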
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Frank Schilder <fr...@dtu.dk>
> > Sent: 20 July 2020 15:19:05
> > To: Mark Nelson; Dan van der Ster
> > Cc: ceph-users
> > Subject: [ceph-users] Re: OSD memory leak?
> >
> > Dear Mark,
> >
> > thank you very much for the very helpful answers. I will raise 
> > osd_memory_cache_min, leave everything else alone and watch what happens. I 
> > will report back here.
> >
> > Thanks also for raising this as an issue.
> >
> > Best regards,
> > =================
> > Frank Schilder
> > AIT Risø Campus
> > Bygning 109, rum S14
> >
> > ________________________________________
> > From: Mark Nelson <mnel...@redhat.com>
> > Sent: 20 July 2020 15:08:11
> > To: Frank Schilder; Dan van der Ster
> > Cc: ceph-users
> > Subject: Re: [ceph-users] Re: OSD memory leak?
> >
> > On 7/20/20 3:23 AM, Frank Schilder wrote:
> > > Dear Mark and Dan,
> > >
> > > I'm in the process of restarting all OSDs and could use some quick advice 
> > > on bluestore cache settings. My plan is to set higher minimum values and 
> > > deal with accumulated excess usage via regular restarts. Looking at the 
> > > documentation 
> > > (https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/),
> > >  I find the following relevant options (with defaults):
> > >
> > > # Automatic Cache Sizing
> > > osd_memory_target {4294967296} # 4GB
> > > osd_memory_base {805306368} # 768MB
> > > osd_memory_cache_min {134217728} # 128MB
> > >
> > > # Manual Cache Sizing
> > > bluestore_cache_meta_ratio {.4} # 40% ?
> > > bluestore_cache_kv_ratio {.4} # 40% ?
> > > bluestore_cache_kv_max {512 * 1024*1024} # 512MB
> > >
> > > Q1) If I increase osd_memory_cache_min, should I also increase 
> > > osd_memory_base by the same or some other amount?
> >
> >
> > osd_memory_base is a hint at how much memory the OSD could consume
> > outside the cache once it has reached steady state.  It basically sets a
> > hard cap on how much memory the cache will use, to avoid over-committing
> > memory and thrashing when we exceed the memory limit.  It's not necessary
> > to get it exactly right; it just helps smooth things out by making the
> > automatic memory tuning less aggressive.  I.e., if you have a 2 GB memory
> > target and a 512 MB base, you'll never assign more than 1.5 GB to the
> > cache, on the assumption that the rest of the OSD will eventually need
> > 512 MB to operate even if it's not using that much right now.  I think
> > you can probably just leave it alone.  What you and Dan appear to be
> > seeing is that this number isn't static in your case but increases over
> > time anyway.  Eventually I'm hoping that we can automatically account for
> > more and more of that memory by reading the data from the mempools.
> >
> > > Q2) The cache ratio options are shown under the section "Manual Cache 
> > > Sizing". Do they also apply when cache auto tuning is enabled? If so, is 
> > > it worth changing these defaults for higher values of 
> > > osd_memory_cache_min?
> >
> >
> > They actually do have an effect on the automatic cache sizing and probably
> > shouldn't be listed only under the manual section.  When automatic cache
> > sizing is enabled, those options affect the "fair share" values of the
> > different caches at each cache priority level.  I.e., at priority level 0,
> > if both caches want more memory than is available, the ratios determine
> > how much each cache gets.  If more memory is available than requested,
> > each cache gets as much as it wants and we move on to the next priority
> > level and do the same thing again.  So in this case the ratios end up
> > acting more like fallback settings for when there isn't enough memory to
> > fulfill all cache requests at a given priority level; otherwise they are
> > not used until that limit is hit.  The goal of this scheme is to make sure
> > that "high priority" items in each cache get first dibs on the memory,
> > even if that skews the ratios.  Such items might be rocksdb bloom filters
> > and indexes, or very recent hot items in one cache vs. very old items in
> > another cache.  The ratios become more like guidelines than hard limits.
> >
> >
> > When you change to manual mode, you set an overall bluestore cache size
> > and each cache gets a flat percentage of it based on the ratios.  With
> > 0.4/0.4 you will always have 40% for onode, 40% for omap, and 20% for
> > data, even if one of those caches does not use all of its memory.
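> >
> > For reference, a rough sketch of switching to manual mode would look
> > something like this (the values are only examples):
> >
> > # ceph config set osd bluestore_cache_autotune false
> > # ceph config set osd bluestore_cache_size_hdd 2147483648
> > # ceph config set osd bluestore_cache_meta_ratio 0.4
> > # ceph config set osd bluestore_cache_kv_ratio 0.4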
> >
> >
> > >
> > > Many thanks for your help with this. I can't find answers to these 
> > > questions in the docs.
> > >
> > > There might be two reasons for high osd_map memory usage. One is that 
> > > our OSDs seem to hold a large number of OSD maps:
> >
> >
> > I brought this up in our core team standup last week.  Not sure if
> > anyone has had time to look at it yet though.
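> >
> > (If you want to quantify it, the range of maps an OSD currently holds can
> > be read from the admin socket, e.g.:
> >
> > # ceph daemon osd.0 status
> >
> > and comparing the "oldest_map" and "newest_map" fields gives roughly the
> > number of map epochs the OSD keeps.)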
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
