Hi Mark and Dan, I can generate text files. Can you let me know what you would like to see? Without further instructions, I can do a simple conversion and a conversion against the first dump as a base. I will upload an archive with converted files added tomorrow afternoon.
Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Mark Nelson <mnel...@redhat.com>
Sent: 20 August 2020 21:52
To: Frank Schilder; Dan van der Ster; ceph-users
Subject: Re: [ceph-users] Re: OSD memory leak?

Hi Frank,

I downloaded the archive but haven't had time to get the environment set up yet either. It might be better to just generate the txt files if you can.

Thanks!
Mark

On 8/20/20 2:33 AM, Frank Schilder wrote:
> Hi Dan and Mark,
>
> could you please let me know if you can read the files with the version info I provided in my previous e-mail? I'm in the process of collecting data with more FS activity and would like to send it in a format that is useful for the investigation.
>
> Right now I'm observing daily swap growth of ca. 100-200 MB on servers with 16 OSDs each (1 SSD and 15 HDDs). The OS and the daemons operate fine; the OS manages to keep enough RAM available. The mempool dump also still shows onode and data cached at a seemingly reasonable level. Users report more stable FS performance since I increased the cache min sizes on all OSDs.
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Frank Schilder <fr...@dtu.dk>
> Sent: 17 August 2020 09:37
> To: Dan van der Ster
> Cc: ceph-users
> Subject: [ceph-users] Re: OSD memory leak?
>
> Hi Dan,
>
> I use the container docker.io/ceph/daemon:v3.2.10-stable-3.2-mimic-centos-7-x86_64. As far as I can see, it uses the packages from http://download.ceph.com/rpm-mimic/el7; it's a CentOS 7 build. The version is:
>
> # ceph -v
> ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
>
> On CentOS, the profiler packages are named differently, without the "google-" prefix. The version I have installed is:
>
> # pprof --version
> pprof (part of gperftools 2.0)
>
> Copyright 1998-2007 Google Inc.
> This is BSD licensed software; see the source for copying conditions and license information.
> There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> It is possible to install pprof inside this container and analyse the *.heap files I provided.
>
> If this doesn't work for you and you want me to generate the text output for the heap files, I can do that. Please let me know if I should do all files and with what options (e.g. against a base etc.).
>
> Best regards,
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> ________________________________________
> From: Dan van der Ster <d...@vanderster.com>
> Sent: 14 August 2020 10:38:57
> To: Frank Schilder
> Cc: Mark Nelson; ceph-users
> Subject: Re: [ceph-users] Re: OSD memory leak?
>
> Hi Frank,
>
> I'm having trouble getting the exact version of ceph you used to create this heap profile. Could you run the google-pprof --text steps at [1] and share the output?
>
> Thanks, Dan
>
> [1] https://docs.ceph.com/docs/master/rados/troubleshooting/memory-profiling/
>
> On Tue, Aug 11, 2020 at 2:37 PM Frank Schilder <fr...@dtu.dk> wrote:
>> Hi Mark,
>>
>> here is a first collection of heap profiling data (valid 30 days):
>>
>> https://files.dtu.dk/u/53HHic_xx5P1cceJ/heap_profiling-2020-08-03.tgz?l
>>
>> This was collected with the following config settings:
>>
>> osd dev   osd_memory_cache_min  805306368
>> osd basic osd_memory_target     2147483648
>>
>> Setting the cache_min value seems to help keep cache space available. Unfortunately, the above collection covers only 12 days. I needed to restart the OSD and will need to restart it again soon; I hope I can then run a longer sample. The profiling does cause slow ops, though.
>>
>> Maybe you can see something already? It seems to have captured some leaked memory. Unfortunately, it was a period of extremely low load.
>> Basically, on the day recording started, utilization dropped to almost zero.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Frank Schilder <fr...@dtu.dk>
>> Sent: 21 July 2020 12:57:32
>> To: Mark Nelson; Dan van der Ster
>> Cc: ceph-users
>> Subject: [ceph-users] Re: OSD memory leak?
>>
>> Quick question: is there a way to change the frequency of heap dumps? The page http://goog-perftools.sourceforge.net/doc/heap_profiler.html mentions a function HeapProfilerSetAllocationInterval(), but no other way of configuring this. Is there a config parameter or a ceph daemon call to adjust it?
>>
>> If not, can I change the dump path?
>>
>> It's likely to overrun my log partition quickly if I cannot adjust either of the two.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Frank Schilder <fr...@dtu.dk>
>> Sent: 20 July 2020 15:19:05
>> To: Mark Nelson; Dan van der Ster
>> Cc: ceph-users
>> Subject: [ceph-users] Re: OSD memory leak?
>>
>> Dear Mark,
>>
>> thank you very much for the very helpful answers. I will raise osd_memory_cache_min, leave everything else alone and watch what happens. I will report back here.
>>
>> Thanks also for raising this as an issue.
>>
>> Best regards,
>> =================
>> Frank Schilder
>> AIT Risø Campus
>> Bygning 109, rum S14
>>
>> ________________________________________
>> From: Mark Nelson <mnel...@redhat.com>
>> Sent: 20 July 2020 15:08:11
>> To: Frank Schilder; Dan van der Ster
>> Cc: ceph-users
>> Subject: Re: [ceph-users] Re: OSD memory leak?
>>
>> On 7/20/20 3:23 AM, Frank Schilder wrote:
>>> Dear Mark and Dan,
>>>
>>> I'm in the process of restarting all OSDs and could use some quick advice on bluestore cache settings.
>>> My plan is to set higher minimum values and deal with accumulated excess usage via regular restarts. Looking at the documentation (https://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/), I find the following relevant options (with defaults):
>>>
>>> # Automatic Cache Sizing
>>> osd_memory_target     {4294967296}  # 4 GB
>>> osd_memory_base       {805306368}   # 768 MB
>>> osd_memory_cache_min  {134217728}   # 128 MB
>>>
>>> # Manual Cache Sizing
>>> bluestore_cache_meta_ratio  {.4}                # 40% ?
>>> bluestore_cache_kv_ratio    {.4}                # 40% ?
>>> bluestore_cache_kv_max      {512 * 1024*1024}   # 512 MB
>>>
>>> Q1) If I increase osd_memory_cache_min, should I also increase osd_memory_base by the same or some other amount?
>>
>> osd_memory_base is a hint at how much memory the OSD could consume outside the cache once it has reached steady state. It basically sets a hard cap on how much memory the cache will use, to avoid over-committing memory and thrashing when we exceed the memory limit. It's not necessary to get it exactly right; it just helps smooth things out by making the automatic memory tuning less aggressive. I.e. if you have a 2 GB memory target and a 512 MB base, you'll never assign more than 1.5 GB to the cache, on the assumption that the rest of the OSD will eventually need 512 MB to operate even if it's not using that much right now. I think you can probably just leave it alone. What you and Dan appear to be seeing is that this number isn't static in your case but increases over time anyway. Eventually I'm hoping that we can automatically account for more and more of that memory by reading the data from the mempools.
>>
>>> Q2) The cache ratio options are shown under the section "Manual Cache Sizing". Do they also apply when cache auto-tuning is enabled? If so, is it worth changing these defaults for higher values of osd_memory_cache_min?
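As a quick sanity check of the osd_memory_base behaviour Mark describes above (the autotuner treats the base as memory the OSD needs outside the caches), his 2 GB target / 512 MB base example works out like this; the numbers are just the example values from this thread:

```shell
# Automatic cache sizing: the cache is never assigned more than
# osd_memory_target - osd_memory_base, on the assumption that the
# rest of the OSD will eventually need the base amount to operate.
osd_memory_target=$((2 * 1024 * 1024 * 1024))   # 2 GiB (example from the thread)
osd_memory_base=$((512 * 1024 * 1024))          # 512 MiB (example from the thread)

max_cache_bytes=$((osd_memory_target - osd_memory_base))
echo "max cache: $((max_cache_bytes / 1024 / 1024)) MiB"   # prints "max cache: 1536 MiB"
```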
>>
>> They actually do have an effect on the automatic cache sizing and probably shouldn't be only under the manual section. When you have automatic cache sizing enabled, those options affect the "fair share" values of the different caches at each cache priority level. I.e. at priority level 0, if both caches want more memory than is available, those ratios determine how much each cache gets. If there is more memory available than requested, each cache gets as much as it wants, and we move on to the next priority level and do the same thing again. So in this case the ratios end up being more like fallback settings for when you don't have enough memory to fulfil all cache requests at a given priority level, but otherwise they are not utilized until we hit that limit. The goal with this scheme is to make sure that "high priority" items in each cache get first dibs on the memory, even if that might skew the ratios. This might be things like rocksdb bloom filters and indexes, or potentially very recent hot items in one cache vs. very old items in another cache. The ratios become more like guidelines than hard limits.
>>
>> When you change to manual mode, you set an overall bluestore cache size and each cache gets a flat percentage of it based on the ratios. With 0.4/0.4 you will always have 40% for onode, 40% for omap, and 20% for data, even if one of those caches does not use all of its memory.
>>
>>> Many thanks for your help with this. I can't find answers to these questions in the docs.
>>>
>>> There might be two reasons for high osd_map memory usage. One is that our OSDs seem to hold a large number of OSD maps:
>>
>> I brought this up in our core team standup last week. Not sure if anyone has had time to look at it yet, though.
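The manual-mode split Mark describes is just flat shares of the overall cache size. A small sketch with the 0.4/0.4 ratios from the thread; the 1 GiB cache size is an assumed example value, not a recommendation:

```shell
# Manual cache sizing: flat percentages, with the remainder going to data.
# Unlike automatic mode, unused share is not redistributed between caches.
bluestore_cache_size=$((1024 * 1024 * 1024))  # assumed example: 1 GiB
meta_pct=40   # bluestore_cache_meta_ratio = 0.4 (onode metadata)
kv_pct=40     # bluestore_cache_kv_ratio   = 0.4 (omap / rocksdb)

meta_bytes=$((bluestore_cache_size * meta_pct / 100))
kv_bytes=$((bluestore_cache_size * kv_pct / 100))
data_bytes=$((bluestore_cache_size - meta_bytes - kv_bytes))
echo "onode: $meta_bytes, kv: $kv_bytes, data: $data_bytes"
```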
>> _______________________________________________
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
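For reference, the heap-file text conversion discussed in this thread follows the memory-profiling document linked above ([1]); the OSD id, binary path, and heap-file names below are placeholders, and on CentOS 7 the tool is `pprof` rather than `google-pprof`:

```shell
# Plain text conversion of a single heap dump (placeholder file names):
pprof --text /usr/bin/ceph-osd osd.0.profile.0001.heap > osd.0.profile.0001.txt

# Conversion against the first dump as a base, showing only the growth
# since that dump was taken:
pprof --text --base osd.0.profile.0001.heap \
    /usr/bin/ceph-osd osd.0.profile.0012.heap > osd.0.growth.txt
```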