Re: [ceph-users] rocksdb mon stores growing until restart

2019-01-15 Thread Dan van der Ster
On Wed, Sep 19, 2018 at 7:01 PM Bryan Stillwell  wrote:
>
> > On 08/30/2018 11:00 AM, Joao Eduardo Luis wrote:
> > > On 08/30/2018 09:28 AM, Dan van der Ster wrote:
> > > Hi,
> > > Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> > > eventually triggering the 'mon is using a lot of disk space' warning?
> > > Since upgrading to luminous, we've seen this happen at least twice.
> > > Each time, we restart all the mons and then stores slowly trim down to
> > > <500MB. We have 'mon compact on start = true', but it's not the
> > > compaction that's shrinking the rocksdb stores -- the space used seems to
> > > decrease over a few minutes only after *all* mons have been restarted.
> > > This reminds me of a hammer-era issue where references to trimmed maps
> > > were leaking -- I can't find that bug at the moment, though.
> >
> > Next time this happens, mind listing the store contents and checking whether you
> > are holding way too many osdmaps? You shouldn't be holding more osdmaps
> > than the default IF the cluster is healthy and all the pgs are clean.
> >
> > I've chased a bug pertaining to this last year, even got a patch, but then
> > was unable to reproduce it. I didn't pursue merging the patch any further
> > (I think I may still have an open PR for it though), simply because it
> > was no longer clear if it was needed.
>
> I just had this happen to me while using ceph-gentle-split on a 12.2.5
> cluster with 1,370 OSDs.  Unfortunately, I restarted the mon nodes (which
> fixed the problem) before finding this thread.  I'm only halfway done
> with the split, so I'll see if the problem resurfaces.
>

I think I've understood what's causing this -- it's related to the
issue we've seen where osdmaps are not being trimmed on the OSDs.
It seems that once oldest_map and newest_map are within 500 epochs of
each other, the maps are never trimmed again until the mons are restarted.
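
In case anyone wants to keep an eye on this, below is a rough sketch of
a check. It assumes 'ceph report' exposes osdmap_first_committed and
osdmap_last_committed in its JSON output (field names may vary between
releases), and it only flags the symptom, not the cause:

#!/usr/bin/env python
# Rough sketch: how many osdmap epochs are the mons holding on to?
# Assumption: 'ceph report' prints JSON containing the
# osdmap_first_committed / osdmap_last_committed fields.
import json
import subprocess

report = json.loads(subprocess.check_output(["ceph", "report"]))
first = report["osdmap_first_committed"]
last = report["osdmap_last_committed"]
held = last - first

print("mons hold osdmap epochs %d..%d (%d maps)" % (first, last, held))
# With healthy trimming the gap should stay near the ~500 epoch floor;
# a gap of many thousands suggests the mons have stopped trimming.
if held > 1000:
    print("WARNING: osdmaps do not appear to be trimming")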

I updated this tracker: http://tracker.ceph.com/issues/37875

-- dan


Re: [ceph-users] rocksdb mon stores growing until restart

2019-01-08 Thread Wido den Hollander



On 8/30/18 10:28 AM, Dan van der Ster wrote:
> Hi,
> 
> Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> eventually triggering the 'mon is using a lot of disk space' warning?
> 
> Since upgrading to luminous, we've seen this happen at least twice.
> Each time, we restart all the mons and then stores slowly trim down to
> <500MB. We have 'mon compact on start = true', but it's not the
> compaction that's shrinking the rocksdb stores -- the space used seems to
> decrease over a few minutes only after *all* mons have been restarted.
> 
> This reminds me of a hammer-era issue where references to trimmed maps
> were leaking -- I can't find that bug at the moment, though.
> 

I just saw your message in the other thread and I thought I'd reply here.

I have seen this recently as well with Luminous 12.2.8 after a large
migration. The cluster grew from ~2000 OSDs to ~2500, and the rebalance
took about 4 days.

After this, all the MON stores were 15-16GB in size and were triggering
the 'mon is using a lot of disk space' warning.

I stopped the MONs, compacted their stores using ceph-monstore-tool, and
started them again; that worked.
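
For reference, that offline procedure is just stop, compact, start for
each mon in turn. A minimal sketch follows; the mon id, the store path,
the systemd unit name and the 'compact' subcommand of ceph-monstore-tool
are all assumptions here, so verify them against your release before
trying anything like this:

#!/usr/bin/env python
# Minimal sketch of an offline mon store compaction, one mon at a time.
# Assumptions: systemd-managed mons, the default store path layout, and
# a ceph-monstore-tool build that accepts a 'compact' command (check the
# tool's help output on your version first).
import subprocess

MON_ID = "mon-a"                              # hypothetical mon id
STORE = "/var/lib/ceph/mon/ceph-" + MON_ID    # default path layout

subprocess.check_call(["systemctl", "stop", "ceph-mon@" + MON_ID])
try:
    subprocess.check_call(["ceph-monstore-tool", STORE, "compact"])
finally:
    # Always bring the mon back, even if compaction fails.
    subprocess.check_call(["systemctl", "start", "ceph-mon@" + MON_ID])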

I'm usually cautious about doing an online compaction, as this sometimes
hurts MON performance.

Not sure yet why this is happening, as the MONs should compact during
normal operations.

Wido

> Cheers, Dan


Re: [ceph-users] rocksdb mon stores growing until restart

2018-09-19 Thread Bryan Stillwell
> On 08/30/2018 11:00 AM, Joao Eduardo Luis wrote:
> > On 08/30/2018 09:28 AM, Dan van der Ster wrote:
> > Hi,
> > Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> > eventually triggering the 'mon is using a lot of disk space' warning?
> > Since upgrading to luminous, we've seen this happen at least twice.
> > Each time, we restart all the mons and then stores slowly trim down to
> > <500MB. We have 'mon compact on start = true', but it's not the
> > compaction that's shrinking the rocksdb stores -- the space used seems to
> > decrease over a few minutes only after *all* mons have been restarted.
> > This reminds me of a hammer-era issue where references to trimmed maps
> > were leaking -- I can't find that bug at the moment, though.
>
> Next time this happens, mind listing the store contents and checking whether you
> are holding way too many osdmaps? You shouldn't be holding more osdmaps
> than the default IF the cluster is healthy and all the pgs are clean.
>
> I've chased a bug pertaining to this last year, even got a patch, but then
> was unable to reproduce it. I didn't pursue merging the patch any further
> (I think I may still have an open PR for it though), simply because it
> was no longer clear if it was needed.

I just had this happen to me while using ceph-gentle-split on a 12.2.5
cluster with 1,370 OSDs.  Unfortunately, I restarted the mon nodes (which
fixed the problem) before finding this thread.  I'm only halfway done
with the split, so I'll see if the problem resurfaces.

Bryan



Re: [ceph-users] rocksdb mon stores growing until restart

2018-08-30 Thread David Turner
The Hammer ticket was https://tracker.ceph.com/issues/13990.  The problem
there was that when OSDs asked each other which map they still needed to
keep, a leaked reference would set that value to NULL, and the OSD would
then never delete an OSD map again until it was restarted.
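
To illustrate that failure mode (a toy model only, not the actual OSD
trimming code): an OSD can only trim maps up to the lowest epoch any
peer still claims to need, so a single entry stuck at NULL/0 pins that
floor until a restart clears the stale entry.

# Toy model of the trimming floor described above; the peer names and
# epoch numbers are made up.

def trim_upper_bound(peer_needs):
    """Epoch below which old osdmaps could safely be deleted."""
    return min(peer_needs.values())

healthy = {"osd.1": 41200, "osd.2": 41187, "osd.3": 41250}
leaked = dict(healthy)
leaked["osd.4"] = 0   # leaked reference, never updated

print(trim_upper_bound(healthy))   # 41187: trimming proceeds
print(trim_upper_bound(leaked))    # 0: nothing ever gets trimmed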

On Thu, Aug 30, 2018 at 3:09 AM Joao Eduardo Luis  wrote:

> On 08/30/2018 09:28 AM, Dan van der Ster wrote:
> > Hi,
> >
> > Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> > eventually triggering the 'mon is using a lot of disk space' warning?
> >
> > Since upgrading to luminous, we've seen this happen at least twice.
> > Each time, we restart all the mons and then stores slowly trim down to
> > <500MB. We have 'mon compact on start = true', but it's not the
> > compaction that's shrinking the rocksdb stores -- the space used seems to
> > decrease over a few minutes only after *all* mons have been restarted.
> >
> > This reminds me of a hammer-era issue where references to trimmed maps
> > were leaking -- I can't find that bug at the moment, though.
>
> Next time this happens, mind listing the store contents and checking whether you
> are holding way too many osdmaps? You shouldn't be holding more osdmaps
> than the default IF the cluster is healthy and all the pgs are clean.
>
> I've chased a bug pertaining to this last year, even got a patch, but then
> was unable to reproduce it. I didn't pursue merging the patch any further
> (I think I may still have an open PR for it though), simply because it
> was no longer clear if it was needed.
>
>   -Joao


Re: [ceph-users] rocksdb mon stores growing until restart

2018-08-30 Thread Joao Eduardo Luis
On 08/30/2018 09:28 AM, Dan van der Ster wrote:
> Hi,
> 
> Is anyone else seeing rocksdb mon stores slowly growing to >15GB,
> eventually triggering the 'mon is using a lot of disk space' warning?
> 
> Since upgrading to luminous, we've seen this happen at least twice.
> Each time, we restart all the mons and then stores slowly trim down to
> <500MB. We have 'mon compact on start = true', but it's not the
> compaction that's shrinking the rocksdb stores -- the space used seems to
> decrease over a few minutes only after *all* mons have been restarted.
> 
> This reminds me of a hammer-era issue where references to trimmed maps
> were leaking -- I can't find that bug at the moment, though.

Next time this happens, mind listing the store contents and checking whether you
are holding way too many osdmaps? You shouldn't be holding more osdmaps
than the default IF the cluster is healthy and all the pgs are clean.
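
One rough way to do that check is to count the osdmap keys in the store
itself. A sketch follows; it assumes the mon is stopped (or you point it
at a copy of store.db), that ceph-kvstore-tool supports 'list <prefix>'
on a rocksdb store, and that osdmap keys live under an 'osdmap' prefix.
The key layout can differ between releases, so treat the count as a
rough indicator only:

#!/usr/bin/env python
# Sketch: count osdmap keys in a mon store. Needs exclusive access, so
# run it against a stopped mon or a copy of store.db. The path below is
# hypothetical.
import subprocess

STORE_DB = "/var/lib/ceph/mon/ceph-mon-a/store.db"

out = subprocess.check_output(
    ["ceph-kvstore-tool", "rocksdb", STORE_DB, "list", "osdmap"])
keys = out.decode("utf-8", "replace").splitlines()

# Full and incremental maps are stored separately, so a healthy mon with
# the default ~500 epoch window should show on the order of 1000 keys.
print("%d osdmap keys in %s" % (len(keys), STORE_DB))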

I've chased a bug pertaining to this last year, even got a patch, but then
was unable to reproduce it. I didn't pursue merging the patch any further
(I think I may still have an open PR for it though), simply because it
was no longer clear if it was needed.

  -Joao