[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
Hi Dan,

nope, osdmap_first_committed is still 1, so it must be some different issue.. I'll report when I have something..

n.
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
You have to wait 5 minutes or so after restarting the mon before it starts trimming.
Otherwise, hmm, I'm not sure.

-- dan
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
Hi Dan,

# ceph report 2>/dev/null | jq .osdmap_first_committed
1
# ceph report 2>/dev/null | jq .osdmap_last_committed
4646

seems like osdmap_first_committed doesn't change at all, and restarting the mons doesn't help.. I don't have any down OSDs, everything seems to be healthy..

BR

nik
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
If untrimmed osdmaps are related, then you should check:
https://tracker.ceph.com/issues/37875, particularly #note6

You can see what the mon thinks the valid range of osdmaps is:

# ceph report | jq .osdmap_first_committed
113300
# ceph report | jq .osdmap_last_committed
113938

Then the workaround to start trimming is to restart the leader.
This shrinks the range on the mon, which then starts telling the osds to trim the range.
Note that the OSDs will only trim 30 osdmaps for each new osdmap generated -- so if you have a lot of osdmaps to trim, you need to generate more.

-- dan
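Since each new osdmap epoch only lets the OSDs trim up to 30 old maps, you can estimate up front how many new epochs you'd have to generate to clear a backlog. A minimal sketch (not from the thread; the helper name is made up for illustration):

```shell
# Estimate how many new osdmap epochs are needed before the OSDs can trim
# a given backlog, assuming the 30-maps-per-new-epoch limit described above.
churn_epochs_needed() {
  local first=$1 last=$2
  # ceiling division: each new epoch allows up to 30 old maps to be trimmed
  echo $(( (last - first + 29) / 30 ))
}

# With the example range from the mon report above:
churn_epochs_needed 113300 113938   # prints 22
```

In practice the real range would come from `ceph report | jq .osdmap_first_committed` and `.osdmap_last_committed` on a live cluster.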
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
OK,

so I can confirm that at least in my case, the problem is caused by old osdmaps not being pruned for some reason, and thus not fitting into the cache. When I increased the osd map cache to 5000 the problem was gone.

The question is why they're not being pruned, even though the cluster is in a healthy state. But you can try checking:

ceph daemon osd.X status

to see how many maps your OSDs are storing, and

ceph daemon osd.X perf dump | grep osd_map_cache_miss

to see if you're experiencing a similar problem..

so I'm going to debug further..

BR

nik

--
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.: +420 591 166 214
fax: +420 596 621 273
mobil: +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-
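The two checks above can be wrapped into a small loop over the admin sockets on an OSD host. A sketch only, with assumptions: admin sockets live in the default /var/run/ceph location, and the `cache_miss_from_dump` helper is made up here to pull the counter out of the perf-dump JSON with the same grep-style matching used in the thread (the exact JSON layout can differ between releases).

```shell
# Helper (hypothetical name): extract the osd_map_cache_miss counter value
# from `ceph daemon ... perf dump` JSON on stdin.
cache_miss_from_dump() {
  grep -o '"osd_map_cache_miss": *[0-9]*' | grep -o '[0-9]*$'
}

# Demonstration on a made-up perf-dump excerpt:
echo '{"osd": {"osd_map_cache_miss": 1234}}' | cache_miss_from_dump  # prints 1234

# On a live OSD host, walk the local admin sockets (skipped if ceph is absent):
if command -v ceph >/dev/null 2>&1; then
  for sock in /var/run/ceph/ceph-osd.*.asok; do
    [ -S "$sock" ] || continue
    echo "== $sock =="
    # oldest_map / newest_map show how many osdmaps this OSD is holding on to
    ceph daemon "$sock" status | grep -E '"(oldest|newest)_map"'
    ceph daemon "$sock" perf dump | cache_miss_from_dump
  done
fi
```

A wide oldest_map..newest_map span combined with a rapidly growing cache-miss counter is the symptom pattern described in this thread.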
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
Hi Paul and others,

while digging deeper, I noticed that when the cluster gets into this state, osd_map_cache_miss on the OSDs starts growing rapidly.. even when I increased the osd map cache size to 500 (which was the default at least for luminous) it behaves the same..

I think this could be related..

I'll try playing more with cache settings..

BR

nik
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
Encountered this one again today, I've updated the issue with new information: https://tracker.ceph.com/issues/44184


Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
Hi, I just wanted to report we've just hit very similar problem.. on mimic (13.2.6). Any manipulation with OSD (ie restart) causes lot of slow ops caused by waiting for new map. It seems those are slowed by SATA OSDs which keep being 100% busy reading for long time until all ops are gone, blocking OPS on unrelated NVME pools - SATA pools are completely unused now. is this possible that those maps are being requested from slow SATA OSDs and it takes such a long time for some reason? why could it take so long? the cluster is very small with very light load.. BR nik On Wed, Feb 19, 2020 at 10:03:35AM +0100, Wido den Hollander wrote: > > > On 2/19/20 9:34 AM, Paul Emmerich wrote: > > On Wed, Feb 19, 2020 at 7:26 AM Wido den Hollander wrote: > >> > >> > >> > >> On 2/18/20 6:54 PM, Paul Emmerich wrote: > >>> I've also seen this problem on Nautilus with no obvious reason for the > >>> slowness once. > >> > >> Did this resolve itself? Or did you remove the pool? > > > > I've seen this twice on the same cluster, it fixed itself the first > > time (maybe with some OSD restarts?) and the other time I removed the > > pool after a few minutes because the OSDs were running into heartbeat > > timeouts. There unfortunately seems to be no way to reproduce this :( > > > > Yes, that's the problem. I've been trying to reproduce it, but I can't. > It works on all my Nautilus systems except for this one. > > As you saw it, Bryan saw it, I expect others to encounter this at some > point as well. > > I don't have any extensive logging as this cluster is in production and > I can't simply crank up the logging and try again. > > > In this case it wasn't a new pool that caused problems but a very old one. > > > > > > Paul > > > >> > >>> In my case it was a rather old cluster that was upgraded all the way > >>> from firefly > >>> > >>> > >> > >> This cluster has also been installed with Firefly. It was installed in > >> 2015, so a while ago. 
> >>
> >> Wido

--
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:  +420 591 166 214
fax:   +420 596 621 273
mobil: +420 777 093 799

www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
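[The ops nik describes can be spotted directly from the OSD admin socket. Below is a small helper - not part of Ceph, just a sketch that assumes the JSON layout printed by `ceph daemon osd.N ops` on Mimic/Nautilus: an `ops` list whose entries carry `description`, `age`, and `type_data.events`, with `"wait for new map"` as the last recorded event of a stuck op.]

```python
import json

def ops_waiting_for_map(ops_json):
    """Return (description, age_in_seconds) for every in-flight op
    whose most recent event is "wait for new map"."""
    stuck = []
    for op in ops_json.get("ops", []):
        events = op.get("type_data", {}).get("events", [])
        if events and events[-1]["event"] == "wait for new map":
            stuck.append((op["description"], op["age"]))
    return stuck

# Usage against a live OSD (requires admin-socket access on the OSD host):
#   ceph daemon osd.3 ops > ops.json
# then:
#   stuck = ops_waiting_for_map(json.load(open("ops.json")))
```

[Sweeping this over the admin sockets of all OSDs on a host would show whether the stuck ops really cluster on the SATA OSDs.]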
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
On 2/19/20 9:34 AM, Paul Emmerich wrote:
> On Wed, Feb 19, 2020 at 7:26 AM Wido den Hollander wrote:
>>
>>
>>
>> On 2/18/20 6:54 PM, Paul Emmerich wrote:
>>> I've also seen this problem on Nautilus with no obvious reason for the
>>> slowness once.
>>
>> Did this resolve itself? Or did you remove the pool?
>
> I've seen this twice on the same cluster, it fixed itself the first
> time (maybe with some OSD restarts?) and the other time I removed the
> pool after a few minutes because the OSDs were running into heartbeat
> timeouts. There unfortunately seems to be no way to reproduce this :(
>

Yes, that's the problem. I've been trying to reproduce it, but I can't.
It works on all my Nautilus systems except for this one.

As you saw it, Bryan saw it, I expect others to encounter this at some
point as well.

I don't have any extensive logging as this cluster is in production and
I can't simply crank up the logging and try again.

> In this case it wasn't a new pool that caused problems but a very old one.
>
>
> Paul
>
>>
>>> In my case it was a rather old cluster that was upgraded all the way
>>> from firefly
>>>
>>>
>>
>> This cluster has also been installed with Firefly. It was installed in
>> 2015, so a while ago.
>>
>> Wido
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
On 2/19/20 9:21 AM, Dan van der Ster wrote:
> On Wed, Feb 19, 2020 at 7:29 AM Wido den Hollander wrote:
>>
>>
>>
>> On 2/18/20 6:54 PM, Paul Emmerich wrote:
>>> I've also seen this problem on Nautilus with no obvious reason for the
>>> slowness once.
>>
>> Did this resolve itself? Or did you remove the pool?
>>
>>> In my case it was a rather old cluster that was upgraded all the way
>>> from firefly
>>>
>>>
>>
>> This cluster has also been installed with Firefly. It was installed in
>> 2015, so a while ago.
>
> FileStore vs. BlueStore relevant ?
>

We've checked that. All the OSDs involved are SSDs running on BlueStore;
they were converted to BlueStore under Luminous. There are some HDD OSDs
left on FileStore.

Wido

> -- dan
>
>
>>
>> Wido
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
On Wed, Feb 19, 2020 at 7:26 AM Wido den Hollander wrote:
>
>
>
> On 2/18/20 6:54 PM, Paul Emmerich wrote:
> > I've also seen this problem on Nautilus with no obvious reason for the
> > slowness once.
>
> Did this resolve itself? Or did you remove the pool?

I've seen this twice on the same cluster, it fixed itself the first
time (maybe with some OSD restarts?) and the other time I removed the
pool after a few minutes because the OSDs were running into heartbeat
timeouts. There unfortunately seems to be no way to reproduce this :(

In this case it wasn't a new pool that caused problems but a very old one.

Paul

> > In my case it was a rather old cluster that was upgraded all the way
> > from firefly
> >
> >
>
> This cluster has also been installed with Firefly. It was installed in
> 2015, so a while ago.
>
> Wido
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
On Wed, Feb 19, 2020 at 7:29 AM Wido den Hollander wrote:
>
>
>
> On 2/18/20 6:54 PM, Paul Emmerich wrote:
> > I've also seen this problem on Nautilus with no obvious reason for the
> > slowness once.
>
> Did this resolve itself? Or did you remove the pool?
>
> > In my case it was a rather old cluster that was upgraded all the way
> > from firefly
> >
> >
>
> This cluster has also been installed with Firefly. It was installed in
> 2015, so a while ago.

FileStore vs. BlueStore relevant?

-- dan

>
> Wido
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
I've also seen this problem on Nautilus with no obvious reason for the
slowness once.

In my case it was a rather old cluster that was upgraded all the way
from firefly

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Feb 18, 2020 at 5:52 PM Wido den Hollander wrote:
>
>
>
> On 8/27/19 11:49 PM, Bryan Stillwell wrote:
> > We've run into a problem on our test cluster this afternoon which is
> > running Nautilus (14.2.2). It seems that any time PGs move on the cluster
> > (from marking an OSD down, setting the primary-affinity to 0, or by using
> > the balancer), a large number of the OSDs in the cluster peg the CPU cores
> > they're running on for a while which causes slow requests. From what I can
> > tell it appears to be related to slow peering caused by osd_pg_create()
> > taking a long time.
> >
> > This was seen on quite a few OSDs while waiting for peering to complete:
> >
> > # ceph daemon osd.3 ops
> > {
> >     "ops": [
> >         {
> >             "description": "osd_pg_create(e179061 287.7a:177739
> > 287.9a:177739 287.e2:177739 287.e7:177739 287.f6:177739 287.187:177739
> > 287.1aa:177739 287.216:177739 287.306:177739 287.3e6:177739)",
> >             "initiated_at": "2019-08-27 14:34:46.556413",
> >             "age": 318.2523453801,
> >             "duration": 318.2524189532,
> >             "type_data": {
> >                 "flag_point": "started",
> >                 "events": [
> >                     {
> >                         "time": "2019-08-27 14:34:46.556413",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:34:46.556413",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:34:46.556299",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:34:46.556456",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:12.456901",
> >                         "event": "dispatched"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:12.456903",
> >                         "event": "wait for new map"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:40:01.292346",
> >                         "event": "started"
> >                     }
> >                 ]
> >             }
> >         },
> > ...snip...
> >         {
> >             "description": "osd_pg_create(e179066 287.7a:177739
> > 287.9a:177739 287.e2:177739 287.e7:177739 287.f6:177739 287.187:177739
> > 287.1aa:177739 287.216:177739 287.306:177739 287.3e6:177739)",
> >             "initiated_at": "2019-08-27 14:35:09.908567",
> >             "age": 294.900191001,
> >             "duration": 294.9006841689,
> >             "type_data": {
> >                 "flag_point": "delayed",
> >                 "events": [
> >                     {
> >                         "time": "2019-08-27 14:35:09.908567",
> >                         "event": "initiated"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:09.908567",
> >                         "event": "header_read"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:09.908520",
> >                         "event": "throttled"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:09.908617",
> >                         "event": "all_read"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:12.456921",
> >                         "event": "dispatched"
> >                     },
> >                     {
> >                         "time": "2019-08-27 14:35:12.456923",
> >                         "event": "wait for new map"
> >                     }
> >                 ]
> >             }
> >         }
> >     ],
> >     "num_ops": 6
> > }
> >
> >
> > That "wait for new map" message made us think something was getting hung up
> > on the monitors, so we restarted them all without any luck.
> >
> > I'll keep investigating, but so far my google searches aren't pulling
> > anything up so I wanted to see if anyone else is running into this?
> >
>
> I've seen this twice now on a ~1400 OSD cluster running Nautilus.
>
> I created a bug report for this: https://tracker.ceph.com/issues/44184
>
> Did you make any progress on this or run into it a second time?
>
> Wido
>
> > Thanks,
> > Bryan
> >
[ceph-users] Re: osd_pg_create causing slow requests in Nautilus
On 8/27/19 11:49 PM, Bryan Stillwell wrote:
> We've run into a problem on our test cluster this afternoon which is
> running Nautilus (14.2.2). It seems that any time PGs move on the cluster
> (from marking an OSD down, setting the primary-affinity to 0, or by using
> the balancer), a large number of the OSDs in the cluster peg the CPU cores
> they're running on for a while which causes slow requests. From what I can
> tell it appears to be related to slow peering caused by osd_pg_create()
> taking a long time.
>
> This was seen on quite a few OSDs while waiting for peering to complete:
>
> # ceph daemon osd.3 ops
> {
>     "ops": [
>         {
>             "description": "osd_pg_create(e179061 287.7a:177739
> 287.9a:177739 287.e2:177739 287.e7:177739 287.f6:177739 287.187:177739
> 287.1aa:177739 287.216:177739 287.306:177739 287.3e6:177739)",
>             "initiated_at": "2019-08-27 14:34:46.556413",
>             "age": 318.2523453801,
>             "duration": 318.2524189532,
>             "type_data": {
>                 "flag_point": "started",
>                 "events": [
>                     {
>                         "time": "2019-08-27 14:34:46.556413",
>                         "event": "initiated"
>                     },
>                     {
>                         "time": "2019-08-27 14:34:46.556413",
>                         "event": "header_read"
>                     },
>                     {
>                         "time": "2019-08-27 14:34:46.556299",
>                         "event": "throttled"
>                     },
>                     {
>                         "time": "2019-08-27 14:34:46.556456",
>                         "event": "all_read"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:12.456901",
>                         "event": "dispatched"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:12.456903",
>                         "event": "wait for new map"
>                     },
>                     {
>                         "time": "2019-08-27 14:40:01.292346",
>                         "event": "started"
>                     }
>                 ]
>             }
>         },
> ...snip...
>         {
>             "description": "osd_pg_create(e179066 287.7a:177739
> 287.9a:177739 287.e2:177739 287.e7:177739 287.f6:177739 287.187:177739
> 287.1aa:177739 287.216:177739 287.306:177739 287.3e6:177739)",
>             "initiated_at": "2019-08-27 14:35:09.908567",
>             "age": 294.900191001,
>             "duration": 294.9006841689,
>             "type_data": {
>                 "flag_point": "delayed",
>                 "events": [
>                     {
>                         "time": "2019-08-27 14:35:09.908567",
>                         "event": "initiated"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:09.908567",
>                         "event": "header_read"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:09.908520",
>                         "event": "throttled"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:09.908617",
>                         "event": "all_read"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:12.456921",
>                         "event": "dispatched"
>                     },
>                     {
>                         "time": "2019-08-27 14:35:12.456923",
>                         "event": "wait for new map"
>                     }
>                 ]
>             }
>         }
>     ],
>     "num_ops": 6
> }
>
>
> That "wait for new map" message made us think something was getting hung up
> on the monitors, so we restarted them all without any luck.
>
> I'll keep investigating, but so far my google searches aren't pulling
> anything up so I wanted to see if anyone else is running into this?
>

I've seen this twice now on a ~1400 OSD cluster running Nautilus.

I created a bug report for this: https://tracker.ceph.com/issues/44184

Did you make any progress on this or run into it a second time?

Wido

> Thanks,
> Bryan
>