Miroslav,

On 12/11/2020 4:57 PM, Miroslav Boháč wrote:
Hi Igor,

Thank you. Yes, you are right.

It seems that the background removal has completed.
You can inspect the "numpg_removing" performance counter to make sure it has completed.
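
For example, something like this on the OSD's host should show it (a sketch, assuming the admin socket is available in its default location):

  ceph daemon osd.<id> perf dump | grep numpg_removing

It should drop to 0 once the removals have finished.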

Is the correct way to fix it to run "ceph-kvstore-tool bluestore-kv <path-to-osd> compact" on all OSDs (one by one)?

Right.
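
Per OSD it would look roughly like this (the OSD has to be stopped for the offline compaction; the path and unit name below assume a default, non-containerized systemd deployment, so adjust to yours):

  systemctl stop ceph-osd@<id>
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact
  systemctl start ceph-osd@<id>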

You might also want to try online compaction: ceph tell osd.X compact

I'm still not sure the latter is effective in fixing the issue, though. I'd appreciate it if you could share your experience with that.
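
If you do try it across all OSDs, a loop like this would iterate over them (untested sketch, using the OSD ids reported by "ceph osd ls"):

  for id in $(ceph osd ls); do ceph tell osd.$id compact; done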

Regards,
Miroslav

On Fri, 11 Dec 2020 at 14:19, Igor Fedotov <ifedo...@suse.de> wrote:

    Hi Miroslav,

    Have you performed massive data removal (PG migration) recently?

    If so, you might want to apply manual DB compaction to your OSDs.

    The positive effect might be just temporary if background removals
    are still in progress, though.


    See https://tracker.ceph.com/issues/47044 and the references there
    for more info on the issue.


    Thanks,

    Igor




    On 12/11/2020 12:50 PM, Miroslav Boháč wrote:
    > Hi,
    >
    > I have a problem with crashing OSD daemons in our Ceph 15.2.6
    > cluster. The problem was temporarily resolved by disabling scrub and
    > deep-scrub. All PGs are active+clean. After a few days I tried to
    > enable scrubbing again, but the problem persists: OSDs with high
    > latencies, laggy PGs, OSDs not responding, OSDs marked down, and the
    > cluster is not usable. The problem appeared after a failover when 10
    > OSDs were marked out.
    >
    > I will appreciate any help or advice on how to resolve this problem.
    >
    > Regards,
    > Miroslav
    >
    >
    >
    >
    >
    > 2020-12-10T21:39:57.721883+0100 osd.1 (osd.1) 5 : cluster [DBG]
    18.11
    > deep-scrub starts
    >
    > 2020-12-10T21:39:57.880861+0100 osd.1 (osd.1) 6 : cluster [DBG]
    18.11
    > deep-scrub ok
    >
    > 2020-12-10T21:39:58.713422+0100 osd.1 (osd.1) 7 : cluster [DBG]
    43.5 scrub
    > starts
    >
    > 2020-12-10T21:39:58.719372+0100 osd.1 (osd.1) 8 : cluster [DBG]
    43.5 scrub
    > ok
    >
    > 2020-12-10T21:39:59.296962+0100 mgr.pve2-prg2a (mgr.91575377)
    118746 :
    > cluster [DBG] pgmap v119000: 1737 pgs: 3 active+clean+laggy, 3
    > active+clean+scrubbing+deep, 1731 active+clean; 3.3 TiB data,
    7.8 TiB used,
    > 21 TiB / 29 TiB avail; 117 KiB/s rd, 2.5 MiB/s wr, 269 op/s
    >
    > 2020-12-10T21:40:00.088421+0100 osd.29 (osd.29) 74 : cluster
    [DBG] 1.13b
    > deep-scrub starts
    >
    > 2020-12-10T21:40:01.300373+0100 mgr.pve2-prg2a (mgr.91575377)
    118747 :
    > cluster [DBG] pgmap v119001: 1737 pgs: 3 active+clean+laggy, 3
    > active+clean+scrubbing+deep, 1731 active+clean; 3.3 TiB data,
    7.8 TiB used,
    > 21 TiB / 29 TiB avail; 101 KiB/s rd, 1.9 MiB/s wr, 202 op/s
    >
    > 2020-12-10T21:40:02.681058+0100 osd.34 (osd.34) 13 : cluster
    [DBG] 1.a2
    > deep-scrub ok
    >
    > 2020-12-10T21:40:03.304009+0100 mgr.pve2-prg2a (mgr.91575377)
    118749 :
    > cluster [DBG] pgmap v119002: 1737 pgs: 3 active+clean+laggy, 3
    > active+clean+scrubbing+deep, 1731 active+clean; 3.3 TiB data,
    7.8 TiB used,
    > 21 TiB / 29 TiB avail; 101 KiB/s rd, 1.9 MiB/s wr, 198 op/s
    >
    > 2020-12-10T21:40:05.316233+0100 mgr.pve2-prg2a (mgr.91575377)
    118750 :
    > cluster [DBG] pgmap v119003: 1737 pgs: 6 active+clean+laggy, 3
    > active+clean+scrubbing+deep, 1728 active+clean; 3.3 TiB data,
    7.8 TiB used,
    > 21 TiB / 29 TiB avail; 150 KiB/s rd, 3.0 MiB/s wr, 249 op/s
    >
    > 2020-12-10T21:40:07.319643+0100 mgr.pve2-prg2a (mgr.91575377)
    118751 :
    > cluster [DBG] pgmap v119004: 1737 pgs: 6 active+clean+laggy, 3
    > active+clean+scrubbing+deep, 1728 active+clean; 3.3 TiB data,
    7.8 TiB used,
    > 21 TiB / 29 TiB avail; 142 KiB/s rd, 2.3 MiB/s wr, 212 op/s
    >
    >
    >
    >
    >
    > 2020-12-10T21:40:15.523134+0100 mon.pve1-prg2a (mon.0) 125943 :
    cluster
    > [DBG] osd.4 reported failed by osd.24
    >
    > 2020-12-10T21:40:15.523325+0100 mon.pve1-prg2a (mon.0) 125944 :
    cluster
    > [DBG] osd.39 reported failed by osd.24
    >
    > 2020-12-10T21:40:16.112299+0100 mon.pve1-prg2a (mon.0) 125946 :
    cluster
    > [WRN] Health check failed: 0 slow ops, oldest one blocked for 32
    sec, osd.8
    > has slow ops (SLOW_OPS)
    >
    > 2020-12-10T21:40:16.202867+0100 mon.pve1-prg2a (mon.0) 125947 :
    cluster
    > [DBG] osd.4 reported failed by osd.34
    >
    > 2020-12-10T21:40:16.202986+0100 mon.pve1-prg2a (mon.0) 125948 :
    cluster
    > [INF] osd.4 failed (root=default,host=pve1-prg2a) (2 reporters from
    > different host after 24.000267 >= grace 22.361677)
    >
    > 2020-12-10T21:40:16.373925+0100 mon.pve1-prg2a (mon.0) 125949 :
    cluster
    > [DBG] osd.39 reported failed by osd.6
    >
    > 2020-12-10T21:40:16.865608+0100 mon.pve1-prg2a (mon.0) 125951 :
    cluster
    > [DBG] osd.39 reported failed by osd.8
    >
    > 2020-12-10T21:40:17.125917+0100 mon.pve1-prg2a (mon.0) 125952 :
    cluster
    > [WRN] Health check failed: 1 osds down (OSD_DOWN)
    >
    > 2020-12-10T21:40:17.139006+0100 mon.pve1-prg2a (mon.0) 125953 :
    cluster
    > [DBG] osdmap e12819: 40 total, 39 up, 30 in
    >
    > 2020-12-10T21:40:17.140248+0100 mon.pve1-prg2a (mon.0) 125954 :
    cluster
    > [DBG] osd.39 reported failed by osd.21
    >
    > 2020-12-10T21:40:17.344244+0100 mgr.pve2-prg2a (mgr.91575377)
    118757 :
    > cluster [DBG] pgmap v119012: 1737 pgs: 9 peering, 61
    stale+active+clean, 1
    > active+clean+scrubbing+deep, 7 active+clean+laggy, 1659
    active+clean; 3.3
    > TiB data, 7.8 TiB used, 14 TiB / 22 TiB avail; 44 KiB/s rd, 2.5
    MiB/s wr,
    > 107 op/s
    >
    > 2020-12-10T21:40:17.378069+0100 mon.pve1-prg2a (mon.0) 125955 :
    cluster
    > [DBG] osd.39 reported failed by osd.26
    >
    > 2020-12-10T21:40:17.424429+0100 mon.pve1-prg2a (mon.0) 125956 :
    cluster
    > [DBG] osd.39 reported failed by osd.18
    >
    > 2020-12-10T21:40:17.829447+0100 mon.pve1-prg2a (mon.0) 125957 :
    cluster
    > [DBG] osd.39 reported failed by osd.36
    >
    > 2020-12-10T21:40:17.847373+0100 mon.pve1-prg2a (mon.0) 125958 :
    cluster
    > [DBG] osd.39 reported failed by osd.1
    >
    > 2020-12-10T21:40:17.858371+0100 mon.pve1-prg2a (mon.0) 125959 :
    cluster
    > [DBG] osd.39 reported failed by osd.17
    >
    > 2020-12-10T21:40:17.915755+0100 mon.pve1-prg2a (mon.0) 125960 :
    cluster
    > [DBG] osd.39 reported failed by osd.28
    >
    >
    >
    >
    >
    > 2020-12-10T21:40:24.151192+0100 mon.pve1-prg2a (mon.0) 125986 :
    cluster
    > [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data
    > availability: 1 pg peering)
    >
    > 2020-12-10T21:40:24.608038+0100 mon.pve1-prg2a (mon.0) 125987 :
    cluster
    > [WRN] Health check update: 0 slow ops, oldest one blocked for 37
    sec, osd.8
    > has slow ops (SLOW_OPS)
    >
    > 2020-12-10T21:40:25.375322+0100 mgr.pve2-prg2a (mgr.91575377)
    118761 :
    > cluster [DBG] pgmap v119017: 1737 pgs: 1
    active+clean+scrubbing+deep, 6
    > active+undersized, 40 active+undersized+wait, 13
    > active+undersized+degraded, 101 active+undersized+degraded+wait, 3
    > stale+active+clean, 2 active+clean+laggy, 1571 active+clean; 3.3
    TiB data,
    > 7.8 TiB used, 21 TiB / 29 TiB avail; 4.0 MiB/s wr, 179 op/s;
    74043/2676420
    > objects degraded (2.766%); 622 KiB/s, 0 objects/s recovering
    >
    > 2020-12-10T21:40:27.379311+0100 mgr.pve2-prg2a (mgr.91575377)
    118762 :
    > cluster [DBG] pgmap v119018: 1737 pgs: 1
    active+clean+scrubbing+deep, 6
    > active+undersized, 40 active+undersized+wait, 13
    > active+undersized+degraded, 101 active+undersized+degraded+wait, 3
    > stale+active+clean, 2 active+clean+laggy, 1571 active+clean; 3.3
    TiB data,
    > 7.8 TiB used, 21 TiB / 29 TiB avail; 3.6 MiB/s wr, 154 op/s;
    74043/2676420
    > objects degraded (2.766%); 510 KiB/s, 0 objects/s recovering
    >
    > 2020-12-10T21:40:29.391905+0100 mgr.pve2-prg2a (mgr.91575377)
    118763 :
    > cluster [DBG] pgmap v119019: 1737 pgs: 1
    active+clean+scrubbing+deep, 11
    > active+undersized, 35 active+undersized+wait, 21
    > active+undersized+degraded, 93 active+undersized+degraded+wait, 3
    > stale+active+clean, 2 active+clean+laggy, 1571 active+clean; 3.3
    TiB data,
    > 7.8 TiB used, 21 TiB / 29 TiB avail; 3.9 MiB/s wr, 170 op/s;
    74043/2676420
    > objects degraded (2.766%); 456 KiB/s, 0 objects/s recovering
    >
    > 2020-12-10T21:40:29.609801+0100 mon.pve1-prg2a (mon.0) 125991 :
    cluster
    > [WRN] Health check update: Degraded data redundancy:
    74043/2676420 objects
    > degraded (2.766%), 114 pgs degraded (PG_DEGRADED)
    >
    > 2020-12-10T21:40:29.609845+0100 mon.pve1-prg2a (mon.0) 125992 :
    cluster
    > [WRN] Health check update: 0 slow ops, oldest one blocked for 42
    sec, osd.8
    > has slow ops (SLOW_OPS)
    >
    > 2020-12-10T21:40:31.395715+0100 mgr.pve2-prg2a (mgr.91575377)
    118764 :
    > cluster [DBG] pgmap v119020: 1737 pgs: 1
    active+clean+scrubbing+deep, 11
    > active+undersized, 35 active+undersized+wait, 21
    > active+undersized+degraded, 93 active+undersized+degraded+wait, 3
    > stale+active+clean, 2 active+clean+laggy, 1571 active+clean; 3.3
    TiB data,
    > 7.8 TiB used, 21 TiB / 29 TiB avail; 154 KiB/s rd, 2.1 MiB/s wr,
    151 op/s;
    > 74043/2676420 objects degraded (2.766%); 425 KiB/s, 0 objects/s
    recovering
    >
    > 2020-12-10T21:40:32.797773+0100 osd.8 (osd.8) 19 : cluster [WRN]
    slow
    > request osd_op(client.91721949.0:169935 1.3b
    > 1:dc12cc94:::rbd_header.0373d6a5f7de48:head [watch ping cookie
    > 139741691437056] snapc 0=[] ondisk+write+known_if_redirected e12816)
    > initiated 2020-12-10T21:39:59.813054+0100 currently delayed
    >
    > 2020-12-10T21:40:33.399277+0100 mgr.pve2-prg2a (mgr.91575377)
    118765 :
    > cluster [DBG] pgmap v119021: 1737 pgs: 1
    active+clean+scrubbing+deep, 13
    > active+undersized, 33 active+undersized+wait, 28
    > active+undersized+degraded, 86 active+undersized+degraded+wait, 3
    > stale+active+clean, 2 active+clean+laggy, 1571 active+clean; 3.3
    TiB data,
    > 7.8 TiB used, 21 TiB / 29 TiB avail; 156 KiB/s rd, 1.6 MiB/s wr,
    144 op/s;
    > 74043/2676420 objects degraded (2.766%)
    >
    > 2020-12-10T21:40:33.841597+0100 osd.8 (osd.8) 20 : cluster [WRN]
    slow
    > request osd_op(client.91721949.0:169935 1.3b
    > 1:dc12cc94:::rbd_header.0373d6a5f7de48:head [watch ping cookie
    > 139741691437056] snapc 0=[] ondisk+write+known_if_redirected e12816)
    > initiated 2020-12-10T21:39:59.813054+0100 currently delayed
    >
    > 2020-12-10T21:40:34.611650+0100 mon.pve1-prg2a (mon.0) 125996 :
    cluster
    > [WRN] Health check update: 0 slow ops, oldest one blocked for 47
    sec,
    > daemons [osd.22,osd.8] have slow ops. (SLOW_OPS)
    >
    > 2020-12-10T21:40:34.612166+0100 mon.pve1-prg2a (mon.0) 125997 :
    cluster
    > [INF] osd.39 failed (root=default,host=pve5-prg2a) (5 reporters from
    > different host after 39.438374 >= grace 37.721344)
    >
    > 2020-12-10T21:40:34.615286+0100 mon.pve1-prg2a (mon.0) 125998 :
    cluster
    > [WRN] Health check update: 2 osds down (OSD_DOWN)
    >
    > 2020-12-10T21:40:34.621801+0100 mon.pve1-prg2a (mon.0) 125999 :
    cluster
    > [DBG] osdmap e12821: 40 total, 38 up, 30 in
    >
    > 2020-12-10T21:40:34.880626+0100 osd.8 (osd.8) 21 : cluster [WRN]
    slow
    > request osd_op(client.91721949.0:169935 1.3b
    > 1:dc12cc94:::rbd_header.0373d6a5f7de48:head [watch ping cookie
    > 139741691437056] snapc 0=[] ondisk+write+known_if_redirected e12816)
    > initiated 2020-12-10T21:39:59.813054+0100 currently delayed
    >
    >
    >
    >
    >
    > 8/0 sis=12819) [28,35] r=-1 lpr=12819 pi=[12507,12819)/1
    crt=12818'17226265
    > lcod 12818'17226264 mlcod 0'0 unknown NOTIFY mbc={}] exit Start
    0.000020 0
    > 0.000000
    >
    >    -823> 2020-12-10T21:40:44.079+0100 7f5310751700  5 osd.4
    pg_epoch: 12822
    > pg[1.cc( v 12818'17226265 (12812'17224439,12818'17226265]
    > local-lis/les=12507/12508 n=1566 ec=171/171 lis/c=12507/12507
    > les/c/f=12508/12508/0 sis=12819) [28,35] r=-1 lpr=12819
    pi=[12507,12819)/1
    > crt=12818'17226265 lcod 12818'17226264 mlcod 0'0 unknown NOTIFY
    mbc={}]
    > enter Started/Stray
    >
    >    -822> 2020-12-10T21:40:44.191+0100 7f5324f7a700  5 prioritycache
    > tune_memory target: 4294967296 mapped: 3494428672 unmapped:
    383787008 heap:
    > 3878215680 old mem: 2845415832 new mem: 2845415832
    >
    >    -821> 2020-12-10T21:40:44.311+0100 7f532ca01700  1 heartbeat_map
    > is_healthy 'OSD::osd_op_tp thread 0x7f530ef4e700' had timed out
    after 15
    >
    >    -820> 2020-12-10T21:40:44.311+0100 7f532ca01700  1 heartbeat_map
    > is_healthy 'OSD::osd_op_tp thread 0x7f5312f56700' had timed out
    after 15
    >
    >    -819> 2020-12-10T21:40:44.311+0100 7f532ca01700  1 osd.4
    12822 is_healthy
    > false -- internal heartbeat failed
    >
    >    -818> 2020-12-10T21:40:44.311+0100 7f532ca01700  1 osd.4
    12822 not
    > healthy; waiting to boot
    >
    >    -817> 2020-12-10T21:40:44.311+0100 7f532ca01700  1 osd.4
    12822 tick
    > checking mon for new map
    >
    >    -816> 2020-12-10T21:40:44.311+0100 7f530e74d700  5 osd.4
    12822 heartbeat
    > osd_stat(store_statfs(0x7c5f5b0000/0x2e36c0000/0xba40000000, data
    > 0x3af1c2f453/0x3afd380000, compress 0x0/0x0/0x0, omap 0x30256c, meta
    > 0x2e33bda94), peers [3,5,6,7,8,9,13,20,28,36] op hist
    > [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,22])
    >
    >    -815> 2020-12-10T21:40:44.403+0100 7f5323337700 10 monclient:
    tick
    >
    >    -814> 2020-12-10T21:40:44.403+0100 7f5323337700 10 monclient:
    > _check_auth_rotating have uptodate secrets (they expire after
    > 2020-12-10T21:40:14.409098+0100)
    >
    >    -813> 2020-12-10T21:40:44.403+0100 7f5323337700 10
    log_client  log_queue
    > is 2 last_log 7 sent 5 num 2 unsent 2 sending 2
    >
    >    -812> 2020-12-10T21:40:44.403+0100 7f5323337700 10
    log_client  will send
    > 2020-12-10T21:40:44.063147+0100 osd.4 (osd.4) 6 : cluster [WRN]
    Monitor
    > daemon marked osd.4 down, but it is still running
    >
    >    -811> 2020-12-10T21:40:44.403+0100 7f5323337700 10
    log_client  will send
    > 2020-12-10T21:40:44.063157+0100 osd.4 (osd.4) 7 : cluster [DBG]
    map e12822
    > wrongly marked me down at e12819
    >
    >    -810> 2020-12-10T21:40:44.403+0100 7f5323337700 10 monclient:
    > _send_mon_message to mon.pve1-prg2a at v2:10.104.200.11:3300/0
    >
    >    -809> 2020-12-10T21:40:45.191+0100 7f5324f7a700  5 prioritycache
    > tune_memory target: 4294967296 mapped: 3494428672 unmapped:
    383787008 heap:
    > 3878215680 old mem: 2845415832 new mem: 2845415832
    >
    >    -808> 2020-12-10T21:40:45.351+0100 7f532ca01700  1 heartbeat_map
    > is_healthy 'OSD::osd_op_tp thread 0x7f530ef4e700' had timed out
    after 15
    >
    >    -807> 2020-12-10T21:40:45.351+0100 7f532ca01700  1 heartbeat_map
    > is_healthy 'OSD::osd_op_tp thread 0x7f5312f56700' had timed out
    after 15
    >
    >
    >
    >
    >
    > -3> 2020-12-10T21:42:10.466+0100 7f5324f7a700  5 prioritycache
    tune_memory
    > target: 4294967296 mapped: 3510689792 unmapped: 367525888 heap:
    3878215680
    > old mem: 2845415832 new mem: 2845415832
    >
    >      -2> 2020-12-10T21:42:10.694+0100 7f532ca01700  1 heartbeat_map
    > is_healthy 'OSD::osd_op_tp thread 0x7f530ef4e700' had timed out
    after 15
    >
    >      -1> 2020-12-10T21:42:10.694+0100 7f532ca01700  1 heartbeat_map
    > is_healthy 'OSD::osd_op_tp thread 0x7f530ef4e700' had suicide
    timed out
    > after 150

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
