[ceph-users] Re: ceph pg mark_unfound_lost delete results in confused ceph
Hi,

just in case someone else runs into this or a similar issue, the following helped to solve it (a condensed command sequence is sketched below the quoted message):

1. Restarting the active mgr moved the PG into "inactive" with an empty acting set:

   pg 10.17 is stuck inactive for 18m, current state unknown, last acting []

2. Since the PG holds no data anyway, we recreated it:

   ceph osd force-create-pg 10.17 --yes-i-really-mean-it

Before the command:

# ceph pg 10.17 query
Error ENOENT: i don't have pgid 10.17

After the command:

# ceph pg 10.17 query
{
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "state": "active+clean",
    "epoch": 14555,
    "up": [
        5,
        6
    ],
    "acting": [
        5,
        6
[...]

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address: Layer7 Networks GmbH, Zum Sonnenberg 1-3, 63571 Gelnhausen
HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632

On 15/01/2024 22:24, Oliver Dzombic wrote:
> Hi,
>
> after osd.15 died at the wrong moment there is:
>
> #ceph health detail
> [WRN] PG_AVAILABILITY: Reduced data availability: 1 pg stale
>     pg 10.17 is stuck stale for 3d, current state stale+active+undersized+degraded, last acting [15]
> [WRN] PG_DEGRADED: Degraded data redundancy: 172/57063399 objects degraded (0.000%), 1 pg degraded, 1 pg undersized
>     pg 10.17 is stuck undersized for 3d, current state stale+active+undersized+degraded, last acting [15]
>
> which will never resolve, as osd.15 no longer exists. So
>
> ceph pg 10.17 mark_unfound_lost delete
>
> was executed. ceph now seems a bit confused about pg 10.17:
>
> While this worked before, it is not working anymore:
>
> # ceph pg 10.17 query
> Error ENOENT: i don't have pgid 10.17
>
> And while this used to point to 15, the map has now changed to 5 and 6 (which is correct):
>
> # ceph pg map 10.17
> osdmap e14425 pg 10.17 (10.17) -> up [5,6] acting [5,6]
>
> According to ceph health, ceph assumes osd.15 is still somehow in charge.
> The pg map thinks 10.17 is on osd.5 and osd.6.
> But pg 10.17 does not seem to really exist, as a query fails.
>
> Any idea what is going wrong and how to fix it?
>
> Thank you!
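For anyone who wants to replay this, here is a condensed sketch of the sequence above. The pgid 10.17 is specific to this cluster, so adjust it to your own; "ceph mgr fail" is assumed here as one way to restart/fail over the active mgr; and force-create-pg discards whatever the PG might still reference, so only use it when the data is genuinely gone:

# 1) confirm the PG metadata is really gone and see where the map points
ceph pg 10.17 query          # expected: Error ENOENT: i don't have pgid 10.17
ceph pg map 10.17            # shows the up/acting set the recreated PG will land on

# 2) fail over the active mgr so the stale "last acting [15]" state is dropped
ceph mgr fail

# 3) recreate the (empty) PG -- this destroys any data the PG may still hold
ceph osd force-create-pg 10.17 --yes-i-really-mean-it

# 4) verify the PG is back and active+clean
ceph pg 10.17 query
ceph -s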
[ceph-users] ceph pg mark_unfound_lost delete results in confused ceph
Hi,

after osd.15 died at the wrong moment there is:

#ceph health detail
[WRN] PG_AVAILABILITY: Reduced data availability: 1 pg stale
    pg 10.17 is stuck stale for 3d, current state stale+active+undersized+degraded, last acting [15]
[WRN] PG_DEGRADED: Degraded data redundancy: 172/57063399 objects degraded (0.000%), 1 pg degraded, 1 pg undersized
    pg 10.17 is stuck undersized for 3d, current state stale+active+undersized+degraded, last acting [15]

which will never resolve, as osd.15 no longer exists. So

ceph pg 10.17 mark_unfound_lost delete

was executed. ceph now seems a bit confused about pg 10.17:

While this worked before, it is not working anymore:

# ceph pg 10.17 query
Error ENOENT: i don't have pgid 10.17

And while this used to point to 15, the map has now changed to 5 and 6 (which is correct):

# ceph pg map 10.17
osdmap e14425 pg 10.17 (10.17) -> up [5,6] acting [5,6]

According to ceph health, ceph assumes osd.15 is still somehow in charge.
The pg map thinks 10.17 is on osd.5 and osd.6.
But pg 10.17 does not seem to really exist, as a query fails.

Any idea what is going wrong and how to fix it?

Thank you!

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address: Layer7 Networks GmbH, Zum Sonnenberg 1-3, 63571 Gelnhausen
HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632
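For reference, a rough set of commands to see where the mon/mgr view and the osdmap disagree about a PG in this state. The pgid and osd ids are taken from the output above; these are generic inspection commands, nothing specific to this failure:

ceph health detail                 # mgr/mon view: still blames osd.15
ceph pg map 10.17                  # osdmap view: up/acting [5,6]
ceph pg 10.17 query                # asks the acting primary directly; ENOENT here
ceph pg ls | grep '^10\.17'        # does the PG show up in the PG listing at all?
ceph osd tree | grep -w 'osd.15'   # confirm osd.15 is really gone / destroyed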
[ceph-users] Re: CEPH Mirrors are lacking packages
Hi,

thank you for your hint, Burkhard!

For de.ceph.com I changed the sync source from eu to us (download.ceph.com), so de.ceph.com should be back in sync with the main source within the next 24h.

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address: Layer7 Networks GmbH, Zum Sonnenberg 1-3, 63571 Gelnhausen
HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632

On 17.04.23 09:39, Burkhard Linke wrote:
> Hi,
>
> at least eu.ceph.com and de.ceph.com are lacking packages for the
> pacific release. All packages not starting with "c" (e.g. librbd, librados,
> radosgw) are missing.
>
> Best regards,
>
> Burkhard Linke
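For context, a ceph.com mirror is essentially an rsync copy of the upstream package tree, and "changing the sync source" just means pointing that rsync at a different host. A minimal sketch of such a sync follows; the rsync module name "ceph" and the local path are assumptions based on the usual mirror setup, so check the official Ceph mirroring documentation and scripts for the exact invocation:

# pull the full package tree from the primary source instead of eu.ceph.com
rsync -rlptv --delete --safe-links download.ceph.com::ceph /srv/mirror/ceph/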
[ceph-users] rbd: map failed: rbd: sysfs write failed -- (108) Cannot send after transport endpoint shutdown
700 10 librbd::image::CloseRequest: 0x55846e97a510 send_flush_readahead
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::image::CloseRequest: 0x55846e97a510 handle_flush_readahead: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::image::CloseRequest: 0x55846e97a510 send_shut_down_object_dispatcher
2021-07-01T10:28:44.270+0200 7f8028ff9700 5 librbd::io::ObjectDispatcher: 0x55846e979670 shut_down:
2021-07-01T10:28:44.270+0200 7f8028ff9700 5 librbd::io::ObjectDispatch: 0x55846e97b430 shut_down:
2021-07-01T10:28:44.270+0200 7f8028ff9700 5 librbd::cache::WriteAroundObjectDispatch: 0x7f80180020d0 shut_down:
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::image::CloseRequest: 0x55846e97a510 handle_shut_down_object_dispatcher: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::image::CloseRequest: 0x55846e97a510 send_flush_op_work_queue
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::image::CloseRequest: 0x55846e97a510 handle_flush_op_work_queue: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::image::CloseRequest: 0x55846e97a510 handle_flush_image_watcher: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::ImageState: 0x7f802c001420 0x7f802c001420 handle_close: r=0
rbd: map failed: (108) Cannot send after transport endpoint shutdown

... which honestly does not tell me much. For now I am out of ideas; can someone point me to a way to analyse / fix this?

New volumes can be created and removed without issues, but they won't map.

Thank you!

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address: Layer7 Networks GmbH, Zum Sonnenberg 1-3, 63571 Gelnhausen
HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632
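In case it helps others hitting the same error: (108) is ESHUTDOWN coming back from the kernel rbd client, and the usual suspects are image features the krbd client does not support, or the client being blocklisted. A rough checklist follows; "rbd/myvolume" is a placeholder image name, and the feature-disable step only applies if unsupported features actually show up, which is not confirmed by the log above:

dmesg | tail -n 30                    # the kernel usually logs why the map was rejected
rbd info rbd/myvolume | grep features # which features are set on the image?
rbd status rbd/myvolume               # any stale watchers left on the image?
ceph osd blocklist ls                 # is this client blocklisted? ("blacklist ls" on older releases)

# only if dmesg complains about unsupported image features:
rbd feature disable rbd/myvolume object-map fast-diff deep-flatten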