[ceph-users] Re: ceph pg mark_unfound_lost delete results in confused ceph

2024-01-16 Thread Oliver Dzombic

Hi,

just in case someone else might run into this or similar issues.

The following helped to solve the issue:

1. Restarting the active mgr

This brought the pg into an inactive state, with an empty last acting set:

pg 10.17 is stuck inactive for 18m, current state unknown, last acting []
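
For reference, a rough sketch of how the active mgr can be restarted (the exact daemon name depends on the deployment):

# show which mgr is currently active
ceph mgr stat

# fail over to a standby mgr (recent releases; older ones expect the active mgr's name as argument)
ceph mgr fail

# or restart the daemon directly on its host, if it is systemd-managed
systemctl restart ceph-mgr@<mgr-instance-name>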


2. Recreating the pg (as there is no data in it anyway):

ceph osd force-create-pg 10.17 --yes-i-really-mean-it


Before the command:

#ceph pg 10.17 query
Error ENOENT: i don't have pgid 10.17

After the command:

# ceph pg 10.17 query
{
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "state": "active+clean",
    "epoch": 14555,
    "up": [
        5,
        6
    ],
    "acting": [
        5,
        6
[...]
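
As a sanity check after the force-create, the usual commands confirm that the pg is healthy and mapped to the expected OSDs:

ceph health detail
ceph pg map 10.17
ceph pg 10.17 query | grep '"state"'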

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address:

Layer7 Networks GmbH
Zum Sonnenberg 1-3
63571 Gelnhausen

Commercial register: HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632

On 15/01/2024 22:24, Oliver Dzombic wrote:

Hi,

after osd.15 died in the wrong moment there is:

#ceph health detail

[WRN] PG_AVAILABILITY: Reduced data availability: 1 pg stale
     pg 10.17 is stuck stale for 3d, current state 
stale+active+undersized+degraded, last acting [15]
[WRN] PG_DEGRADED: Degraded data redundancy: 172/57063399 objects 
degraded (0.000%), 1 pg degraded, 1 pg undersized
     pg 10.17 is stuck undersized for 3d, current state 
stale+active+undersized+degraded, last acting [15]


which will never resolve, as there is no osd.15 anymore.

So a

ceph pg 10.17 mark_unfound_lost delete

was executed.


ceph seems to be a bit confused about pg 10.17 now:

While this worked before, it's not working anymore:
# ceph pg 10.17 query
Error ENOENT: i don't have pgid 10.17


And while this was pointing to 15 before, the map has now changed to 5 and 6
(which is correct):

# ceph pg map 10.17
osdmap e14425 pg 10.17 (10.17) -> up [5,6] acting [5,6]



According to ceph health, ceph assumes that osd.15 is still somehow in 
charge.


The pg map seems to think that 10.17 is on osd.5 and osd.6.

But pg 10.17 does not really seem to exist, as a query fails.

Any idea what's going wrong and how to fix this?

Thank you!


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[ceph-users] ceph pg mark_unfound_lost delete results in confused ceph

2024-01-15 Thread Oliver Dzombic

Hi,

after osd.15 died in the wrong moment there is:

#ceph health detail

[WRN] PG_AVAILABILITY: Reduced data availability: 1 pg stale
pg 10.17 is stuck stale for 3d, current state 
stale+active+undersized+degraded, last acting [15]
[WRN] PG_DEGRADED: Degraded data redundancy: 172/57063399 objects 
degraded (0.000%), 1 pg degraded, 1 pg undersized
pg 10.17 is stuck undersized for 3d, current state 
stale+active+undersized+degraded, last acting [15]


which will never resolve, as there is no osd.15 anymore.

So a

ceph pg 10.17 mark_unfound_lost delete

was executed.
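
For reference, before running such a delete the unfound state of the pg can be inspected with the standard commands, roughly:

# which objects the cluster considers unfound in this pg
ceph pg 10.17 list_unfound

# peering / recovery details
ceph pg 10.17 query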


ceph seems to be a bit confused about pg 10.17 now:

While this worked before, it's not working anymore:
# ceph pg 10.17 query
Error ENOENT: i don't have pgid 10.17


And while this was pointing to 15 before, the map has now changed to 5 and 6
(which is correct):

# ceph pg map 10.17
osdmap e14425 pg 10.17 (10.17) -> up [5,6] acting [5,6]



According to ceph health, ceph assumes that osd.15 is still somehow in 
charge.


The pg map seems to think that 10.17 is on osd.5 and osd.6.

But pg 10.17 does not really seem to exist, as a query fails.
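
One way to narrow this down is to check whether the acting OSDs actually hold an instance of the pg (a sketch; adjust the osd ids to the current mapping):

ceph pg ls-by-osd 5 | grep 10.17
ceph pg ls-by-osd 6 | grep 10.17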

Any idea what's going wrong and how to fix this?

Thank you!

--
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address:

Layer7 Networks GmbH
Zum Sonnenberg 1-3
63571 Gelnhausen

Commercial register: HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CEPH Mirrors are lacking packages

2023-04-17 Thread Oliver Dzombic
Hi,

Thank you for your hint, Burkhard!

For de.ceph.com I changed the sync source from eu to us
(download.ceph.com).

So at least de.ceph.com should be in sync with the main source within
the next 24h.
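
A quick spot check for whether a mirror carries the packages again is to compare its directory listing with the master. The path below is just an example for the pacific el8 repo and assumes de.ceph.com mirrors the same layout:

curl -s https://download.ceph.com/rpm-pacific/el8/x86_64/ | grep -c librbd
curl -s https://de.ceph.com/rpm-pacific/el8/x86_64/ | grep -c librbd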


-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address:

Layer7 Networks GmbH
Zum Sonnenberg 1-3
63571 Gelnhausen

Commercial register: HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632
On 17.04.23 09:39, Burkhard Linke wrote:
> Hi,
> 
> 
> at least eu.ceph.com and de.ceph.com are lacking packages for the
> pacific release. All packages that do not start with "c" (e.g. librbd,
> librados, radosgw) are missing.
> 
> 
> Best regards,
> 
> Burkhard Linke
> 
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rbd: map failed: rbd: sysfs write failed -- (108) Cannot send after transport endpoint shutdown

2021-07-01 Thread Oliver Dzombic
700 10
librbd::image::CloseRequest: 0x55846e97a510 send_flush_readahead
2021-07-01T10:28:44.270+0200 7f8028ff9700 10
librbd::image::CloseRequest: 0x55846e97a510 handle_flush_readahead: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10
librbd::image::CloseRequest: 0x55846e97a510 send_shut_down_object_dispatcher
2021-07-01T10:28:44.270+0200 7f8028ff9700  5
librbd::io::ObjectDispatcher: 0x55846e979670 shut_down:
2021-07-01T10:28:44.270+0200 7f8028ff9700  5 librbd::io::ObjectDispatch:
0x55846e97b430 shut_down:
2021-07-01T10:28:44.270+0200 7f8028ff9700  5
librbd::cache::WriteAroundObjectDispatch: 0x7f80180020d0 shut_down:
2021-07-01T10:28:44.270+0200 7f8028ff9700 10
librbd::image::CloseRequest: 0x55846e97a510
handle_shut_down_object_dispatcher: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10
librbd::image::CloseRequest: 0x55846e97a510 send_flush_op_work_queue
2021-07-01T10:28:44.270+0200 7f8028ff9700 10
librbd::image::CloseRequest: 0x55846e97a510 handle_flush_op_work_queue: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10
librbd::image::CloseRequest: 0x55846e97a510 handle_flush_image_watcher: r=0
2021-07-01T10:28:44.270+0200 7f8028ff9700 10 librbd::ImageState:
0x7f802c001420 0x7f802c001420 handle_close: r=0
rbd: map failed: (108) Cannot send after transport endpoint shutdown



... which honestly does not tell me too much.


For now, I am out of ideas. Can someone suggest a way to analyse / fix this?


New volumes can be created and removed without issues, but they won't map.
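
A sketch of generic client-side checks that can help narrow such a mapping failure down (command names as in recent releases; older ones use 'blacklist' instead of 'blocklist'):

# kernel rbd / libceph messages on the client
dmesg | grep -Ei 'rbd|libceph'

# what the client currently has mapped
rbd showmapped

# whether the client address got blocklisted by the cluster
ceph osd blocklist ls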



Thank you!

-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
Layer7 Networks

mailto:i...@layer7.net

Address:

Layer7 Networks GmbH
Zum Sonnenberg 1-3
63571 Gelnhausen

Commercial register: HRB 96293, Amtsgericht Hanau
Managing director: Oliver Dzombic
VAT ID: DE259845632
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io