[ceph-users] Re: 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create
Thanks, Josh. The cluster is managed by cephadm.

On Thu, 1 Jun 2023, 23:07 Josh Baergen wrote:
> Hi Zakhar,
>
> I'm going to guess that it's a permissions issue arising from
> https://github.com/ceph/ceph/pull/48804, which was included in 16.2.13.
> You may need to change the directory permissions, assuming that you
> manage the directories yourself. If this is managed by cephadm or
> something like that, then that points to a missing migration step in
> the upgrade.
>
> Josh
>
> On Thu, Jun 1, 2023 at 12:34 PM Zakhar Kirpichenko wrote:
>> Hi,
>>
>> I'm having an issue with crash daemons on Pacific 16.2.13 hosts.
>> ceph-crash throws the following error on all hosts:
>>
>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>> please create
>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>> please create
>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>> please create
>>
>> ceph-crash runs in Docker; the container has the directory mounted:
>>
>> -v /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z
>>
>> The mount works correctly:
>>
>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# ls -al crash/posted/
>> total 8
>> drwx------ 2 nobody nogroup 4096 May  6  2021 .
>> drwx------ 3 nobody nogroup 4096 May  6  2021 ..
>>
>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# touch crash/posted/a
>>
>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# docker exec -it c0cd2b8022d8 bash
>>
>> [root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
>> total 8
>> drwx------ 2 nobody nobody 4096 Jun  1 18:26 .
>> drwx------ 3 nobody nobody 4096 May  6  2021 ..
>> -rw-r--r-- 1 root   root      0 Jun  1 18:26 a
>>
>> I.e. the directory exists and is correctly mounted into the crash
>> container, yet ceph-crash says it doesn't exist. How can I convince it
>> that the directory is there?
>>
>> Best regards,
>> Zakhar
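For anyone hitting this after the thread: https://github.com/ceph/ceph/pull/48804
made ceph-crash drop privileges from root to the "ceph" user, so a crash
directory owned by nobody:nogroup with mode 700 becomes unreadable to the
daemon, which it reports as "does not exist". A minimal repair sketch,
assuming the container's "ceph" user maps to UID/GID 167 as in the upstream
Ceph images (verify the UID on your own hosts first; the fsid is taken from
Zakhar's paths, and the systemd unit name follows the usual cephadm
pattern, so adjust it to your host):

    # Check which UID/GID the container's ceph user actually has
    docker exec c0cd2b8022d8 id ceph

    # Re-own the crash tree for the cluster fsid to that UID/GID
    chown -R 167:167 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash

    # Restart the crash daemon so it re-checks the directory
    systemctl restart ceph-3f50555a-ae2a-11eb-a2fc-ffde44714d86@crash.ceph02.service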
[ceph-users] Re: reef v18.1.0 QE Validation status
Still awaiting approvals:

rados - Radek
fs - Kotresh and Patrick
upgrade/pacific-x - good as is, Laura?
upgrade/quincy-x - good as is, Laura?
upgrade/reef-p2p - N/A
powercycle - Brad

On Tue, May 30, 2023 at 9:50 AM Yuri Weinstein wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/61515#note-1
> Release Notes - TBD
>
> Seeking approvals/reviews for:
>
> rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> merge https://github.com/ceph/ceph/pull/51788 for the core)
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya
> upgrade/octopus-x - deprecated
> upgrade/pacific-x - known issues, Ilya, Laura?
> upgrade/reef-p2p - N/A
> clients upgrades - not run yet
> powercycle - Brad
> ceph-volume - in progress
>
> Please reply to this email with approval and/or trackers of known
> issues/PRs to address them.
>
> The gibba upgrade was done and will need to be done again this week.
> LRC upgrade TBD
>
> TIA
[ceph-users] Re: 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create
Hi Zakhar,

I'm going to guess that it's a permissions issue arising from
https://github.com/ceph/ceph/pull/48804, which was included in 16.2.13.
You may need to change the directory permissions, assuming that you manage
the directories yourself. If this is managed by cephadm or something like
that, then that points to a missing migration step in the upgrade.

Josh

On Thu, Jun 1, 2023 at 12:34 PM Zakhar Kirpichenko wrote:
> Hi,
>
> I'm having an issue with crash daemons on Pacific 16.2.13 hosts.
> ceph-crash throws the following error on all hosts:
>
> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
> please create
> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
> please create
> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
> please create
>
> ceph-crash runs in Docker; the container has the directory mounted:
>
> -v /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z
>
> The mount works correctly:
>
> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# ls -al crash/posted/
> total 8
> drwx------ 2 nobody nogroup 4096 May  6  2021 .
> drwx------ 3 nobody nogroup 4096 May  6  2021 ..
>
> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# touch crash/posted/a
>
> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# docker exec -it c0cd2b8022d8 bash
>
> [root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
> total 8
> drwx------ 2 nobody nobody 4096 Jun  1 18:26 .
> drwx------ 3 nobody nobody 4096 May  6  2021 ..
> -rw-r--r-- 1 root   root      0 Jun  1 18:26 a
>
> I.e. the directory exists and is correctly mounted into the crash
> container, yet ceph-crash says it doesn't exist. How can I convince it
> that the directory is there?
>
> Best regards,
> Zakhar
[ceph-users] Re: PGs incomplete - Data loss
Hi,

the short answer is yes, but without knowing anything about the cluster or
what happened exactly, it's a wild guess. In general, you can use
ceph-objectstore-tool [1] to export a PG (one replica or chunk) from an OSD
and import it into a different OSD. I have to add, I never had to do this
myself, so all I can do is point you to the docs. Here's an example [2] of
the command; a sketch also follows below. Depending on the size of your PGs
you'll need plenty of local disk space (or some network mount etc.) for the
export. Note that the OSDs have to be shut down in order for the
objectstore tool to work on them. Maybe try it in a test environment first
to get familiar with how it works.

Regards,
Eugen

[1] https://docs.ceph.com/en/pacific/man/8/ceph-objectstore-tool/
[2] https://hawkvelt.id.au/post/2022-4-5-ceph-pg-export-import/

Quoting Benno Wulf:

> Hi guys,
> I've been awake for 36 hours trying to restore a broken Ceph pool
> (2 PGs incomplete). My VMs are all broken; some boot, some don't.
>
> I also have 5 removed disks with data from that pool "in my hands" -
> don't ask...
>
> So my question: is it possible to restore the data from those disks and
> "add" it to the others for healing?
>
> Best regards
> Ben
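A minimal sketch of the export/import cycle described above, with
placeholder OSD ids, PG id, and file names (adjust everything to your
cluster; both OSDs must be stopped while the tool runs against them):

    # On the host holding a surviving copy of the PG (OSD stopped)
    systemctl stop ceph-osd@7
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-7 \
        --pgid 2.1a --op export --file /mnt/backup/2.1a.export

    # On the host with the target OSD (also stopped)
    systemctl stop ceph-osd@12
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
        --op import --file /mnt/backup/2.1a.export

    # Bring the OSDs back and let peering/recovery take over
    systemctl start ceph-osd@7
    systemctl start ceph-osd@12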
[ceph-users] Re: Cluster without messenger v1, new MON still binds to port 6789
Hi,

> On 1 Jun 2023, at 12:50, Robert Sander wrote:
>
> a cluster has ms_bind_msgr1 set to false in the config database.
>
> Newly created MONs still listen on port 6789 and add themselves as
> providing messenger v1 into the monmap.
>
> How do I change that?
>
> Shouldn't the MONs use the config for ms_bind_msgr1?

This config setting controls what an existing MON binds to; it does not
change how a newly created MON registers itself in the monmap. To disable
msgr1 for a MON completely, run `ceph mon dump`, then pass the v2 address
and the MON name to a command like this:

`ceph mon set-addrs mon1 v2:10.10.10.1:3300`

This will set only the v2 address for your new MON.

k
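A sketch of the whole sequence, with an illustrative MON name and address
(the monmap line shows the usual `ceph mon dump` output format and is not
taken from Robert's cluster):

    # Find the MON's current addrvec; a v1-registered MON shows both entries
    ceph mon dump
    # ... 0: [v2:10.10.10.1:3300/0,v1:10.10.10.1:6789/0] mon.mon1

    # Re-register it with only the v2 address
    ceph mon set-addrs mon1 v2:10.10.10.1:3300

    # Confirm the v1 entry is gone from the monmap
    ceph mon dump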
[ceph-users] 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create
Hi,

I'm having an issue with crash daemons on Pacific 16.2.13 hosts. ceph-crash
throws the following error on all hosts:

ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
please create
ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
please create
ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
please create

ceph-crash runs in Docker; the container has the directory mounted:

-v /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z

The mount works correctly:

18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# ls -al crash/posted/
total 8
drwx------ 2 nobody nogroup 4096 May  6  2021 .
drwx------ 3 nobody nogroup 4096 May  6  2021 ..

18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# touch crash/posted/a

18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# docker exec -it c0cd2b8022d8 bash

[root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
total 8
drwx------ 2 nobody nobody 4096 Jun  1 18:26 .
drwx------ 3 nobody nobody 4096 May  6  2021 ..
-rw-r--r-- 1 root   root      0 Jun  1 18:26 a

I.e. the directory exists and is correctly mounted into the crash
container, yet ceph-crash says it doesn't exist. How can I convince it
that the directory is there?

Best regards,
Zakhar
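A quick check that distinguishes "missing" from "unreadable", using the
container id from the listing above: 16.2.13's ceph-crash no longer runs as
root (see the PR referenced in the replies), so try reading the directory
as the container's unprivileged "ceph" user. A "Permission denied" here is
the real cause behind the "does not exist" message:

    docker exec -u ceph c0cd2b8022d8 ls -al /var/lib/ceph/crash/posted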
[ceph-users] Re: reef v18.1.0 QE Validation status
Hi Yuri,

I'd like to get https://github.com/ceph/ceph/pull/51821 in as well if we can.

Mark

On 5/30/23 11:50, Yuri Weinstein wrote:
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/61515#note-1
> Release Notes - TBD
>
> Seeking approvals/reviews for:
>
> rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> merge https://github.com/ceph/ceph/pull/51788 for the core)
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya
> upgrade/octopus-x - deprecated
> upgrade/pacific-x - known issues, Ilya, Laura?
> upgrade/reef-p2p - N/A
> clients upgrades - not run yet
> powercycle - Brad
> ceph-volume - in progress
>
> Please reply to this email with approval and/or trackers of known
> issues/PRs to address them.
>
> The gibba upgrade was done and will need to be done again this week.
> LRC upgrade TBD
>
> TIA

--
Best Regards,
Mark Nelson
Head of R&D (USA)

Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
[ceph-users] Re: reef v18.1.0 QE Validation status
Thanks Yuri, I'm happy to approve rgw based on the latest run in
https://pulpito.ceph.com/yuriw-2023-05-31_19:25:20-rgw-reef-release-distro-default-smithi/.
There are still some failures that we're tracking, but nothing that should
block the rc.

On Wed, May 31, 2023 at 3:22 PM Yuri Weinstein wrote:
>
> Casey
>
> I will rerun rgw and we will see.
> Stay tuned.
>
> On Wed, May 31, 2023 at 10:27 AM Casey Bodley wrote:
> >
> > On Tue, May 30, 2023 at 12:54 PM Yuri Weinstein wrote:
> > >
> > > Details of this release are summarized here:
> > >
> > > https://tracker.ceph.com/issues/61515#note-1
> > > Release Notes - TBD
> > >
> > > Seeking approvals/reviews for:
> > >
> > > rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> > > merge https://github.com/ceph/ceph/pull/51788 for the core)
> > > rgw - Casey
> >
> > the rgw suite had several new test_rgw_throttle.sh failures that i
> > haven't seen before:
> >
> > qa/workunits/rgw/test_rgw_throttle.sh: line 3: ceph_test_rgw_throttle:
> > command not found
> >
> > those only show up on rhel8 jobs, and none of your later reef runs
> > fail this way
> >
> > Yuri, is it possible that the suite-branch was mixed up somehow? the
> > ceph "sha1: be098f4642e7d4bbdc3f418c5ad703e23d1e9fe0" didn't match the
> > workunit "sha1: 4a02f3f496d9039326c49bf1fbe140388cd2f619"
> >
> > > fs - Venky
> > > orch - Adam King
> > > rbd - Ilya
> > > krbd - Ilya
> > > upgrade/octopus-x - deprecated
> > > upgrade/pacific-x - known issues, Ilya, Laura?
> > > upgrade/reef-p2p - N/A
> > > clients upgrades - not run yet
> > > powercycle - Brad
> > > ceph-volume - in progress
> > >
> > > Please reply to this email with approval and/or trackers of known
> > > issues/PRs to address them.
> > >
> > > gibba upgrade was done and will need to be done again this week.
> > > LRC upgrade TBD
> > >
> > > TIA
[ceph-users] Cluster without messenger v1, new MON still binds to port 6789
Hi,

a cluster has ms_bind_msgr1 set to false in the config database.

Newly created MONs still listen on port 6789 and add themselves as
providing messenger v1 into the monmap.

How do I change that?

Shouldn't the MONs use the config for ms_bind_msgr1?

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
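Two quick checks that make the mismatch Robert describes visible (standard
commands; 6789 is msgr1's well-known port):

    # What the config database says the MONs should bind
    ceph config get mon ms_bind_msgr1

    # On the new MON's host: is anything still listening on the v1 port?
    ss -tlnp | grep 6789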
[ceph-users] Re: cephadm does not honor container_image default value
I reported the bug here: https://tracker.ceph.com/issues/61553

On 15.05.23 at 14:50, Adam King wrote:
> I think with the `config set` commands there is logic to notify the
> relevant mgr modules and update their values. That might not exist with
> `config rm`, so it's still using the last set value. Looks like a real
> bug. Curious what happens if the mgr restarts after the `config rm`,
> i.e. whether it goes back to the default image in that case or not.
> Might take a look later.
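A way to test Adam's restart question on a non-production cluster (the
commands are standard; which image comes back after the failover is exactly
what the tracker issue is about):

    # Drop the container_image override, then force a mgr failover so the
    # cephadm module reloads its configuration from scratch
    ceph config rm global container_image
    ceph mgr fail

    # Once a standby mgr has taken over, check which image cephadm reports
    ceph config get mgr container_image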