[ceph-users] Re: 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create

2023-06-01 Thread Zakhar Kirpichenko
Thanks, Josh. The cluster is managed by cephadm.
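
In case it helps anyone else hitting this, a quick way to check whether
it is the uid mismatch Josh suspects (container ID and fsid as in the
output quoted below; adjust for your own hosts):

docker top c0cd2b8022d8    # shows which uid ceph-crash actually runs as
stat -c '%u:%g %A %n' /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash/posted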

On Thu, 1 Jun 2023, 23:07 Josh Baergen wrote:

> Hi Zakhar,
>
> I'm going to guess that it's a permissions issue arising from
> https://github.com/ceph/ceph/pull/48804, which was included in 16.2.13.
> You may need to change the directory permissions, assuming that you manage
> the directories yourself. If this is managed by cephadm or something like
> that, then that seems like some sort of missing migration in the upgrade.
>
> Josh
>
> On Thu, Jun 1, 2023 at 12:34 PM Zakhar Kirpichenko wrote:
>
>> Hi,
>>
>> I'm having an issue with crash daemons on Pacific 16.2.13 hosts.
>> ceph-crash
>> throws the following error on all hosts:
>>
>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>> please create
>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>> please create
>> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
>> please create
>>
>> ceph-crash runs in docker, the container has the directory mounted: -v
>>
>> /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z
>>
>> The mount works correctly:
>>
>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# ls -al crash/posted/
>> total 8
>> drwx------ 2 nobody nogroup 4096 May  6  2021 .
>> drwx------ 3 nobody nogroup 4096 May  6  2021 ..
>>
>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
>> touch crash/posted/a
>>
>> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
>> docker exec -it c0cd2b8022d8 bash
>>
>> [root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
>> total 8
>> drwx------ 2 nobody nobody 4096 Jun  1 18:26 .
>> drwx------ 3 nobody nobody 4096 May  6  2021 ..
>> -rw-r--r-- 1 root   root  0 Jun  1 18:26 a
>>
>> I.e. the directory actually exists and is correctly mounted in the crash
>> container, yet ceph-crash says it doesn't exist. How can I convince it
>> that the directory is there?
>>
>> Best regards,
>> Zakhar
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef v18.1.0 QE Validation status

2023-06-01 Thread Yuri Weinstein
Still awaiting approvals:

rados - Radek
fs - Kotresh and Patrick

upgrade/pacific-x - good as is, Laura?
upgrade/quincy-x - good as is, Laura?
upgrade/reef-p2p - N/A
powercycle - Brad

On Tue, May 30, 2023 at 9:50 AM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/61515#note-1
> Release Notes - TBD
>
> Seeking approvals/reviews for:
>
> rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> merge https://github.com/ceph/ceph/pull/51788 for
> the core)
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya
> upgrade/octopus-x - deprecated
> upgrade/pacific-x - known issues, Ilya, Laura?
> upgrade/reef-p2p - N/A
> clients upgrades - not run yet
> powercycle - Brad
> ceph-volume - in progress
>
> Please reply to this email with approval and/or trackers of known
> issues/PRs to address them.
>
> gibba upgrade was done and will need to be done again this week.
> LRC upgrade TBD
>
> TIA
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create

2023-06-01 Thread Josh Baergen
Hi Zakhar,

I'm going to guess that it's a permissions issue arising from
https://github.com/ceph/ceph/pull/48804, which was included in 16.2.13. You
may need to change the directory permissions, assuming that you manage the
directories yourself. If this is managed by cephadm or something like that,
then that seems like some sort of missing migration in the upgrade.
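
If it does turn out to be ownership, something along these lines should
sort it out. This is only a sketch: it assumes the containerized
ceph-crash now runs as the "ceph" user (uid/gid 167 in the stock Ceph
containers) - verify the uid on your hosts first and substitute your fsid:

stat /var/lib/ceph/<fsid>/crash /var/lib/ceph/<fsid>/crash/posted
chown -R 167:167 /var/lib/ceph/<fsid>/crash
systemctl restart ceph-<fsid>@crash.<hostname>.service   # unit name may differ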

Josh

On Thu, Jun 1, 2023 at 12:34 PM Zakhar Kirpichenko  wrote:

> Hi,
>
> I'm having an issue with crash daemons on Pacific 16.2.13 hosts. ceph-crash
> throws the following error on all hosts:
>
> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
> please create
> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
> please create
> ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
> please create
>
> ceph-crash runs in docker, the container has the directory mounted: -v
>
> /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z
>
> The mount works correctly:
>
> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# ls -al crash/posted/
> total 8
> drwx------ 2 nobody nogroup 4096 May  6  2021 .
> drwx------ 3 nobody nogroup 4096 May  6  2021 ..
>
> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
> touch crash/posted/a
>
> 18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
> docker exec -it c0cd2b8022d8 bash
>
> [root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
> total 8
> drwx------ 2 nobody nobody 4096 Jun  1 18:26 .
> drwx------ 3 nobody nobody 4096 May  6  2021 ..
> -rw-r--r-- 1 root   root  0 Jun  1 18:26 a
>
> I.e. the directory actually exists and is correctly mounted in the crash
> container, yet ceph-crash says it doesn't exist. How can I convince it
> that the directory is there?
>
> Best regards,
> Zakhar
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: PGs incomplete - Data loss

2023-06-01 Thread Eugen Block

Hi,

The short answer is yes, but without knowing anything about the cluster
or what happened exactly, it's a wild guess.
In general, you can use the ceph-objectstore-tool [1] to export a PG
(one replica or chunk) from an OSD and import it into a different OSD.
I have to add that I've never had to do this myself, so all I can do is
point you to the docs. Here's an example [2] of the command. Depending
on the size of your PGs you'll need plenty of local disk space (or a
network mount etc.) for the export. Note that the OSDs have to be shut
down for the objectstore-tool to work on them. Maybe try it in a test
environment first to get familiar with how it works.


Regards,
Eugen

[1] https://docs.ceph.com/en/pacific/man/8/ceph-objectstore-tool/
[2] https://hawkvelt.id.au/post/2022-4-5-ceph-pg-export-import/
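
To give a rough idea of the shape of it (PG id, OSD ids and paths below
are made-up placeholders; both OSDs must be stopped while the tool runs):

# on the host holding a surviving copy of PG 2.1a (OSD 12 stopped):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
    --pgid 2.1a --op export --file /mnt/backup/2.1a.export
# on the target host (OSD 34 stopped):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-34 \
    --op import --file /mnt/backup/2.1a.export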

Quoting Benno Wulf:


Hi guys,
I've been awake for 36 hours trying to restore a broken Ceph pool (2 PGs
incomplete).

My VMs are all broken. Some boot, some don't...

I also have 5 removed disks with data from that pool "in my hands" -
don't ask...

So my question: is it possible to recover the data from these removed
disks and "add" it to the others for healing?


Best regards
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cluster without messenger v1, new MON still binds to port 6789

2023-06-01 Thread Konstantin Shalygin
Hi,

> On 1 Jun 2023, at 12:50, Robert Sander  wrote:
> 
> a cluster has ms_bind_msgr1 set to false in the config database.
> 
> Newly created MONs still listen on port 6789 and add themselves as providing 
> messenger v1 into the monmap.
> 
> How do I change that?
> 
> Shouldn't the MONs use the config for ms_bind_msgr1?

That config setting controls what the daemon listens on; it doesn't
change how a newly created mon registers itself in the monmap.
To disable msgr1 for a mon completely, run "ceph mon dump", then use the
v2 address and the mon name as arguments for a set-addrs command, like this:

`ceph mon set-addrs mon1 v2:10.10.10.1:3300`

This will set only the v2 address for your new mon.
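
As a purely made-up illustration (mon name and addresses are examples,
matching the command above):

ceph mon dump
#  before:  0: [v2:10.10.10.1:3300/0,v1:10.10.10.1:6789/0] mon.mon1
ceph mon set-addrs mon1 v2:10.10.10.1:3300
ceph mon dump
#  after:   0: v2:10.10.10.1:3300/0 mon.mon1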


k

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create

2023-06-01 Thread Zakhar Kirpichenko
Hi,

I'm having an issue with crash daemons on Pacific 16.2.13 hosts. ceph-crash
throws the following error on all hosts:

ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
please create
ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
please create
ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist;
please create

ceph-crash runs in docker, the container has the directory mounted: -v
/var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/crash:/var/lib/ceph/crash:z

The mount works correctly:

18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]# ls -al crash/posted/
total 8
drwx------ 2 nobody nogroup 4096 May  6  2021 .
drwx------ 3 nobody nogroup 4096 May  6  2021 ..

18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
touch crash/posted/a

18:26 [root@ceph02 /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86]#
docker exec -it c0cd2b8022d8 bash

[root@ceph02 /]# ls -al /var/lib/ceph/crash/posted/
total 8
drwx------ 2 nobody nobody 4096 Jun  1 18:26 .
drwx------ 3 nobody nobody 4096 May  6  2021 ..
-rw-r--r-- 1 root   root  0 Jun  1 18:26 a

I.e. the directory actually exists and is correctly mounted in the crash
container, yet ceph-crash says it doesn't exist. How can I convince it
that the directory is there?

Best regards,
Zakhar
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef v18.1.0 QE Validation status

2023-06-01 Thread Mark Nelson

Hi Yuri,

I'd like to get https://github.com/ceph/ceph/pull/51821 in as well if we 
can.


Mark

On 5/30/23 11:50, Yuri Weinstein wrote:

Details of this release are summarized here:

https://tracker.ceph.com/issues/61515#note-1
Release Notes - TBD

Seeking approvals/reviews for:

rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
merge https://github.com/ceph/ceph/pull/51788 for
the core)
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/octopus-x - deprecated
upgrade/pacific-x - known issues, Ilya, Laura?
upgrade/reef-p2p - N/A
clients upgrades - not run yet
powercycle - Brad
ceph-volume - in progress

Please reply to this email with approval and/or trackers of known
issues/PRs to address them.

gibba upgrade was done and will need to be done again this week.
LRC upgrade TBD

TIA
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Best Regards,
Mark Nelson
Head of R&D (USA)

Clyso GmbH
p: +49 89 21552391 12
a: Loristraße 8 | 80335 München | Germany
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef v18.1.0 QE Validation status

2023-06-01 Thread Casey Bodley
thanks Yuri,

i'm happy to approve rgw based on the latest run in
https://pulpito.ceph.com/yuriw-2023-05-31_19:25:20-rgw-reef-release-distro-default-smithi/.
there are still some failures that we're tracking, but nothing that
should block the rc

On Wed, May 31, 2023 at 3:22 PM Yuri Weinstein  wrote:
>
> Casey
>
> I will rerun rgw and we will see.
> Stay tuned.
>
> On Wed, May 31, 2023 at 10:27 AM Casey Bodley  wrote:
> >
> > On Tue, May 30, 2023 at 12:54 PM Yuri Weinstein  wrote:
> > >
> > > Details of this release are summarized here:
> > >
> > > https://tracker.ceph.com/issues/61515#note-1
> > > Release Notes - TBD
> > >
> > > Seeking approvals/reviews for:
> > >
> > > rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> > > merge https://github.com/ceph/ceph/pull/51788 for
> > > the core)
> > > rgw - Casey
> >
> > the rgw suite had several new test_rgw_throttle.sh failures that i
> > haven't seen before:
> >
> > qa/workunits/rgw/test_rgw_throttle.sh: line 3: ceph_test_rgw_throttle:
> > command not found
> >
> > those only show up on rhel8 jobs, and none of your later reef runs fail 
> > this way
> >
> > Yuri, is it possible that the suite-branch was mixed up somehow? the
> > ceph "sha1: be098f4642e7d4bbdc3f418c5ad703e23d1e9fe0" didn't match the
> > workunit "sha1: 4a02f3f496d9039326c49bf1fbe140388cd2f619"
> >
> > > fs - Venky
> > > orch - Adam King
> > > rbd - Ilya
> > > krbd - Ilya
> > > upgrade/octopus-x - deprecated
> > > upgrade/pacific-x - known issues, Ilya, Laura?
> > > upgrade/reef-p2p - N/A
> > > clients upgrades - not run yet
> > > powercycle - Brad
> > > ceph-volume - in progress
> > >
> > > Please reply to this email with approval and/or trackers of known
> > > issues/PRs to address them.
> > >
> > > gibba upgrade was done and will need to be done again this week.
> > > LRC upgrade TBD
> > >
> > > TIA
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> >
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cluster without messenger v1, new MON still binds to port 6789

2023-06-01 Thread Robert Sander

Hi,

a cluster has ms_bind_msgr1 set to false in the config database.

Newly created MONs still listen on port 6789 and add themselves as 
providing messenger v1 into the monmap.


How do I change that?

Shouldn't the MONs use the config for ms_bind_msgr1?
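
For completeness, the quick checks behind the observation above (run as
root on a mon host):

ceph config get mon ms_bind_msgr1   # returns false (set in the config database)
ceph mon dump                       # the new mon's entry still lists v1:...:6789
ss -tlnp | grep 6789                # the new mon still listens on the v1 port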

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephadm does not honor container_image default value

2023-06-01 Thread Daniel Krambrock

I reported the bug here: https://tracker.ceph.com/issues/61553
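
For anyone who wants to reproduce it, roughly this sequence (the image
tag is only an example; see Adam's note below for the suspected cause):

ceph config set global container_image quay.io/ceph/ceph:v16.2.12
ceph config rm global container_image
ceph config get global container_image   # reports the built-in default again
# ...yet cephadm keeps deploying the previously set image; whether an
# active mgr restart ("ceph mgr fail") clears it is the open question below.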

On 15.05.23 at 14:50, Adam King wrote:

I think with the `config set` commands there is logic to notify the
relevant mgr modules and update their values. That might not exist with
`config rm`, so it's still using the last set value. Looks like a real bug.
I'm curious what happens if the mgr restarts after the `config rm`, and
whether it goes back to the default image in that case. I might take a look later.



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io