[ceph-users] Re: [ext] Re: cephadm auto disk preparation and OSD installation incomplete

2024-03-22 Thread Kuhring, Mathias
a lot, Mathias. Hey Eugen, thank you for the quick reply. The 5 missing disks

[ceph-users] Re: [ext] Re: cephadm auto disk preparation and OSD installation incomplete

2024-03-22 Thread Kuhring, Mathias
cephadm' thing). I remember reports in earlier versions of ceph-volume (probably pre-cephadm) where not all OSDs were created if the host had many disks to deploy. But I can't find those threads right now. And it's strange that on the second cluster no OSD is created at all, bu

[ceph-users] cephadm auto disk preparation and OSD installation incomplete

2024-03-20 Thread Kuhring, Mathias
Dear ceph community, We have trouble with new disks not being properly prepared, i.e. OSDs not being fully installed, by cephadm. We just added one new node with ~40 HDDs to each of two of our Ceph clusters. In one cluster, all but 5 disks got installed automatically. In the other, none got instal
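For reference, a minimal sketch of the first checks that apply in this situation; the host name osd-new-1 is a placeholder, not taken from the thread:

    # How the orchestrator sees the disks on the new host ("Available" column)
    ceph orch device ls osd-new-1 --wide

    # Which OSD service specs exist and how many OSDs they actually created
    ceph orch ls osd

    # On the host itself: what ceph-volume has already prepared
    cephadm ceph-volume lvm list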

[ceph-users] Re: [ext] CephFS pool not releasing space after data deletion

2023-12-05 Thread Kuhring, Mathias
46 PM Frank Schilder wrote: >> Hi Mathias, >> have you made any progress on this? Did the capacity become available >> eventually? >> Best regards, >> Frank Schilder >> AIT Risø Campus >> Bygning 109, rum S14

[ceph-users] Re: [ext] CephFS pool not releasing space after data deletion

2023-10-27 Thread Kuhring, Mathias
scrubbing is actually active. I would appreciate any further ideas. Thanks a lot. Best Wishes, Mathias On 10/23/2023 12:42 PM, Kuhring, Mathias wrote: > Dear Ceph users, > > Our CephFS is not releasing/freeing up space after deleting hundreds of > terabytes of data. > By now, this

[ceph-users] CephFS pool not releasing space after data deletion

2023-10-23 Thread Kuhring, Mathias
Dear Ceph users, Our CephFS is not releasing/freeing up space after deleting hundreds of terabytes of data. By now, this has driven us into a "nearfull" OSD/pool situation and thus throttles IO. We are on ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable). Recently, we m
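A hedged sketch of checks that usually narrow this down; deleted CephFS data often remains referenced by snapshots or lingers in the MDS stray/purge queue. The file system name "cephfs" and the mount path are placeholders:

    # Pool-level usage (STORED vs. USED)
    ceph df detail

    # Snapshots that may still reference the deleted files
    ls /mnt/cephfs/some/dir/.snap

    # Stray and purge-queue counters on the active MDS (counter names can vary per release)
    ceph tell mds.cephfs:0 perf dump | grep -E 'num_strays|pq_'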

[ceph-users] CephFS snapshots: impact of moving data

2023-06-22 Thread Kuhring, Mathias
Dear Ceph community, We want to restructure (i.e. move around) a lot of data (hundreds of terabytes) in our CephFS. And now I was wondering what happens within snapshots when I move data around within a snapshotted folder. I.e. do I need to account for a lot of increased storage usage due to older

[ceph-users] Daily failed capability releases, slow ops, fully stuck IO

2023-02-28 Thread Kuhring, Mathias
Dear Ceph community, for about two or three weeks now, we have had CephFS clients regularly failing to respond to capability releases, accompanied by OSD slow ops. By now, this happens daily, every time clients get more active (e.g. during nightly backups). We mostly observe it with a handful of highly a
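A minimal, hedged sketch of how one might map these health warnings to concrete clients and OSDs; the file system name and the OSD id are placeholders:

    # Which client ids and OSDs the warnings refer to
    ceph health detail

    # MDS sessions, to map a client id to a hostname/mount
    ceph tell mds.cephfs:0 session ls

    # Ops currently stuck on a suspect OSD
    ceph tell osd.12 dump_ops_in_flight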

[ceph-users] Re: [ext] Re: Re: kernel client osdc ops stuck and mds slow reqs

2023-02-28 Thread Kuhring, Mathias
er client, but I assume that's more related to monitors failing due to full disks: [Mi Feb 22 05:59:52 2023] libceph: mon2 (1)172.16.62.12:6789 socket closed (con state OPEN) [Mi Feb 22 05:59:52 2023] libceph: mon2 (1)172.16.62.12:6789 session lost, hunting for new mon [Mi Feb 22 05:59:52

[ceph-users] Re: [ext] Re: Re: kernel client osdc ops stuck and mds slow reqs

2023-02-21 Thread Kuhring, Mathias
On 2/21/2023 1:00 AM, Xiubo Li wrote: > > On 20/02/2023 22:28, Kuhring, Mathias wrote: >> Hey Dan, hey Ilya >> >> I know this issue is two years old already, but we are having similar >> issues. >> >> Do you know, if the fixes got ever backported to RHEL

[ceph-users] Re: kernel client osdc ops stuck and mds slow reqs

2023-02-20 Thread Kuhring, Mathias
Hey Dan, hey Ilya, I know this issue is two years old already, but we are having similar issues. Do you know if the fixes ever got backported to RHEL kernels? Not looking for el7 but rather el8 fixes. Wondering if the patches were backported and we shouldn't actually see these issues. Or if you

[ceph-users] Re: [ERR] OSD_SCRUB_ERRORS: 2 scrub errors

2023-01-09 Thread Kuhring, Mathias
Hey all, I'd like to pick up on this topic, since we have also seen regular scrub errors recently. Roughly one per week for around six weeks now. It's always a different PG, and the repair command always helps after a while. But the regular recurrence seems a bit unsettling. How to best troubleshoo
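For context, a hedged sketch of the usual scrub-error workflow; the PG id 2.1f is a placeholder:

    # Which PG is inconsistent
    ceph health detail

    # What exactly is inconsistent (read errors, digest mismatches, ...)
    rados list-inconsistent-obj 2.1f --format=json-pretty

    # Trigger the repair
    ceph pg repair 2.1f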

[ceph-users] Re: [ext] Copying large file stuck, two cephfs-2 mounts on two cluster

2023-01-03 Thread Kuhring, Mathias
AND different snapshots or the original data behave the same. On 12/22/2022 4:27 PM, Kuhring, Mathias wrote: Dear ceph community, We have two Ceph clusters of equal size, one main and one mirror, both using cephadm and on ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (

[ceph-users] Copying large file stuck, two cephfs-2 mounts on two cluster

2022-12-22 Thread Kuhring, Mathias
Dear ceph community, We have two Ceph clusters of equal size, one main and one mirror, both using cephadm and on ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable). We are stuck with copying a large file (~64 GB) between the CephFS file systems of the two c

[ceph-users] CephFS Snapshot Mirroring slow due to repeating attribute sync

2022-08-23 Thread Kuhring, Mathias
Dear Ceph developers and users, We are using ceph version 17.2.1 (ec95624474b1871a821a912b8c3af68f8f8e7aa1) quincy (stable). We have been using cephadm since version 15 octopus. We mirror several CephFS directories from our main cluster out to a second mirror cluster. In particular with bigger direct

[ceph-users] Re: [ext] Re: snap_schedule MGR module not available after upgrade to Quincy

2022-07-07 Thread Kuhring, Mathias
wrote: > Hello Mathias, > > On 06.07.22 18:27, Kuhring, Mathias wrote: >> Hey Andreas, >> >> thanks for the info. >> >> We also had our MGR reporting crashes related to the module. >> >> We have a second cluster as mirror which we also updated to Qui

[ceph-users] Re: [ext] Re: snap_schedule MGR module not available after upgrade to Quincy

2022-07-06 Thread Kuhring, Mathias
ead of ioctx.remove(SNAP_DB_OBJECT_NAME) it should be > ioctx.remove_object(SNAP_DB_OBJECT_NAME). > > (According to my understanding of > https://docs.ceph.com/en/latest/rados/api/python/.) > > Best regards, > > Andreas > > > On 01.07.22 18:05, Kuhring,

[ceph-users] Re: [ext] Re: cephadm orch thinks hosts are offline

2022-07-01 Thread Kuhring, Mathias
We found a fix for our issue of ceph orch reporting wrong/outdated service information: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/DAFXD46NALFAFUBQEYODRIFWSD6SH2OL/ In our case, our DNS settings were messed up on the cluster hosts AND also within the MGR daemon containers (ceph

[ceph-users] snap_schedule MGR module not available after upgrade to Quincy

2022-07-01 Thread Kuhring, Mathias
Dear Ceph community, After upgrading our cluster to Quincy with cephadm (ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.1), I struggle to re-activate the snapshot schedule module: 0|0[root@osd-1 ~]# ceph mgr module enable snap_schedule 0|1[root@osd-1 ~]# ceph mgr module ls | grep snap

[ceph-users] Re: Orchestrator informations wrong and outdated

2022-07-01 Thread Kuhring, Mathias
We noticed that our DNS settings were inconsistent and partially wrong. NetworkManager had somehow set useless nameservers in the /etc/resolv.conf of our hosts. But in particular, the DNS settings in the MGR containers needed fixing as well. I fixed /etc/resolv.conf on our hosts and in the contain
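A hedged sketch of how the container-side fix can be done; the MGR daemon name is a placeholder, and cephadm enter is run on the host carrying that daemon:

    # Enter the running MGR container
    cephadm enter --name mgr.osd-1.abcdef

    # Inside the container: inspect/fix the resolver configuration
    cat /etc/resolv.conf

    # Back on the host: restart the daemon so it starts from a clean state
    ceph orch daemon restart mgr.osd-1.abcdef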

[ceph-users] Re: [ext] Re: cephadm orch thinks hosts are offline

2022-06-29 Thread Kuhring, Mathias
Hey all, just want to note that I'm also looking for some kind of way to restart/reset/refresh orchestrator. But in my case it's not the hosts but the services which are presumably wrongly reported and outdated: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NHEVEM3ESJYXZ4LPJ24B

[ceph-users] Orchestrator informations wrong and outdated

2022-06-29 Thread Kuhring, Mathias
Dear Ceph community, we are in the curious situation that typical orchestrator queries provide wrong or outdated information about different services. E.g. `ceph orch ls` reports wrong numbers of active services, and `ceph orch ps` reports many OSDs as "starting" and many services with an old
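A hedged sketch of the usual ways to force fresh orchestrator data instead of cached state:

    # Ask the orchestrator to re-query the hosts
    ceph orch ps --refresh
    ceph orch ls --refresh

    # If the cached state is still wrong, fail over to a standby MGR
    # (older releases require the active MGR's name as an argument)
    ceph mgr fail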

[ceph-users] Re: [ext] Re: Rename / change host names set with `ceph orch host add`

2022-06-23 Thread Kuhring, Mathias
m/issues/54571 and should be resolved as of 16.2.9 and 17.2.1. so hopefully removing these legacy daemon dirs won't be necessary in the future. Thanks, - Adam King On Thu, Jun 23, 2022 at 6:42 AM Kuhring, Mathias mailto:mathias.kuhr...@bih-charite.de>> wrote: Hey Adam, thanks aga

[ceph-users] Re: [ext] Re: Rename / change host names set with `ceph orch host add`

2022-06-23 Thread Kuhring, Mathias
n or mgr around. If you make sure all the important services are deployed by label, explicit hosts etc. (just not count) then there should be no risk of any daemons moving at all and this is a pretty safe operation. On Fri, May 20, 2022 at 3:36 AM Kuhring, Mathias mailto:mathias.kuhr...@bih-chari

[ceph-users] Re: [ext] Recover from "Module 'progress' has failed"

2022-05-31 Thread Kuhring, Mathias
. On 5/5/2022 12:49 PM, Kuhring, Mathias wrote: > Dear Ceph community, > We are having an issue with the MGR progress module: > Module 'progress' has failed: ('e7fb29e3-9caf-4b20-b930-cee8474526bb',) > We are currently on ceph version 16.2.7

[ceph-users] Re: [ext] Re: Rename / change host names set with `ceph orch host add`

2022-05-20 Thread Kuhring, Mathias
ial case. Anyway, removing and re-adding the host is the only (reasonable) way to change what it has stored for the hostname that I can remember. Let me know if that doesn't work, - Adam King On Thu, May 19, 2022 at 1:41 PM Kuhring, Mathias mailto:mathias.kuhr...@bih-charite.de>> wrote:
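A hedged sketch of the remove/re-add sequence described above; host name, address and label are placeholders, and labels plus any explicit placements have to be re-applied afterwards:

    # Newer releases may require --force or a drained host for the removal
    ceph orch host rm old-name
    ceph orch host add new-name 192.168.1.10
    ceph orch host label add new-name mon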

[ceph-users] Rename / change host names set with `ceph orch host add`

2022-05-19 Thread Kuhring, Mathias
Dear ceph users, one of our clusters is complaining about plenty of stray hosts and daemons. Pretty much all of them. [WRN] CEPHADM_STRAY_HOST: 6 stray host(s) with 280 daemon(s) not managed by cephadm     stray host osd-mirror-1 has 47 stray daemons: ['mgr.osd-mirror-1.ltmyyh', 'mon.osd-mirro

[ceph-users] Re: [ext] Re: Moving data between two mounts of the same CephFS

2022-05-19 Thread Kuhring, Mathias
this operation a one-off or a regular occurrence? If it is a one-off > then I would do it as administrator. If it is a regular occurrence I > would look into re-arranging the filesystem layout to make this > possible. > Regards > magnus > On Wed, 2022-05-18 at 13:34

[ceph-users] Moving data between two mounts of the same CephFS

2022-05-18 Thread Kuhring, Mathias
Dear Ceph community, Let's say I want to make different sub-directories of my CephFS separately available on a client system, i.e. without exposing the parent directories (because they contain other sensitive data, for instance). I can simply mount specific different folders, as primitively ill
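As an illustration, hedged kernel-client mounts of two sub-directories; monitor address, paths and credentials are placeholders:

    mount -t ceph 192.168.1.1:6789:/projects/a /mnt/a -o name=usera,secretfile=/etc/ceph/usera.secret
    mount -t ceph 192.168.1.1:6789:/projects/b /mnt/b -o name=userb,secretfile=/etc/ceph/userb.secret

Note that a mv between the two mount points is a cross-mount operation from the kernel's point of view, so it falls back to copy+delete even though both directories sit on the same CephFS.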

[ceph-users] Recover from "Module 'progress' has failed"

2022-05-05 Thread Kuhring, Mathias
Dear Ceph community, We are having an issue with the MGR progress module:     Module 'progress' has failed: ('e7fb29e3-9caf-4b20-b930-cee8474526bb',) We are currently on ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable). I'm aware that there are already issues an
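A hedged note on recovery: failing over the MGR usually reloads the module, and a crash report may explain the original failure (older releases require the active MGR's name for mgr fail):

    # Any crash reports left behind by the module failure
    ceph crash ls

    # Fail over to a standby MGR so the modules are reloaded
    ceph mgr fail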

[ceph-users] Ceph Dashboard: The Object Gateway Service is not configured

2022-01-20 Thread Kuhring, Mathias
Dear all, recently, our dashboard is not able to connect to our RGW anymore: Error connecting to Object Gateway: RGW REST API failed request with status code 404 (b'{"Code":"NoSuchKey","BucketName":"admin","RequestId":"tx0f84ffa8b34579fa' b'a-0061e93872-4bc673c-ext-default-primary
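A hedged sketch of re-setting the dashboard's RGW credentials, which is a common first step for this kind of error; the key values and file names are placeholders:

    echo -n 'ACCESSKEY' > rgw-access.key
    echo -n 'SECRETKEY' > rgw-secret.key
    ceph dashboard set-rgw-api-access-key -i rgw-access.key
    ceph dashboard set-rgw-api-secret-key -i rgw-secret.key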