[ceph-users] Re: wrong public_ip after blackout / poweroutage

2024-06-21 Thread Joachim Kraftmayer
Hi, maybe not a direct answer to your question, but I would generally recommend that you verify your ceph config with our ceph analyzer. Simply upload your ceph report there and in a few seconds you will receive feedback on the configuration: https://analyzer.clyso.com/ Joachim Kraftmayer

[ceph-users] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Joachim Kraftmayer
Hi Dietmar, have you already blocked all cephfs clients? Joachim -- Joachim Kraftmayer, CEO | p: +49 89 2152527-21 | e: joachim.kraftma...@clyso.com a: Loristr. 8 | 80335 Munich | Germany | w: https://clyso.com | Utting a. A. | HR: Augsburg | HRB 25866 | USt. ID: DE275430677 On Wed., 19 June

[ceph-users] Re: CephFS as Offline Storage

2024-05-22 Thread Joachim Kraftmayer
I have already installed multiple one-node ceph clusters with cephfs for non-productive workloads in the last few years. Had no major issues, e.g. once a broken HDD. The question is what kind of EC or replication you will use. Also, only power off the node in a clean and healthy state ;-) What

[ceph-users] Re: activating+undersized+degraded+remapped

2024-03-17 Thread Joachim Kraftmayer - ceph ambassador
also helpful is the output of: ceph pg {poolnum}.{pg-id} query ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph Foundation Member https://www.clyso.com/ On 16.03.24 at 13:52, Eugen Block wrote: Yeah, the whole story would help to
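
For reference, a minimal sketch of the query mentioned above; the PG ID 2.1f is only a placeholder for the affected PG:

    ceph pg 2.1f query        # detailed state, peers and recovery info for one PG
    ceph pg dump_stuck        # list PGs stuck unclean/inactive/stale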

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe

2023-11-02 Thread Joachim Kraftmayer - ceph ambassador
Hi, another short note regarding the documentation: the paths are designed for a package installation. The paths for a container installation look a bit different, e.g.: /var/lib/ceph//osd.y/ Joachim ___ ceph ambassador DACH ceph consultant since 2012 Clyso

[ceph-users] Re: Stickyness of writing vs full network storage writing

2023-10-28 Thread Joachim Kraftmayer - ceph ambassador
Hi, I know similar requirements, the motivation and the need behind them. We have chosen a clear approach to this, which also does not make the whole setup too complicated to operate. 1.) Everything that doesn't require strong consistency we do with other tools, especially when it comes to

[ceph-users] Re: Remove empty orphaned PGs not mapped to a pool

2023-10-05 Thread Joachim Kraftmayer - ceph ambassador
@Eugen We saw the same problems 8 years ago. I can only recommend never using cache tiering in production. At Cephalocon this was part of my talk, and as far as I remember cache tiering will also disappear from ceph soon. Cache tiering has been deprecated in the Reef release as it has

[ceph-users] Re: Balancer blocked as autoscaler not acting on scaling change

2023-10-04 Thread Joachim Kraftmayer - ceph ambassador
Hi, we have often seen strange behavior and also interesting pg targets from the pg_autoscaler in recent years. That's why we disable it globally. The commands ceph osd reweight-by-utilization and ceph osd test-reweight-by-utilization are from the time before the upmap balancer was introduced and
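
Sketched as commands, with the pool name as a placeholder; option names should be checked against the release in use:

    ceph config set global osd_pool_default_pg_autoscale_mode off   # default for newly created pools
    ceph osd pool set <pool> pg_autoscale_mode off                  # per existing pool
    ceph balancer mode upmap
    ceph balancer on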

[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster

2023-09-12 Thread Joachim Kraftmayer - ceph ambassador
Another possibility is ceph mon discovery via DNS: https://docs.ceph.com/en/quincy/rados/configuration/mon-lookup-dns/#looking-up-monitors-through-dns Regards, Joachim ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph
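
A minimal sketch of the DNS-based monitor lookup from the linked documentation; zone, hostnames and addresses are placeholders:

    ; DNS zone entries
    mon1.example.com.                 IN A    192.168.1.11
    _ceph-mon._tcp.example.com. 60    IN SRV  10 60 6789 mon1.example.com.

    # ceph.conf on the clients; mon_host can then be omitted
    [global]
    mon_dns_srv_name = ceph-mon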

[ceph-users] Re: replacing all disks in a stretch mode ceph cluster

2023-07-19 Thread Joachim Kraftmayer - ceph ambassador
Hi, a short note: if you replace the disks with larger disks, the weight of the OSD and host will change, and this will force data migration. Perhaps read a bit more about the upmap balancer if you want to avoid data migration during the upgrade phase. Regards, Joachim
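
As a hedged illustration (OSD ID and weight are placeholders), the crush weight of a freshly replaced OSD can be pinned to the old value so that no extra data movement is triggered:

    ceph config set osd osd_crush_initial_weight 0     # optional: new OSDs join with weight 0
    ceph osd crush reweight osd.12 3.63899             # then set the desired weight explicitly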

[ceph-users] Re: CEPH orch made osd without WAL

2023-07-10 Thread Joachim Kraftmayer - ceph ambassador
you can also test directly with ceph bench whether the WAL is on the flash device: https://www.clyso.com/blog/verify-ceph-osd-db-and-wal-setup/ Joachim ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph Foundation Member
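
Following the idea in the linked blog post, a rough sketch (OSD ID and sizes are placeholders): small writes go through the WAL, so their latency reveals whether it sits on flash:

    ceph tell osd.0 bench 65536 4096                    # 64 KiB total in 4 KiB writes
    ceph osd metadata 0 | grep -E 'bluefs_(db|wal)'     # check which devices back DB/WAL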

[ceph-users] Re: Rook on bare-metal?

2023-07-06 Thread Joachim Kraftmayer - ceph ambassador
Hello, we have been following Rook since 2018 and have had our experiences both on bare metal and in the hyperscalers. In the same way, we have been following cephadm from the beginning. Meanwhile, we have been using both in production for years, and the decision which orchestrator to use

[ceph-users] Re: Deleting millions of objects

2023-05-17 Thread Joachim Kraftmayer - ceph ambassador
Hi Rok, try this: rgw_delete_multi_obj_max_num - Max number of objects in a single multi-object delete request   (int, advanced)   Default: 1000   Can update at runtime: true   Services: [rgw] config set WHO: client. or client.rgw KEY: rgw_delete_multi_obj_max_num VALUE: 1
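
Put together as a runnable example (the value 5000 is only an illustration):

    ceph config set client.rgw rgw_delete_multi_obj_max_num 5000
    ceph config get client.rgw rgw_delete_multi_obj_max_num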

[ceph-users] Re: CEPH Version choice

2023-05-15 Thread Joachim Kraftmayer - ceph ambassador
Jens Galsgaard: https://www.youtube.com/playlist?list=PLrBUGiINAakPd9nuoorqeOuS9P9MTWos3 -Original Message- From: Marc Sent: Monday, May 15, 2023 4:42 PM To: Joachim Kraftmayer - ceph ambassador ; Frank Schilder ; Tino Todino Cc: ceph-users@ceph.io Subject: [ceph-users] Re: CEPH Ver

[ceph-users] Re: cephadm does not honor container_image default value

2023-05-15 Thread Joachim Kraftmayer - ceph ambassador
Don't know if it helps, but we have also experienced something similar with osd images. We changed the image tag from version to sha and it did not happen again. ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph Foundation Member
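
A hedged sketch of pinning the image by digest rather than by tag; the digest is a placeholder that has to be taken from the registry:

    ceph config set global container_image quay.io/ceph/ceph@sha256:<digest>
    ceph config get global container_image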

[ceph-users] Re: CEPH Version choice

2023-05-15 Thread Joachim Kraftmayer - ceph ambassador
Hi, I know the problems that Frank has raised. However, it should also be mentioned that many critical bugs have been fixed in the major versions. We are working on the fixes ourselves. We and others have written a lot of tools for ourselves in the last 10 years to improve migration/update

[ceph-users] Re: Veeam backups to radosgw seem to be very slow

2023-04-26 Thread Joachim Kraftmayer - ceph ambassador
"bucket does not exist" or "permission denied". Had received similar error messages with another client program. The default region did not match the region of the cluster. ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph Foundation

[ceph-users] Re: OSD_TOO_MANY_REPAIRS on random OSDs causing clients to hang

2023-04-26 Thread Joachim Kraftmayer - ceph ambassador
Hello Thomas, I would strongly recommend that you read the messages on the mailing list regarding ceph versions 16.2.11, 16.2.12 and 16.2.13. Joachim ___ ceph ambassador DACH ceph consultant since 2012 Clyso GmbH - Premier Ceph Foundation Member

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread Joachim Kraftmayer
I would create a new cluster with Quincy and migrate the data from the old to the new cluster bucket by bucket. Nautilus is out of support, and I would recommend at least using a ceph version that still receives backports. huxia...@horebdata.cn wrote on Tue., 25 Apr 2023, 18:30: > Dear
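
The post does not name a tool; one hedged possibility for an S3-level, bucket-by-bucket copy is rclone, with the remote names "oldceph" and "newceph" assumed to be configured against the old and new RGW endpoints:

    rclone sync oldceph:mybucket newceph:mybucket --checksum --transfers 16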

[ceph-users] Re: Misplaced objects greater than 100%

2023-04-06 Thread Joachim Kraftmayer
Perhaps this option triggered the crush map change: osd crush update on start. Each time the OSD starts, it verifies it is in the correct location in the CRUSH map and, if it is not, it moves itself. https://docs.ceph.com/en/quincy/rados/operations/crush-map/ Joachim Johan Hattne wrote on
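
If this is the cause, the option can be checked or, where CRUSH locations are managed by hand, disabled:

    ceph config get osd osd_crush_update_on_start
    ceph config set osd osd_crush_update_on_start false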

[ceph-users] Re: Ceph Failure and OSD Node Stuck Incident

2023-03-31 Thread Joachim Kraftmayer
Hi Peter, from my experience I would recommend replacing the Samsung Evo SSDs with datacenter SSDs. Regards, Joachim Clyso GmbH - Ceph Foundation Member wrote on Thu., 30 Mar 2023, 16:37: > We encountered a Ceph failure where the system became

[ceph-users] Re: Ceph performance problems

2023-03-22 Thread Joachim Kraftmayer
Hi Dominik, if you need performance, the default configuration is not designed for that. When tuning for performance, please pay attention to the sources: if a benchmark or similar is described there, you must check whether it is suitable for production operation. Regards, Joachim

[ceph-users] Re: s3 compatible interface

2023-03-21 Thread Joachim Kraftmayer
Hi, maybe I should have mentioned the zipper project as well; I watched both the IBM and SUSE presentations at FOSDEM 2023. I personally follow the zipper project with great interest. Joachim ___ Ceph Foundation Member On 21.03.23 at 01:27, Matt Benjamin wrote:

[ceph-users] Re: Very slow backfilling/remapping of EC pool PGs

2023-03-21 Thread Joachim Kraftmayer
Which Ceph version are you running? Is mclock active? Joachim ___ Clyso GmbH - Ceph Foundation Member On 21.03.23 at 06:53, Gauvain Pocentek wrote: Hello all, We have an EC (4+2) pool for RGW data, with HDDs + SSDs for WAL/DB. This pool has 9 servers with
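
A quick way to answer both questions from the command line (option names as in recent releases):

    ceph versions
    ceph config get osd osd_op_queue         # "mclock_scheduler" is the default from Quincy on
    ceph config get osd osd_mclock_profile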

[ceph-users] Re: Out of Memory after Upgrading to Nautilus

2021-05-05 Thread Joachim Kraftmayer
Hi Christoph, can you send me the ceph config set ... command you used and/or the ceph config dump output? Regards, Joachim Clyso GmbH Homepage: https://www.clyso.com On 05.05.2021 at 16:30, Christoph Adomeit wrote: I manage a historical cluster of several ceph nodes, each with 128 GB

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Joachim Kraftmayer
Create a new crush rule with the correct failure domain, test it properly, and assign it to the pool(s). -- Best regards, Joachim Kraftmayer ___ Clyso GmbH On 05.05.2021 at 15:11, Andres Rojas Guerrero wrote: Nice observation, how can I avoid this problem? El
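
Sketched with placeholder names for a replicated pool with host as the failure domain (for EC pools the failure domain is part of the erasure-code profile instead):

    ceph osd crush rule create-replicated replicated_host default host
    ceph osd pool set mypool crush_rule replicated_host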

[ceph-users] Re: Ceph cluster not recover after OSD down

2021-05-05 Thread Joachim Kraftmayer
Hi Andres, the crush rule with ID 1 distributes your EC chunks over the OSDs without considering the ceph host, as Robert already suspected. Greetings, Joachim ___ Clyso GmbH Homepage: https://www.clyso.com On 05.05.2021 at 13:16, Andres Rojas wrote

[ceph-users] Re: Need Clarification on Maintenance Shutdown Procedure

2021-03-02 Thread Joachim Kraftmayer
Hello Dave, I recommend you read this documentation: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/understanding-process-managemnet-for-ceph#powering-down-and-rebooting-a-red-hat-ceph-storage-cluster-management Regards, Joachim
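
The linked procedure essentially quiesces clients and sets the usual flags before shutdown; as a hedged outline:

    ceph osd set noout
    ceph osd set norecover
    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set nodown
    ceph osd set pause
    # power down OSD nodes, then MON/MGR nodes; after power-up, unset the flags in reverse order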

[ceph-users] Re: Increasing QD=1 performance (lowering latency)

2021-02-11 Thread Joachim Kraftmayer
Hi Wido, do you know what happened to Mellanox's ceph RDMA project from 2018? We will test ARM Ampere for all-flash this half-year and will probably get the opportunity to experiment with software-defined memory. Regards, Joachim ___ Clyso GmbH On 08.02.2021 at

[ceph-users] Re: Debian repo for ceph-iscsi

2020-12-18 Thread Joachim Kraftmayer
Hello Chris, you are looking for this: https://packages.debian.org/buster-backports/ceph-iscsi Regards, Joachim -- ___ Clyso GmbH On 11.12.2020 at 19:11, Chris Palmer wrote: I just went to set up an iscsi gateway on a Debian Buster / Octopus cluster and hit
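
Assuming the backports repository is enabled in APT (Debian Buster was current at the time of the post), installation would roughly be:

    echo 'deb http://deb.debian.org/debian buster-backports main' > /etc/apt/sources.list.d/backports.list
    apt update
    apt install -t buster-backports ceph-iscsi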

[ceph-users] Re: Increase number of objects in flight during recovery

2020-12-03 Thread Joachim Kraftmayer
Hi Frank, these are the values we used to reduce the recovery impact before Luminous: #reduce recovery impact osd max backfills osd recovery max active osd recovery max single start osd recovery op priority osd recovery threads osd backfill scan max osd backfill scan min I do not know how many osds and
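
For illustration, two of these can be raised at runtime to allow more objects in flight; the values are placeholders and should be tested carefully (with mclock active in newer releases these limits are handled differently):

    ceph tell 'osd.*' injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'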

[ceph-users] Re: pgs stuck backfill_toofull

2020-11-02 Thread Joachim Kraftmayer
Stefan, I agree with you. In Jewel the recovery process is not really throttled by default. With Luminous and later you benefit from dynamic resharding and the handling of too-big OMAPs. Regards, Joachim ___ Clyso GmbH On 29.10.2020 at 21:30, Stefan wrote

[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread Joachim Kraftmayer
A few years ago Dan van der Ster and I were working on two similar scripts for increasing pgs. Just have a look at the following link: https://github.com/cernceph/ceph-scripts/blob/master/tools/split/ceph-gentle-split ___ Clyso GmbH On 18.08.2020 at
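
The linked script essentially raises pg_num in small increments and waits for the cluster to settle in between; a manual sketch with placeholder values:

    ceph osd pool set mypool pg_num 1056
    ceph osd pool set mypool pgp_num 1056
    ceph -s      # wait until backfill finishes before the next increment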

[ceph-users] Re: Not able to start object gateway

2020-04-27 Thread Joachim Kraftmayer
Hello Sailaja, do you still have this problem? Have you checked the crush rule for your pools to see if the data distribution rule is met? Regards, Joachim ___ Clyso GmbH Homepage: https://www.clyso.com On 24.04.2020 at 16:02, Sailaja Yedugundla wrote: I
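
Checks along those lines could look like this:

    ceph osd pool ls detail      # crush_rule, size and min_size per pool
    ceph osd crush rule dump
    ceph osd tree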