[ceph-users] Re: How to specify id on newly created OSD with Ceph Orchestrator
On 7/23/24 08:24, Iztok Gregori wrote:
> Am I missing something obvious, or is there no way with the Ceph orchestrator to specify an id during OSD creation?

Why would you want to do that? A new OSD always gets the lowest available ID.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin
https://www.heinlein-support.de
Tel: 030 / 405051-43
Fax: 030 / 405051-19
Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Cephadm has a small wart
On 7/18/24 21:50, Tim Holloway wrote:
> I've been setting up a cookbook OSD creation process and as I walked through the various stages, I noted that the /etc/redhat-release file said "CentOS Stream 8".

This is the case because the cephadm orchestrator uses container images based on CentOS 8. When you execute "cephadm shell" it starts a container with that image for you.

Regards
--
Robert Sander
[ceph-users] Re: cephadm for Ubuntu 24.04
Hi,

On 7/12/24 10:47, tpDev Tester wrote:
> Finally, I'm looking for a solution for production use and it would be great if I don't have to leave the usual Ubuntu procedures, especially when it comes to updates. We are also confused about the "RC vs. LTS" thing.

I would suggest using Ubuntu 22.04 LTS as the base operating system. You can use cephadm on top of that without issues.

Regards
--
Robert Sander
[ceph-users] Re: Use of db_slots in DriveGroup specification?
Hi,

On 7/11/24 09:01, Eugen Block wrote:
> apparently, db_slots is still not implemented. I just tried it on a test cluster with 18.2.2.

I am thinking about a PR to correct the documentation.

Regards
--
Robert Sander
[ceph-users] use of db_slots in DriveGroup specification?
Hi,

what is the purpose of the db_slots attribute in a DriveGroup specification? My interpretation of the documentation is that I can define how many OSDs use one DB device.

https://docs.ceph.com/en/reef/cephadm/services/osd/#additional-options says: "db_slots - How many OSDs per DB device"

The default for the cephadm orchestrator is to create as many DB volumes on a DB device as are needed for the number of OSDs. In a scenario where there are empty slots for HDDs and an existing DB device should not be used fully, "db_slots" could be used. But even if db_slots is in an OSD service spec, the orchestrator will only create as many DB volumes as there are HDDs currently available.

There is a discussion from 2021 where the use of "block_db_size" and "limit" is suggested: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/6EVOYOHS3BTTNLKBRGLPTZ76HPNLP6FC/#6EVOYOHS3BTTNLKBRGLPTZ76HPNLP6FC

Shouldn't db_slots make that easier? Is this a bug in the orchestrator?

Regards
--
Robert Sander
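The workaround from the linked 2021 thread can be sketched as an OSD service spec like the following. Note this is only an illustration: the service id, host pattern and the 60G DB size are placeholders, not values from this thread.

```yaml
service_type: osd
service_id: osd_hdd_with_shared_db   # placeholder name
placement:
  host_pattern: 'osd-host*'          # placeholder host pattern
spec:
  data_devices:
    rotational: 1                    # OSDs on the HDDs
  db_devices:
    rotational: 0                    # DBs on the SSD/NVMe
  block_db_size: 60G                 # fixed size per DB volume, reserving
                                     # space for HDDs added later
```

With a fixed block_db_size the orchestrator carves out same-sized DB volumes instead of dividing the whole DB device among the HDDs currently present, which is the effect one would have expected from db_slots.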
[ceph-users] Re: cannot delete service by ceph orchestrator
On 29.06.24 12:11, Alex from North wrote:
> osd.osd_using_paths  108  9m ago  -
> prometheus  ?:9095  3/3  9m ago  6d  ceph1;ceph6;ceph10;count:3
>
> And then ceph orch rm:
>
> root@ceph1:~/ceph-rollout# ceph orch rm osd.osd_using_paths
> Invalid service 'osd.osd_using_paths'. Use 'ceph orch ls' to list available services.
>
> cannot find it for some reason.

> unmanaged: true

As long as there are OSDs that were created by a drivegroup specification you cannot really delete the service. It is set to unmanaged and therefore will not create any new OSDs.

Regards
--
Robert Sander
[ceph-users] Re: Slow down RGW updates via orchestrator
Hi,

On 6/26/24 11:49, Boris wrote:
> Is there a way to only update 1 daemon at a time?

You can use the "staggered upgrade" feature: https://docs.ceph.com/en/reef/cephadm/upgrade/#staggered-upgrade

Regards
--
Robert Sander
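A staggered upgrade of only the RGW daemons, one at a time, could look like this. The image tag is a placeholder; use the release you are actually upgrading to.

```shell
# Upgrade only RGW daemons, at most one at a time:
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.2 \
    --daemon-types rgw --limit 1

# Watch the progress:
ceph orch upgrade status
```

After the limited batch has finished, the same command can be repeated (or run without --daemon-types/--limit) to upgrade the rest of the cluster.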
[ceph-users] Re: Ceph crash :-(
Hi,

On 13.06.24 20:29, Ranjan Ghosh wrote:
> Other Ceph nodes run on 18.2 which came with the previous Ubuntu version. I wonder if I could easily switch to Ceph packages or whether that would cause even more problems. Perhaps it's more advisable to wait until Ubuntu releases proper packages.

Read the Ceph documentation about upgrading a Ceph cluster. You cannot just upgrade packages on one node and reboot it. There is a certain order to follow.

This is why it's bad to use the packages shipped by the distribution: when upgrading the distribution on one node you also upgrade the Ceph packages.

download.ceph.com has packages for Ubuntu 22.04 and nothing for 24.04. Therefore I would assume Ubuntu 24.04 is not a supported platform for Ceph (unless you use the cephadm orchestrator and containers).

BTW: Please keep the discussion on the mailing list.

Regards
--
Robert Sander
Heinlein Support GmbH
Linux: Akademie - Support - Hosting
http://www.heinlein-support.de
Tel: 030-405051-43
Fax: 030-405051-19
Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin
[ceph-users] Re: Ceph crash :-(
On 13.06.24 18:18, Ranjan Ghosh wrote:
> What's more, APT says I now got a Ceph version (19.2.0~git20240301.4c76c50-0ubuntu6) which doesn't even have any official release notes.

Ubuntu 24.04 ships with that version from a git snapshot. You have to ask Canonical why they did this.

I would not use Ceph packages shipped by a distribution but always the ones from download.ceph.com, or even better the container images that come with the orchestrator.

Which version do your other Ceph nodes run on?

Regards
--
Robert Sander
[ceph-users] Re: tuning for backup target cluster
Hi,

On 6/4/24 16:15, Anthony D'Atri wrote:
> I've wondered for years what the practical differences are between using a namespace and a conventional partition.

Namespaces show up as separate block devices in the kernel. The orchestrator will not touch any devices that contain a partition table or logical volume signatures.

Regards
--
Robert Sander
[ceph-users] Re: Update OS with clean install
Hi,

On 6/4/24 14:35, Sake Ceph wrote:
> * Store host labels (we use labels to deploy the services)
> * Fail-over MDS and MGR services if running on the host
> * Remove host from cluster
> * Add host to cluster again with correct labels

AFAIK the steps above are not necessary. It should be sufficient to do these:

* Set host in maintenance mode
* Reinstall host with newer OS
* Configure host with correct settings (for example cephadm user SSH key etc.)
* Unset maintenance mode for the host
* For OSD hosts run: ceph cephadm osd activate

Regards
--
Robert Sander
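The steps above can be sketched as the following command sequence; HOST is a placeholder for the host being reinstalled.

```shell
# Put the host into maintenance mode (stops and marks its daemons):
ceph orch host maintenance enter HOST

# ... reinstall the OS, then restore the cephadm SSH key,
#     podman/docker and the cephadm user ...

# Bring the host back:
ceph orch host maintenance exit HOST

# On OSD hosts, let cephadm re-activate the existing OSDs
# from the untouched data disks:
ceph cephadm osd activate HOST
```

Because the OSD data and the LVM metadata on the disks survive the reimage, "ceph cephadm osd activate" only has to recreate the systemd units and containers for the existing OSDs.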
[ceph-users] Re: tuning for backup target cluster
On 6/4/24 12:47, Lukasz Borek wrote:
> Using cephadm, is it possible to cut off part of the NVMe drive for an OSD and leave the rest of the space for RocksDB/WAL?

Not out of the box. You could check if your devices support NVMe namespaces and create more than one namespace on the device. The kernel then sees multiple block devices and for the orchestrator they are completely separate.

Regards
--
Robert Sander
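A rough sketch with nvme-cli, assuming the drive actually supports namespace management (many consumer drives do not). The namespace sizes (in blocks) and the controller id are placeholders; check "nvme id-ctrl" for your device first.

```shell
# Remove the single default namespace (destroys all data on it!):
nvme delete-ns /dev/nvme0 --namespace-id=1

# Create two namespaces, one large for the OSD, one small for DB/WAL
# (sizes in logical blocks, placeholders):
nvme create-ns /dev/nvme0 --nsze=3125627568 --ncap=3125627568 --flbas=0
nvme create-ns /dev/nvme0 --nsze=250069680  --ncap=250069680  --flbas=0

# Attach both namespaces to the controller (controller id is a placeholder):
nvme attach-ns /dev/nvme0 --namespace-id=1 --controllers=0
nvme attach-ns /dev/nvme0 --namespace-id=2 --controllers=0

# After a rescan/reset the kernel shows /dev/nvme0n1 and /dev/nvme0n2,
# which cephadm treats as two independent devices.
```

The two resulting block devices can then be referenced separately as data_devices and db_devices in an OSD service spec.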
[ceph-users] Re: How to create custom container that exposes a listening port?
On 5/31/24 16:07, Robert Sander wrote:
> extra_container_args:
> - "--publish 8080/tcp"

Never mind, in the custom container service specification it's "args", not "extra_container_args".

Regards
--
Robert Sander
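For reference, a minimal custom container spec using "args" could look like this; the service id, host and image are placeholders, not values from this thread.

```yaml
service_type: container
service_id: mycontainer            # placeholder name
placement:
  hosts:
    - host1                        # placeholder host
spec:
  image: docker.io/library/nginx:latest   # placeholder image
  args:
    - "--publish=8080:8080"        # passed to podman/docker run
```

Applied with "ceph orch apply -i <file>", the args list is handed to the container runtime, so podman publishes the port.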
[ceph-users] How to create custom container that exposes a listening port?
Hi,

with the custom container service it is possible to deploy other services with the cephadm orchestrator: https://docs.ceph.com/en/reef/cephadm/services/custom-container/

The application inside the container wants to listen on a port (say 8080/tcp). How do I tell podman to publish this port?

I tried adding this to the service specification:

extra_container_args:
- "--publish 8080/tcp"

but this results in this error:

Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument 'extra_container_args'

https://docs.ceph.com/en/reef/cephadm/services/#extra-container-arguments

Regards
--
Robert Sander
[ceph-users] Re: How to setup NVMeoF?
Hi,

On 5/30/24 14:18, Frédéric Nass wrote:
> ceph config set mgr mgr/cephadm/container_image_nvmeof "quay.io/ceph/nvmeof:1.2.13"

Thanks for the hint. With that the orchestrator deploys the current container image. But: it suddenly listens on port 5499 instead of 5500, and:

# podman run -it quay.io/ceph/nvmeof-cli:latest --server-address 10.128.8.29 --server-port 5500 subsystem add --subsystem nqn.2016-06.io.spdk:cephtest29
Failure adding subsystem nqn.2016-06.io.spdk:cephtest29: <_InactiveRpcError of RPC that terminated with:
  status = StatusCode.UNAVAILABLE
  details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:10.128.8.29:5500: Failed to connect to remote host: Connection refused"
  debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:10.128.8.29:5500: Failed to connect to remote host: Connection refused {grpc_status:14, created_time:"2024-05-30T13:59:33.24226686+00:00"}"

# podman run -it quay.io/ceph/nvmeof-cli:latest --server-address 10.128.8.29 --server-port 5499 subsystem add --subsystem nqn.2016-06.io.spdk:cephtest29
Failure adding subsystem nqn.2016-06.io.spdk:cephtest29: <_InactiveRpcError of RPC that terminated with:
  status = StatusCode.UNIMPLEMENTED
  details = "Method not found!"
  debug_error_string = "UNKNOWN:Error received from peer ipv4:10.128.8.29:5499 {created_time:"2024-05-30T13:59:49.678809906+00:00", grpc_status:12, grpc_message:"Method not found!"}"

Is this not production ready? Why is it in the documentation for a released Ceph version?

Regards
--
Robert Sander
[ceph-users] Re: How to setup NVMeoF?
Hi,

On 5/30/24 11:58, Robert Sander wrote:
> I am trying to follow the documentation at https://docs.ceph.com/en/reef/rbd/nvmeof-target-configure/ to deploy an NVMe over Fabrics service.

It looks like the cephadm orchestrator in this 18.2.2 cluster uses the image quay.io/ceph/nvmeof:0.0.2, which is 9 months old.

When I try to redeploy the daemon with the latest image

ceph orch daemon redeploy nvmeof.nvme01.cephtest29.gookea --image quay.io/ceph/nvmeof:latest

it tells me:

Error EINVAL: Cannot redeploy nvmeof.nvme01.cephtest29.gookea with a new image: Supported types are: mgr, mon, crash, osd, mds, rgw, rbd-mirror, cephfs-mirror, ceph-exporter, iscsi, nfs

How do I set the container image for this service?

ceph config set nvmeof container_image quay.io/ceph/nvmeof:latest

does not work:

Error EINVAL: unrecognized config target 'nvmeof'

Regards
--
Robert Sander
[ceph-users] How to setup NVMeoF?
Hi,

I am trying to follow the documentation at https://docs.ceph.com/en/reef/rbd/nvmeof-target-configure/ to deploy an NVMe over Fabrics service.

Step 2b of the configuration section is currently the showstopper. First the command says:

error: the following arguments are required: --host-name/-t

Then it tells me (after adding --host-name):

error: unrecognized arguments: --gateway-name XXX

and when I remove --gateway-name the error is:

both gateway_name and traddr or neither must be specified

So I am stuck in a kind of a loop here. Is there a working description for NVMe over TCP available?

Regards
--
Robert Sander
[ceph-users] Re: Rebalance OSDs after adding disks?
On 5/30/24 08:53, tpDev Tester wrote:
> Can someone please point me to the docs how I can expand the capacity of the pool without such problems.

Please show the output of:

ceph status
ceph df
ceph osd df tree
ceph osd crush rule dump
ceph osd pool ls detail

Regards
--
Robert Sander
[ceph-users] Re: We are using ceph octopus environment. For client can we use ceph quincy?
On 5/27/24 09:28, s.dhivagar@gmail.com wrote:
> We are using ceph octopus environment. For client can we use ceph quincy?

Yes.

--
Robert Sander
[ceph-users] Re: cephadm basic questions: image config, OS reimages
On 5/16/24 17:50, Robert Sander wrote:
> cephadm osd activate HOST would re-activate the OSDs.

Small but important typo: it's

ceph cephadm osd activate HOST

Regards
--
Robert Sander
[ceph-users] Re: cephadm basic questions: image config, OS reimages
Hi,

On 5/16/24 17:44, Matthew Vernon wrote:
> cephadm --image docker-registry.wikimedia.org/ceph shell
>
> ...but is there a good way to arrange for cephadm to use the already-downloaded image without having to remember to specify --image each time?

You could create a shell alias:

alias cephshell="cephadm --image docker-registry.wikimedia.org/ceph shell"

> * OS reimages
>
> We do OS upgrades by reimaging the server (which doesn't touch the storage disks); on an old-style deployment you could then use ceph-volume to re-start the OSDs and away you went; how does one do this in a cephadm cluster? [I presume it involves telling cephadm to download a new image for podman to use and suchlike]

cephadm osd activate HOST would re-activate the OSDs.

Before doing maintenance on a host run

ceph orch host maintenance enter HOST

and the orchestrator will stop the OSDs, set them to noout and try to move other services away from the host if possible.

Regards
--
Robert Sander
[ceph-users] Re: MDS crash in interval_set: FAILED ceph_assert(p->first <= start)
On 5/9/24 07:22, Xiubo Li wrote:
> We are discussing the same issue in the Slack thread https://ceph-storage.slack.com/archives/C04LVQMHM9B/p1715189877518529.

Why is there a discussion about a bug off-list on a proprietary platform?

Regards
--
Robert Sander
[ceph-users] Re: MDS 17.2.7 crashes at rejoin
Hi,

would an update to 18.2 help?

Regards
--
Robert Sander
[ceph-users] MDS 17.2.7 crashes at rejoin
7f1921e91700 0 mds.0.cache failed to open ino 0x40a58db err -22/-22
-22> 2024-05-06T16:07:15.534+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a58f4 err -22/-22
-21> 2024-05-06T16:07:15.534+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a5fc4 err -22/-22
-20> 2024-05-06T16:07:15.534+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a634d err -22/-22
-19> 2024-05-06T16:07:15.534+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a63bf err -22/-22
-18> 2024-05-06T16:07:15.542+ 7f1921e91700 0 mds.0.cache failed to open ino 0x200012bc94c err -22/0
-17> 2024-05-06T16:07:15.542+ 7f1921e91700 0 mds.0.cache failed to open ino 0x200012bc980 err -22/-22
-16> 2024-05-06T16:07:15.546+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a58cf err -22/-22
-15> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a58db err -22/-22
-14> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a58f4 err -22/-22
-13> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a5fc4 err -22/-22
-12> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a634d err -22/-22
-11> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x40a63bf err -22/-22
-10> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x200012bfd9c err -22/-22
-9> 2024-05-06T16:07:15.550+ 7f1921e91700 0 mds.0.cache failed to open ino 0x200012bfb78 err -22/-22
-8> 2024-05-06T16:07:15.554+ 7f1921e91700 0 mds.0.cache failed to open ino 0x35b0274 err -22/0
-7> 2024-05-06T16:07:15.554+ 7f1921e91700 0 mds.0.cache failed to open ino 0x35b2e32 err -22/-22
-6> 2024-05-06T16:07:15.554+ 7f1921e91700 0 mds.0.cache failed to open ino 0x35b388d err -22/-22
-5> 2024-05-06T16:07:15.562+ 7f1921e91700 0 mds.0.cache failed to open ino 0x35b0274 err -22/0
-4> 2024-05-06T16:07:15.562+ 7f1921e91700 0 mds.0.cache failed to open ino 0x35b2e32 err -22/-22
-3> 2024-05-06T16:07:15.562+ 7f1921e91700 0 mds.0.cache failed to open ino 0x35b388d err -22/-22
-2> 2024-05-06T16:07:15.562+ 7f1921e91700 0 mds.0.cache failed to open ino 0x4d5a226 err -22/-22
-1> 2024-05-06T16:07:15.634+ 7f1921e91700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/mds/MDCache.cc: In function 'void MDCache::rejoin_send_rejoins()' thread 7f1921e91700 time 2024-05-06T16:07:15.635683+
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.2.7/rpm/el8/BUILD/ceph-17.2.7/src/mds/MDCache.cc: 4086: FAILED ceph_assert(auth >= 0)

ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f1930ad94a3]
2: /usr/lib64/ceph/libceph-common.so.2(+0x269669) [0x7f1930ad9669]
3: (MDCache::rejoin_send_rejoins()+0x216b) [0x5614ac8747eb]
4: (MDCache::process_imported_caps()+0x1993) [0x5614ac872353]
5: (Context::complete(int)+0xd) [0x5614ac6e182d]
6: (MDSContext::complete(int)+0x5f) [0x5614aca41f4f]
7: (void finish_contexts > >(ceph::common::CephContext*, std::vector >&, int)+0x8d) [0x5614ac6e6f5d]
8: (OpenFileTable::_open_ino_finish(inodeno_t, int)+0x156) [0x5614aca765a6]
9: (MDSContext::complete(int)+0x5f) [0x5614aca41f4f]
10: (void finish_contexts > >(ceph::common::CephContext*, std::vector >&, int)+0x8d) [0x5614ac6e6f5d]
11: (MDCache::open_ino_finish(inodeno_t, MDCache::open_ino_info_t&, int)+0x138) [0x5614ac867168]
12: (MDCache::_open_ino_backtrace_fetched(inodeno_t, ceph::buffer::v15_2_0::list&, int)+0x290) [0x5614ac87ff90]
13: (MDSContext::complete(int)+0x5f) [0x5614aca41f4f]
14: (MDSIOContextBase::complete(int)+0x534) [0x5614aca426e4]
15: (Finisher::finisher_thread_entry()+0x18d) [0x7f1930b7884d]
16: /lib64/libpthread.so.0(+0x81ca) [0x7f192fac81ca]
17: clone()

How do we solve this issue?
Regards
--
Robert Sander
[ceph-users] Re: Remove failed OSD
Hi,

On 04.05.24 07:29, Zakhar Kirpichenko wrote:
> If I try to remove the OSD again, I get an error:
>
> # ceph orch daemon rm osd.19 --force
> Error EINVAL: Unable to find daemon(s) ['osd.19']

The command to remove an OSD in a cephadm-orchestrated cluster is

ceph orch osd rm 19

as per https://docs.ceph.com/en/reef/cephadm/services/osd/#remove-an-osd

This will make sure that the OSD is not needed any more (data is drained etc.).

Regards
--
Robert Sander
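The removal can be watched while the data is drained; "19" is the OSD id from this thread.

```shell
# Schedule the OSD for removal (drains its PGs first):
ceph orch osd rm 19

# Check the drain/removal progress until the OSD is gone:
ceph orch osd rm status
```

Only once the drain has finished does the orchestrator stop the daemon and remove it from the CRUSH map.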
[ceph-users] Re: ceph recipe for nfs exports
On 4/29/24 17:21, Roberto Maggi @ Debian wrote:
> I can mount it correctly, but when I try to write or touch any file in it, it returns "Permission denied".

Works as expected. You have set squash to all, so every NFS client is mapped to nobody on the server. But only root is able to write to the CephFS at first.

Set squash to "no_root_squash" to be able to write as root to the NFS share. Create a directory and change its permissions to someone else.

Regards
--
Robert Sander
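Recreating the export without root squashing could look like this, reusing the cluster id, pseudo path and volume name from earlier in the thread:

```shell
# Recreate the export so that root on the client stays root on the server:
ceph nfs export create cephfs --cluster-id nfs-cephfs \
    --pseudo-path /mnt --fsname vol1 --squash no_root_squash
```

After mounting, root can create a directory, chown it to the intended user, and squashing can be re-enabled for regular operation.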
[ceph-users] Re: Ceph Squid released?
On 4/29/24 09:36, Alwin Antreich wrote:
> Who knows. I don't see any packages on download.ceph.com for Squid.

Ubuntu has them: https://packages.ubuntu.com/noble/ceph

Regards
--
Robert Sander
[ceph-users] Re: Ceph Squid released?
On 4/29/24 08:50, Alwin Antreich wrote:
> well it says it in the article.
>
>> The upcoming Squid release serves as a testament to how the Ceph project continues to deliver innovative features to users without compromising on quality.
>
> I believe it is more a statement of having new members and tiers and to sound the marketing drums a bit. :)

The Ubuntu 24.04 release notes also claim that this release comes with Ceph Squid: https://discourse.ubuntu.com/t/noble-numbat-release-notes/39890

Regards
--
Robert Sander
[ceph-users] Ceph Squid released?
Hi,

https://www.linuxfoundation.org/press/introducing-ceph-squid-the-future-of-storage-today

Does the LF know more than the mailing list?

Regards
--
Robert Sander
[ceph-users] Re: Add node-exporter using ceph orch
On 4/26/24 15:47, Vahideh Alinouri wrote:
> The result of this command shows one of the servers in the cluster, but I have node-exporter daemons on all servers.

The default service specification looks like this:

service_type: node-exporter
service_name: node-exporter
placement:
  host_pattern: '*'

If you apply this YAML code the orchestrator should deploy one node-exporter daemon to each host of the cluster.

Regards
--
Robert Sander
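Applying the spec above could look like this; the filename is a placeholder.

```shell
# Save the spec to a file and apply it:
ceph orch apply -i node-exporter.yaml

# Verify: the service should now list one daemon per cluster host.
ceph orch ls node-exporter
ceph orch ps --daemon-type node-exporter
```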
[ceph-users] Re: Add node-exporter using ceph orch
On 4/26/24 12:15, Vahideh Alinouri wrote:
> Hi guys,
>
> I have tried to add node-exporter to the new host in the ceph cluster by the command mentioned in the document:
>
> ceph orch apply node-exporter hostname

Usually a node-exporter daemon is deployed on all cluster hosts by the node-exporter service and its placement strategy.

What does your node-exporter service look like?

ceph orch ls node-exporter --export

Regards
--
Robert Sander
[ceph-users] Re: ceph recipe for nfs exports
Hi,

On 4/24/24 09:39, Roberto Maggi @ Debian wrote:
> ceph orch host add cephstage01 10.20.20.81 --labels _admin,mon,mgr,prometheus,grafana
> ceph orch host add cephstage02 10.20.20.82 --labels _admin,mon,mgr,prometheus,grafana
> ceph orch host add cephstage03 10.20.20.83 --labels _admin,mon,mgr,prometheus,grafana
> ceph orch host add cephstagedatanode01 10.20.20.84 --labels osd,nfs,prometheus
> ceph orch host add cephstagedatanode02 10.20.20.85 --labels osd,nfs,prometheus
> ceph orch host add cephstagedatanode03 10.20.20.86 --labels osd,nfs,prometheus
>
> --> network setup and daemons deploy
> ceph config set mon public_network 10.20.20.0/24,192.168.7.0/24
> ceph orch apply mon --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
> ceph orch apply mgr --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83"
> ceph orch apply prometheus --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"
> ceph orch apply grafana --placement="cephstage01:10.20.20.81,cephstage02:10.20.20.82,cephstage03:10.20.20.83,cephstagedatanode01:10.20.20.84,cephstagedatanode02:10.20.20.85,cephstagedatanode03:10.20.20.86"

Two remarks here:

- You are labeling all the hosts, but then you use a hostname-based placement strategy for the services. Why not use the labels for placing the services?
- Usually you only need one Prometheus, one Grafana and one Alertmanager in the cluster. There is no need to deploy these on every host.

> ceph nfs export create cephfs --cluster-id nfs-cephfs --pseudo-path /mnt --fsname vol1
> --> nfs mount
> mount -t nfs -o nfsvers=4.1,proto=tcp 192.168.7.80:/mnt /mnt/ceph
>
> is my recipe correct?

Apart from the remarks above it should get you a working NFS export.

> Although I can mount the export I can't write on it

You have not specified a value for --squash when creating the export. Your CephFS is empty and the root directory is only writable by the root user, but root gets "squashed" to nobody when using NFS.

> I can't understand how to use the sdc disks for journaling

When all your devices are SSDs you do not need "journaling" (which today means the RocksDB and WAL).

> I can't understand the concept of "pseudo path"

This is an NFSv4 concept. It allows mounting a virtual root of the NFS server and accessing all exports below it without having to mount each one separately.

Regards
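To make the export writable, the thread suggests either disabling root squashing or making the CephFS root writable for the squashed user. A minimal sketch, reusing the cluster/fs names from the recipe above (the `--squash` value is passed through to NFS-Ganesha; `no_root_squash` is one commonly used setting):

```shell
# Re-create the export without root squashing:
ceph nfs export rm nfs-cephfs /mnt
ceph nfs export create cephfs --cluster-id nfs-cephfs --pseudo-path /mnt \
    --fsname vol1 --squash no_root_squash

# Alternative: keep squashing, but mount the CephFS directly once as root
# and open up the root directory (mount point /mnt/vol1 is an assumption):
chmod 1777 /mnt/vol1
```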
[ceph-users] Re: Have a problem with haproxy/keepalived/ganesha/docker
Hi,

On 16.04.24 10:49, Eugen Block wrote:
> I believe I can confirm your suspicion, I have a test cluster on Reef 18.2.1 and deployed nfs without HAProxy but with keepalived [1]. Stopping the active NFS daemon doesn't trigger anything, the MGR notices that it's stopped at some point, but nothing else seems to happen.

There is currently no failover for NFS. The ingress service (haproxy + keepalived) that cephadm deploys for an NFS cluster does not have a health check configured, so haproxy does not notice if a backend NFS server dies. This does not matter much, as there is no failover anyway and the NFS client cannot be "load balanced" to another backend NFS server. Without failover there is currently no benefit in configuring an ingress service: the NFS clients have to remount the NFS share when their current NFS server dies in any case.

Regards
[ceph-users] Re: Call for Interest: Managed SMB Protocol Support
Hi,

On 3/22/24 19:56, Alexander E. Patrakov wrote:
> In fact, I am quite skeptical, because, at least in my experience, every customer's SAMBA configuration as a domain member is a unique snowflake, and cephadm would need an ability to specify arbitrary UID mapping configuration to match what the customer uses elsewhere - and the match must be precise.

Yes, great flexibility has to be possible in the configuration of the SMB service.

BTW: It would be great if the orchestrator could configure Ganesha to export NFS shares with Kerberos security, but this is off-topic in this thread.

> Oh, and by the way, we have this strangely low-numbered group that everybody gets wrong unless they set "idmap config CORP : range = 500-99".

This is because Debian changed the standard minimum uid/gid somewhere in the 2000s. If you have an "old" company that has been running Debian since before then, you have user IDs and group IDs in the range 500 - 1000.

Regards
[ceph-users] Re: Upgrading from Reef v18.2.1 to v18.2.2
Hi,

On 3/21/24 14:50, Michael Worsham wrote:
> Now that Reef v18.2.2 has come out, is there a set of instructions on how to upgrade to the latest version using Cephadm?

Yes, there is: https://docs.ceph.com/en/reef/cephadm/upgrade/

Regards
[ceph-users] Re: OSD does not die when disk has failures
Hi,

On 3/19/24 13:00, Igor Fedotov wrote:
> translating EIO to upper layers rather than crashing an OSD is a valid default behavior. One can alter this by setting bluestore_fail_eio parameter to true.

What benefit lies in this behavior when in the end client IO stalls?

Regards
[ceph-users] Re: PGs with status active+clean+laggy
Hi,

On 3/5/24 13:05, ricardom...@soujmv.com wrote:
> I have a ceph quincy cluster with 5 nodes currently. But only 3 with SSDs.

Do not mix HDDs and SSDs in the same pool.

Regards
[ceph-users] Re: Uninstall ceph rgw
On 3/5/24 11:05, Albert Shih wrote:
> But I like to clean up and «erase» everything about rgw ? not only to try to understand but also because I think I mixted up between realm and zonegroup...

Remove the service with "ceph orch rm …" and then remove all the pools the rgw service has created. They usually have "rgw" in their name.

Regards
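The cleanup described above can be sketched as follows (the service name `rgw.default` is an assumption; check `ceph orch ls` for the real one, and double-check the pool list before deleting anything):

```shell
# Remove the RGW service managed by the orchestrator:
ceph orch rm rgw.default

# List the pools RGW created; they usually contain "rgw" in the name:
ceph osd pool ls | grep rgw

# Pool deletion is guarded by a config option:
ceph config set mon mon_allow_pool_delete true

# Delete each RGW pool (the pool name must be given twice):
ceph osd pool delete .rgw.root .rgw.root --yes-i-really-really-mean-it
```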
[ceph-users] Re: Upgraded 16.2.14 to 16.2.15
Hi,

On 3/5/24 08:57, Eugen Block wrote:
> extra_entrypoint_args:
>   - '--mon-rocksdb-options=write_buffer_size=33554432,compression=kLZ4Compression,level_compaction_dynamic_level_bytes=true,bottommost_compression=kLZ4HCCompression,max_background_jobs=4,max_subcompactions=2'

When I try this on my test cluster with Reef 18.2.1 the orchestrator tells me:

# ceph orch apply -i mon.yml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword argument 'extra_entrypoint_args'

It's a documented feature: https://docs.ceph.com/en/reef/cephadm/services/#cephadm-extra-entrypoint-args

Regards
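For reference, the documented usage embeds the arguments in a full service specification, roughly like the sketch below (placement by label is an assumption; the error above suggests the running mgr version simply does not know the field yet):

```yaml
service_type: mon
service_name: mon
placement:
  label: mon
extra_entrypoint_args:
  - '--mon-rocksdb-options=write_buffer_size=33554432,compression=kLZ4Compression,level_compaction_dynamic_level_bytes=true,bottommost_compression=kLZ4HCCompression,max_background_jobs=4,max_subcompactions=2'
```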
[ceph-users] Re: Cephadm and Ceph.conf
On 2/26/24 15:24, Michael Worsham wrote:
> So how would I be able to put in configurations like this into it?
>
> [global]
> fsid = 46620486-b8a6-11ee-bf23-6510c4d9efa7
> mon_host = [v2:10.20.27.10:3300/0,v1:10.20.27.10:6789/0] [v2:10.20.27.11:3300/0,v1:10.20.27.11:6789/0]
> osd pool default size = 3
> osd pool default min size = 2
> osd pool default pg num = 256
> osd pool default pgp num = 256
> mon_max_pg_per_osd = 800
> osd max pg per osd hard ratio = 10
> mon allow pool delete = true
> auth cluster required = cephx
> auth service required = cephx
> auth client required = cephx
> ms_mon_client_mode = crc
>
> [client.radosgw.mon1]
> host = ceph-mon1
> log_file = /var/log/ceph/client.radosgw.mon1.log
> rgw_dns_name = ceph-mon1
> rgw_frontends = "beast port=80 num_threads=500"
> rgw_crypt_require_ssl = false

ceph config assimilate-conf may be of help here.

Regards
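A short sketch of how assimilate-conf is typically used (the output filename is an assumption; options that cannot be stored in the cluster configuration database are written back to it):

```shell
# Import settings from a legacy ceph.conf into the config database;
# anything not assimilated ends up in leftover.conf:
ceph config assimilate-conf -i /etc/ceph/ceph.conf -o leftover.conf

# Verify what the cluster now stores:
ceph config dump
```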
[ceph-users] Re: Cephadm and Ceph.conf
On 2/26/24 14:24, Michael Worsham wrote:
> I deployed a Ceph reef cluster using cephadm. When it comes to the ceph.conf file, which file should I be editing for making changes to the cluster - the one running under the docker container or the local one on the Ceph monitors?

Neither of the two. You can adjust settings with "ceph config" or the Configuration tab of the Dashboard.

Regards
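For example, settings are read and written through the central configuration database like this (option names here are just common examples):

```shell
# Set an option cluster-wide for a daemon type:
ceph config set mon mon_allow_pool_delete true

# Read a single option back:
ceph config get mon mon_allow_pool_delete

# Show everything stored in the config database:
ceph config dump
```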
[ceph-users] Re: Some questions about cephadm
Hi,

On 2/26/24 13:22, wodel youchi wrote:
> No didn't work, the bootstrap is still downloading the images from quay.

For the image locations of the monitoring stack you have to create an initial ceph.conf, as mentioned in the chapter you referred to earlier: https://docs.ceph.com/en/reef/cephadm/install/#deployment-in-an-isolated-environment

Regards
[ceph-users] Re: Some questions about cephadm
Hi,

On 26.02.24 11:08, wodel youchi wrote:
> Then I tried to deploy using this command on the admin node:
> cephadm --image 192.168.2.36:4000/ceph/ceph:v17 bootstrap --mon-ip 10.1.0.23 --cluster-network 10.2.0.0/16
> After the bootstrap I found that it still downloads the images from the internet, even the ceph image itself, I see two images, one from my registry, the second from quay.

To quote the docs: you can run cephadm bootstrap -h to see all of cephadm's available options. These options are available:

--registry-url REGISTRY_URL           url for custom registry
--registry-username REGISTRY_USERNAME username for custom registry
--registry-password REGISTRY_PASSWORD password for custom registry
--registry-json REGISTRY_JSON         json file with custom registry login info (URL, Username, Password)

Regards
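Putting the two together, a bootstrap against the local registry might look like this sketch (the registry credentials are placeholders; the image and IPs are taken from the command quoted above):

```shell
cephadm --image 192.168.2.36:4000/ceph/ceph:v17 bootstrap \
    --mon-ip 10.1.0.23 \
    --cluster-network 10.2.0.0/16 \
    --registry-url 192.168.2.36:4000 \
    --registry-username myuser \
    --registry-password mysecret
```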
[ceph-users] Re: Understanding subvolumes
On 01.02.24 00:20, Matthew Melendy wrote:
> In our department we're getting started with Ceph 'reef', using the Ceph FUSE client for our Ubuntu workstations. So far so good, except I can't quite figure out one aspect of subvolumes.

AFAIK subvolumes were introduced to be used with Kubernetes and other cloud technologies. If you run a classical file service on top of CephFS you usually do not need subvolumes and can go with normal quotas on directories.

Regards
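Directory quotas on CephFS are set via extended attributes on a mounted directory, e.g. (the mount point and limits are examples; the client needs caps that allow setting quotas):

```shell
# Limit a directory tree to 100 GB and 10000 files:
setfattr -n ceph.quota.max_bytes -v 100000000000 /mnt/cephfs/projects
setfattr -n ceph.quota.max_files -v 10000 /mnt/cephfs/projects

# Inspect the current limit:
getfattr -n ceph.quota.max_bytes /mnt/cephfs/projects
```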
[ceph-users] Re: Questions about the CRUSH details
On 1/25/24 13:32, Janne Johansson wrote:
> It doesn't take OSD usage into consideration except at creation time or OSD in/out/reweighing (or manual displacements with upmap and so forth), so this is why "ceph df" will tell you a pool has X free space, where X is "smallest free space on the OSDs on which this pool lies, times the number of OSDs".

Given the pseudorandom placement of objects to PGs, there is nothing to prevent you from having the worst luck ever and all the objects you create ending up on the OSD with the least free space. This is why you need a decent amount of PGs, to not run into statistical edge cases.

Regards
[ceph-users] Re: How many pool for cephfs
Hi,

On 1/24/24 10:08, Albert Shih wrote:
> 99.99% because I'm newbie with ceph and don't understand clearly how the autorisation work with cephfs ;-)

I strongly recommend that you engage an experienced Ceph consultant to help you design and set up your storage cluster. It looks like you are trying to make design decisions that will heavily influence the performance of the system.

> If I say 20-30 it's because I currently have on my classic ZFS/NFS server around 25 «datasets» exported to various server.

The next question is how the "consumers" will access the filesystem: via NFS or mounted directly. Even with the second option you can separate client access via CephX keys, as David already wrote.

> Ok. I got for my ceph cluster two set of servers, first set are for services (mgr,mon,etc.) with ssd and don't currently run any osd (but still have 2 ssd not used), I also got a second set of server with HDD and 2 SSD. The data pool will be on the second set (with HDD). Where should I run the MDS and on which osd ?

Do you intend to use the Ceph cluster only for archival storage? How large is your second set of Ceph nodes, and how many HDDs are in each? Do you intend to use the SSDs for the OSDs' RocksDB? Where do you plan to store the metadata pools for CephFS? They should be stored on fast media.

Regards
[ceph-users] Re: How many pool for cephfs
Hi,

On 1/24/24 09:40, Albert Shih wrote:
> Knowing I got two class of osd (hdd and ssd), and I have a need of ~ 20/30 cephfs (currently and that number will increase with time).

Why do you need 20 - 30 separate CephFS instances?

> and put all my cephfs inside two of them. Or should I create for each cephfs a couple of pool metadata/data ?

Each CephFS instance needs its own pools, at least two (data + metadata) per instance. And each CephFS needs at least one MDS running, better with an additional cold or even hot standby MDS.

> Il will also need to have ceph S3 storage, same question, should I have a designated pool for S3 storage or can/should I use the same cephfs_data_replicated/erasure pool ?

No, S3 needs its own pools. It cannot re-use CephFS pools.

Regards
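Creating one such instance looks roughly like this (the filesystem and pool names are placeholders; with the orchestrator, `ceph fs volume create` makes the two pools and deploys an MDS for you):

```shell
# Orchestrator-managed one-liner:
ceph fs volume create tank

# Or the manual equivalent -- two pools plus the filesystem:
ceph osd pool create tank_metadata
ceph osd pool create tank_data
ceph fs new tank tank_metadata tank_data
```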
[ceph-users] Re: Cephadm orchestrator and special label _admin in 17.2.7
Hi,

more strange behaviour: When I issue "ceph mgr fail" a backup MGR takes over and updates all config files on all hosts, including /etc/ceph/ceph.conf. At first I thought that this was the solution, but when I now remove the _admin label and add it again, the new MGR also does not update /etc/ceph/ceph.conf. Only when I again do "ceph mgr fail" will the new MGR update /etc/ceph/ceph.conf on the hosts labeled with _admin.

Regards
[ceph-users] Re: Cephadm orchestrator and special label _admin in 17.2.7
On 1/18/24 14:28, Eugen Block wrote:
> Is your admin keyring under management?

There is no issue with the admin keyring, only with ceph.conf. The config setting "mgr mgr/cephadm/manage_etc_ceph_ceph_conf" is set to true and "mgr mgr/cephadm/manage_etc_ceph_ceph_conf_hosts" was at "*", i.e. all hosts. I have set that to "label:_admin". It still does not put ceph.conf into /etc/ceph when adding the label _admin.

Regards
[ceph-users] Re: Cephadm orchestrator and special label _admin in 17.2.7
Hi,

On 1/18/24 14:07, Eugen Block wrote:
> I just tried that in my test cluster, removed the ceph.conf and admin keyring from /etc/ceph and then added the _admin label to the host via 'ceph orch' and both were created immediately.

This is strange, I only get this:

2024-01-18T11:47:07.870079+0100 mgr.cephtest32.ybltym [INF] Added label _admin to host cephtest23
2024-01-18T11:47:07.878786+0100 mgr.cephtest32.ybltym [INF] Updating cephtest23:/var/lib/ceph/ba37db20-2b13-11eb-b8a9-871ba11409f6/config/ceph.conf
2024-01-18T11:47:08.045347+0100 mgr.cephtest32.ybltym [INF] Updating cephtest23:/etc/ceph/ceph.client.admin.keyring
2024-01-18T11:47:08.212303+0100 mgr.cephtest32.ybltym [INF] Updating cephtest23:/var/lib/ceph/ba37db20-2b13-11eb-b8a9-871ba11409f6/config/ceph.client.admin.keyring

Regards
[ceph-users] Cephadm orchestrator and special label _admin in 17.2.7
Hi,

According to the documentation¹ the special host label _admin instructs the cephadm orchestrator to place a valid ceph.conf and the ceph.client.admin.keyring into /etc/ceph of the host. I noticed that (at least) on 17.2.7 only the keyring file is placed in /etc/ceph, but not ceph.conf. Both files are placed into the /var/lib/ceph//config directory. Has something changed?

¹: https://docs.ceph.com/en/quincy/cephadm/host-management/#special-host-labels

Regards
[ceph-users] Re: cephadm bootstrap on 3 network clusters
Hi Luis,

On 1/3/24 16:12, Luis Domingues wrote:
> My issue is that mon1 cannot connect via SSH to itself using pub network, and bootstrap fail at the end when cephadm tries to add mon1 to the list of hosts.

Why? The public network should not have any restrictions between the Ceph nodes. Same with the cluster network.

Regards
[ceph-users] Re: cephadm bootstrap on 3 network clusters
Hi,

On 1/3/24 14:51, Luis Domingues wrote:
> But when I bootstrap my cluster, I set my MON IP and CLUSTER NETWORK, and then the bootstrap process tries to add my bootstrap node using the MON IP.

IMHO the bootstrap process has to run directly on the first node. The MON IP is local to this node. It is used to determine the public network.

Regards
[ceph-users] Re: Ceph Docs: active releases outdated
Hi Eugen,

the release info is current only in the latest branch of the documentation: https://docs.ceph.com/en/latest/releases/

Regards
[ceph-users] Re: Building new cluster had a couple of questions
On 22.12.23 11:46, Marc wrote:
> Does podman have this still, what dockers has. That if you kill the docker daemon all tasks are killed?

Podman does not come with a daemon to start containers. The Ceph orchestrator creates systemd units to start the daemons in podman containers.

Regards
[ceph-users] Re: Building new cluster had a couple of questions
Hi,

On 22.12.23 11:41, Albert Shih wrote:
> for n in 1-100
>   Put off line osd on server n
>   Uninstall docker on server n
>   Install podman on server n
>   redeploy on server n
> end

Yep, that's basically the procedure. But first try it on a test cluster.

Regards
[ceph-users] Re: Building new cluster had a couple of questions
On 21.12.23 22:27, Anthony D'Atri wrote:
> It's been claimed to me that almost nobody uses podman in production, but I have no empirical data.

I have even converted clusters from Docker to podman while they stayed online, thanks to "ceph orch redeploy".

Regards
[ceph-users] Re: Building new cluster had a couple of questions
Hi,

On 21.12.23 15:13, Nico Schottelius wrote:
> I would strongly recommend k8s+rook for new clusters, also allows running Alpine Linux as the host OS.

Why would I want to learn Kubernetes before I can deploy a new Ceph cluster when I have no need for K8s at all?

Regards
[ceph-users] Re: Building new cluster had a couple of questions
Hi,

On 21.12.23 19:11, Albert Shih wrote:
> What is the advantage of podman vs docker ? (I mean not in general but for ceph).

Docker comes with the Docker daemon which, when it gets an update, has to be restarted and thereby restarts all containers. For a storage system that is not the best procedure. With podman, everything needed for the Ceph containers is provided without such a daemon.

Regards
[ceph-users] Re: Building new cluster had a couple of questions
Hi,

On 12/21/23 14:50, Drew Weaver wrote:
> #1 cephadm or ceph-ansible for management?

cephadm. The ceph-ansible project writes in its README: "NOTE: cephadm is the new official installer, you should consider migrating to cephadm." https://github.com/ceph/ceph-ansible

> #2 Since the whole... CentOS thing... what distro appears to be the most straightforward to use with Ceph? I was going to try and deploy it on Rocky 9.

Any distribution with a recent systemd, podman, LVM2 and time synchronization is viable. I prefer Debian; others prefer RPM-based distributions.

Regards
[ceph-users] Re: EC Profiles & DR
On 12/5/23 10:06, duluxoz wrote:
> I'm confused - doesn't k4 m2 mean that you can loose any 2 out of the 6 osds?

Yes, but OSDs are not a good failure zone. The host is the smallest failure zone that is practicable and safe against data loss.

Regards
[ceph-users] Re: EC Profiles & DR
On 12/5/23 10:01, duluxoz wrote:
> Thanks David, I knew I had something wrong :-) Just for my own edification: Why is k=2, m=1 not recommended for production? Considered too "fragile", or something else?

It is the same as a replicated pool with size=2: only one host can go down; after that you risk losing data. Erasure coding is sensible with a cluster size of about 10 nodes or more. With smaller clusters you have to go with replicated pools.

Regards
[ceph-users] Re: ceph osd dump_historic_ops
On 12/1/23 10:33, Phong Tran Thanh wrote:
> ceph daemon osd.8 dump_historic_ops shows the error, the command is run on the node with osd.8:
> Can't get admin socket path: unable to get conf option admin_socket for osd: b"error parsing 'osd': expected string of the form TYPE.ID, valid types are: auth, mon, osd, mds, mgr, client\n"
> I am running a ceph cluster, reef version, installed by cephadm

When the daemons run in containers managed by the cephadm orchestrator, the socket file has a different location and the command line tool ceph (run outside the container) does not find it automatically. You can run

# ceph daemon /var/run/ceph/$FSID/ceph-osd.$OSDID.asok dump_historic_ops

to use the socket outside the container. Or you enter the container with

# cephadm enter --name osd.$OSDID

and then execute

# ceph daemon osd.$OSDID dump_historic_ops

inside the container. $FSID is the UUID of the Ceph cluster, $OSDID is the OSD id.

Regards
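Putting the variables together for the OSD from the question (osd.8), a sketch of the outside-the-container variant:

```shell
# The cluster UUID can be queried from the cluster itself:
FSID=$(ceph fsid)
OSDID=8

# Talk to the admin socket of the containerized OSD from the host:
ceph daemon /var/run/ceph/$FSID/ceph-osd.$OSDID.asok dump_historic_ops
```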
[ceph-users] Re: Where is a simple getting started guide for a very basic cluster?
On 11/28/23 17:50, Leo28C wrote:
> Problem is I don't have the cephadm command on every node. Do I need it on all nodes or just one of them? I tried installing it via curl but my ceph version is 14.2.22 which is not on the archive anymore so the curl command returns a 404 error html file. How do I get cephadm for 14.2?

There is no cephadm for Ceph 14, as the orchestrator was first introduced in version 15. Why are you talking about version 14 now anyhow? As long as your nodes fulfill the requirements for cephadm you can install the latest version of Ceph.

PS: Please reply to the list.

Regards
[ceph-users] Re: Where is a simple getting started guide for a very basic cluster?
Hi,

have you read https://docs.ceph.com/en/reef/cephadm/install/ ?

Bootstrapping a new cluster should be as easy as

# cephadm bootstrap --mon-ip **

if the nodes fulfill the requirements:

- Python 3
- Systemd
- Podman or Docker for running containers
- Time synchronization (such as chrony or NTP)
- LVM2 for provisioning storage devices

Regards
[ceph-users] Re: ceph storage pool error
Hi,

On 11/7/23 12:35, necoe0...@gmail.com wrote:
> Ceph 3 clusters are running and the 3rd cluster gave an error, it is currently offline. I want to get all the remaining data in 2 clusters. Instead of fixing ceph, I just want to save the data. How can I access this data and connect to the pool? Can you help me? Clusters 1 and 2 are working. I want to view my data from them and then transfer them to another place. How can I do this? I have never used Ceph before.

Please send the output of:

ceph -s
ceph health detail
ceph osd df tree

Regards
[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe
Hi,

On 11/2/23 13:05, Mohamed LAMDAOUAR wrote:
> when I ran this command, I got this error (because the database of the osd was on the boot disk)

The RocksDB part of the OSD was on the failed SSD? Then the OSD is lost and cannot be recovered. The RocksDB contains the information where each object is stored on the OSD data partition, and without it nobody knows where each object is. The data is lost.

Regards
[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe
On 11/2/23 12:48, Mohamed LAMDAOUAR wrote: I reinstalled the OS on a new SSD disk. How can I rebuild my cluster with only one mon? If there is one MON still operating you can try to extract its monmap and remove all the other MONs from it with the monmaptool: https://docs.ceph.com/en/latest/man/8/monmaptool/ https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/#recovering-a-monitor-s-broken-monmap This way the remaining MON will be the only one in the map, will have quorum, and the cluster will work again. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
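The linked procedure, condensed into a command sketch: mon ID "a" is the surviving MON and "b" through "e" are the dead ones, all placeholders; on a containerized cluster these commands have to be run inside the mon's container with its data path.

```shell
# Stop the surviving mon first, then extract its current monmap:
ceph-mon -i a --extract-monmap /tmp/monmap

# Remove the unreachable MONs from the map (placeholder IDs):
monmaptool /tmp/monmap --rm b --rm c --rm d --rm e

# Inject the reduced map and start the mon again:
ceph-mon -i a --inject-monmap /tmp/monmap
```

After the mon is back up with quorum of one, add new MONs one by one to restore redundancy.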
[ceph-users] Re: Emergency, I lost 4 monitors but all osd disk are safe
Hi, On 11/2/23 11:28, Mohamed LAMDAOUAR wrote: I have 7 machines on the CEPH cluster, the ceph service runs in a docker container. Each machine has 4 HDDs of data (available) and 2 NVMe SSDs (bricked). During a reboot, the SSD bricked on 4 machines; the data is available on the HDDs but the NVMe is bricked and the system is not available. Is it possible to recover the data of the cluster (the data disks are all available)? You can try to recover the MON db from the OSDs, as they keep a copy of it: https://docs.ceph.com/en/reef/rados/troubleshooting/troubleshooting-mon/#monitor-store-failures Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
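A condensed sketch of the documented mon-store rebuild; see the linked troubleshooting page for the full procedure (keyring handling, copying the store between hosts). The paths are the defaults for non-containerized OSDs and the keyring path is a placeholder:

```shell
# Gather cluster map information from every OSD on this host:
ms=/root/mon-store
mkdir -p "$ms"
for osd in /var/lib/ceph/osd/ceph-*; do
  ceph-objectstore-tool --data-path "$osd" --no-mon-config \
      --op update-mon-db --mon-store-path "$ms"
done

# After collecting from all OSD hosts, rebuild the MON store
# (the keyring must contain the mon. and client.admin keys):
ceph-monstore-tool "$ms" rebuild -- --keyring /path/to/admin.keyring
```

Note that RGW metadata and some auth state cannot be fully recovered this way; the linked page lists the caveats.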
[ceph-users] Re: How to deal with increasing HDD sizes ? 1 OSD for 2 LVM-packed HDDs ?
On 10/18/23 09:25, Renaud Jean Christophe Miel wrote: Hi, Use case: * Ceph cluster with old nodes having 6TB HDDs * Add new node with new 12TB HDDs Is it supported/recommended to pack 2 6TB HDDs handled by 2 old OSDs into 1 12TB LVM disk handled by 1 new OSD ? The 12 TB HDD will get double the I/O of one of the 6 TB HDDs, but it will still only be able to handle about 120 IOPS. This makes the newer, larger HDDs a bottleneck when run in the same pool. If you are not planning to decommission the smaller HDDs it is recommended to use the larger ones in a separate pool for performance reasons. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: cephadm configuration in git
On 10/11/23 11:42, Kamil Madac wrote: Is it possible to do it with cephadm? Is it possible to have some config files in git and then apply same cluster configuration on multiple clusters? Or is this approach not aligned with cephadm and we should do it different way? You can export the service specifications with "ceph orch ls --export" and import the YAML file with "ceph orch apply -i …". This does not cover the hosts in the cluster. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
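A minimal sketch of tracking the exported specifications in git and replaying them elsewhere; the repository layout (cluster-a/) is an assumption for illustration:

```shell
# Export the current service specifications and version them:
ceph orch ls --export > cluster-a/services.yml
git add cluster-a/services.yml
git commit -m "Snapshot service specs for cluster A"

# Apply the same specifications to another cluster
# (run with that cluster's admin keyring/config):
ceph orch apply -i cluster-a/services.yml
```

As noted above, host membership is not part of these specs; hosts still need to be added per cluster with "ceph orch host add".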
[ceph-users] Re: Status of IPv4 / IPv6 dual stack?
On 9/18/23 11:19, Stefan Kooman wrote: IIRC, the "enable dual stack" PRs were more or less "accidentally" merged. So this looks like a big NO on dual-stack support for Ceph. I just need an answer, I do not need dual stack support. It would be nice if the documentation was a little bit clearer on this topic. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Status of IPv4 / IPv6 dual stack?
Hi, as the documentation sends mixed signals in https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode "Note Binding to IPv4 is enabled by default, so if you just add the option to bind to IPv6 you’ll actually put yourself into dual stack mode." and https://docs.ceph.com/en/latest/rados/configuration/msgr2/#address-formats "Note The ability to bind to multiple ports has paved the way for dual-stack IPv4 and IPv6 support. That said, dual-stack operation is not yet supported as of Quincy v17.2.0." just the quick questions: Is a dual stacked networking with IPv4 and IPv6 now supported or not? From which version on is it considered stable? Are OSDs now able to register themselves with two IP addresses in the cluster map? MONs too? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph services failing to start after OS upgrade
On 12.09.23 14:51, hansen.r...@live.com.au wrote: I have a ceph cluster running on my proxmox system and it all seemed to upgrade successfully however after the reboot my ceph-mon and my ceph-osd services are failing to start or are crashing by the looks of it. You should ask that question on the Proxmox forum at https://forum.proxmox.com/ as they distribute their own Ceph packages. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin http://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Zwangsangaben lt. §35a GmbHG: HRB 220009 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Separating Mons and OSDs in Ceph Cluster
Hi, On 9/9/23 09:34, Ramin Najjarbashi wrote: The primary goal is to deploy new Monitors on different servers without causing service interruptions or disruptions to data availability. Just do that. New MONs will be added to the mon map which will be distributed to all running components. All OSDs will immediately know about the new MONs. The same goes when removing an old MON. After that you have to update the ceph.conf on each host to make the change "reboot safe". No need to restart any other component including OSDs. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
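With the orchestrator the move can be sketched as below; the placement hostnames are placeholders. The last command produces a minimal ceph.conf with the new MON addresses, which is one way to make the change "reboot safe" on each host:

```shell
# Place the MON daemons on the new servers (placeholder hostnames);
# cephadm adds/removes MONs to match the placement:
ceph orch apply mon --placement="newmon1,newmon2,newmon3"

# Verify the mon map after the change:
ceph mon stat

# Generate a minimal ceph.conf containing the current MON addresses,
# to be distributed to all hosts:
ceph config generate-minimal-conf
```

Change the placement one host at a time if you want to keep quorum margins large during the transition.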
[ceph-users] Re: Status of diskprediction MGR module?
On 8/28/23 13:26, Konstantin Shalygin wrote: The module hasn't had new commits for more than two years. So diskprediction_local is unmaintained. Will it be removed? It looks like a nice feature but when you try to use it it's useless. I suggest using smartctl_exporter [1] for monitoring drive health I tried to deploy that with cephadm as a custom container. Follow-up questions: How do I tell cephadm that smartctl_exporter has to run in a privileged container as root with all the devices? How do I tell the cephadm managed Prometheus that it can scrape these new exporters? How do I add a dashboard in cephadm managed Grafana that shows the values from smartctl_exporter? Where do I get such a dashboard? How do I add alerts to the cephadm managed Alert-Manager? Where do I get useful alert definitions for smartctl_exporter metrics? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Status of diskprediction MGR module?
Hi, Several years ago the diskprediction module was added to the MGR collecting SMART data from the OSDs. There were local and cloud modes available claiming different accuracies. Now only the local mode remains. What is the current status of that MGR module (diskprediction_local)? We have a cluster where SMART data is available from the disks (tested with smartctl and visible in the Ceph dashboard), but even with an enabled diskprediction_local module no health and lifetime info is shown. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: cephadm orchestrator does not restart daemons [was: ceph orch upgrade stuck between 16.2.7 and 16.2.13]
On 8/16/23 12:10, Eugen Block wrote: I don't really have a good idea right now, but there was a thread [1] about ssh sessions that are not removed, maybe that could have such an impact? And if you crank up the debug level to 30, do you see anything else? It was something similar. There were leftover ceph-volume processes running on some of the OSD nodes. After killing them the cephadm orchestrator is now able to resume the upgrade. As we also restarted the MGR processes (with systemctl restart CONTAINER) there were no leftover SSH sessions. But the still running ceph-volume processes must have used a lock that blocked new cephadm commands. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] cephadm orchestrator does not restart daemons [was: ceph orch upgrade stuck between 16.2.7 and 16.2.13]
On 8/15/23 16:36, Adam King wrote: with the log to cluster level already on debug, if you do a "ceph mgr fail" what does cephadm log to the cluster before it reports sleeping? It should at least be doing something if it's responsive at all. Also, in "ceph orch ps" and "ceph orch device ls" are the REFRESHED columns reporting that they've refreshed the info recently (last 10 minutes for daemons, last 30 minutes for devices)? They have been refreshed very recently. The issue seems to be a bit larger than just the not working upgrade. We are now not even able to restart a daemon. When I issue the command # ceph orch daemon restart crash.cephmon01 these two lines show up in the cephadm log but nothing else happens: 2023-08-16T10:35:41.640027+0200 mgr.cephmon01 [INF] Schedule restart daemon crash.cephmon01 2023-08-16T10:35:41.640497+0200 mgr.cephmon01 [DBG] _kick_serve_loop The container for crash.cephmon01 does not get restarted. It looks like the service loop does not get executed. Can we see what jobs are in this queue and why they do not get executed? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: ceph orch upgrade stuck between 16.2.7 and 16.2.13
On 8/15/23 11:16, Curt wrote: Probably not the issue, but do all your osd servers have internet access? I've had a similar experience when one of our osd servers default gateway got changed, so it was just waiting to download and took a bit to timeout. Yes, all nodes can manually pull the image from quay.io. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: ceph orch upgrade stuck between 16.2.7 and 16.2.13
On 8/15/23 11:02, Eugen Block wrote: I guess I would start looking on the nodes where it failed to upgrade OSDs and check out the cephadm.log as well as syslog. Did you see progress messages in the mgr log for the successfully updated OSDs (or MON/MGR)? The issue is that there is no information on which OSD cephadm tries to upgrade next. There is no failure reported. It seems to just sit there and wait for something. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] ceph orch upgrade stuck between 16.2.7 and 16.2.13
Hi, A healthy 16.2.7 cluster should get an upgrade to 16.2.13. ceph orch upgrade start --ceph-version 16.2.13 did upgrade MONs, MGRs and 25% of the OSDs and is now stuck. We tried several "ceph orch upgrade stop" and starts again. We "failed" the active MGR but no progress. We set the debug logging with "ceph config set mgr mgr/cephadm/log_to_cluster_level debug" but it only tells that it starts: 2023-08-15T09:05:58.548896+0200 mgr.cephmon01 [INF] Upgrade: Started with target quay.io/ceph/ceph:v16.2.13 How can we check what is happening (or not happening) here? How do we get cephadm to complete the task? Current status is: # ceph orch upgrade status { "target_image": "quay.io/ceph/ceph:v16.2.13", "in_progress": true, "which": "Upgrading all daemon types on all hosts", "services_complete": [], "progress": "", "message": "", "is_paused": false } # ceph -s cluster: id: 3098199a-c7f5-4baf-901c-f178131be6f4 health: HEALTH_WARN There are daemons running an older version of ceph services: mon: 5 daemons, quorum cephmon02,cephmon01,cephmon03,cephmon04,cephmon05 (age 4d) mgr: cephmon03(active, since 8d), standbys: cephmon01, cephmon02 mds: 2/2 daemons up, 1 standby, 2 hot standby osd: 202 osds: 202 up (since 11d), 202 in (since 13d) rgw: 2 daemons active (2 hosts, 1 zones) data: volumes: 2/2 healthy pools: 11 pools, 4961 pgs objects: 98.84M objects, 347 TiB usage: 988 TiB used, 1.3 PiB / 2.3 PiB avail pgs: 4942 active+clean 19 active+clean+scrubbing+deep io: client: 89 MiB/s rd, 598 MiB/s wr, 25 op/s rd, 157 op/s wr progress: Upgrade to quay.io/ceph/ceph:v16.2.13 (0s) [] # ceph versions { "mon": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 5 }, "mgr": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 3 }, "osd": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 48, "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 154 }, "mds": { 
"ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 5 }, "rgw": { "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 2 }, "overall": { "ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)": 56, "ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)": 161 } } Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph 17.2.6 alert-manager receives error 500 from inactive MGR
On 7/27/23 13:27, Eugen Block wrote: [2] https://github.com/ceph/ceph/pull/47011 This PR implements the 204 HTTP code that I see in my test cluster. I wonder why in the same situation the other cluster returns a 500 here. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Ceph 17.2.6 alert-manager receives error 500 from inactive MGR
B] [a9b25e54-f1e1-42eb-90b2-af5aa22769cf] /api/prometheus_receiver Jul 25 09:25:27 mgr002 ceph-mgr[1841]: [dashboard ERROR request] [b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request _id": "a9b25e54-f1e1-42eb-90b2-af5aa22769cf"} '] Jul 25 09:25:27 mgr002 ceph-mgr[1841]: [dashboard INFO request] [:::10.54.226.222:49904] [POST] [500] [0.002s] [513.0B] [a9b25e54-f1e1-42eb-90b2-af5aa22769cf] /api/prometheus_receiver Jul 25 09:25:28 mgr002 ceph-mgr[1841]: mgr handle_mgr_map Activating! Jul 25 09:25:28 mgr002 ceph-mgr[1841]: mgr handle_mgr_map I am now activating We have a test cluster running also with version 17.2.6 where this does not happen. In this test cluster the passive MGRs return an HTTP code 204 when the alert-manager tries to request /api/prometheus_receiver. What is happening here? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Per minor-version view on docs.ceph.com
Hi, On 7/12/23 05:44, Satoru Takeuchi wrote: I have a request about docs.ceph.com. Could you provide per minor-version views on docs.ceph.com? I would like to second that. Sometimes the behaviour of Ceph changes a lot between point releases. Unreliable documentation does not reflect well on the project. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Are replicas 4 or 6 safe during network partition? Will there be split-brain?
Hi, On 07.07.23 16:52, jcic...@cloudflare.com wrote: There are two sites, A and B. There are 5 mons, 2 in A, 3 in B. Looking at just one PG and 4 replicas, we have 2 replicas in site A and 2 replicas in site B. Site A holds the primary OSD for this PG. When a network split happens, I/O would still be working in site A since there are still 2 OSDs, even without mon quorum. The site without MON quorum will stop working completely. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin http://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Zwangsangaben lt. §35a GmbHG: HRB 220009 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: device class for nvme disk is ssd
On 6/28/23 14:03, Boris Behrens wrote: is it a problem that the device class for all my disks is SSD even all of these disks are NVME disks? If it is just a classification for ceph, so I can have pools on SSDs and NVMEs separated I don't care. But maybe ceph handles NVME disks differently internally? No. When creating the OSD ceph-volume looks at /sys/class/block/DEVICE/queue/rotational to determine if it's an HDD (file contains 1) or not (file contains 0). If you need to distinguish between SSD and NVMe you can manually assign another device class to the OSDs. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
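The check can be illustrated with a small shell sketch; the classify_device helper is illustrative, not part of ceph-volume:

```shell
# Simplified version of ceph-volume's rotational check: the kernel exposes
# a "rotational" flag per block device; 1 = spinning disk, 0 = flash.
classify_device() {
    # $1: path to a sysfs "rotational" file,
    # e.g. /sys/class/block/sda/queue/rotational
    if [ "$(cat "$1")" = "1" ]; then
        echo hdd
    else
        echo ssd
    fi
}

# Demonstrate with a stand-in file instead of a real device:
tmp=$(mktemp)
echo 0 > "$tmp"
classify_device "$tmp"   # prints: ssd
```

To give NVMe OSDs their own class afterwards, the documented CRUSH commands are "ceph osd crush rm-device-class osd.1" followed by "ceph osd crush set-device-class nvme osd.1" (osd.1 being a placeholder).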
[ceph-users] Re: Ceph iSCSI GW not working with VMware VMFS and Windows Clustered Storage Volumes (CSV)
On 19.06.23 13:47, Work Ceph wrote: Recently, we had the need to add some VMWare clusters as clients for the iSCSI GW and also Windows systems with the use of Clustered Storage Volumes (CSV), and we are facing a weird situation. In windows for instance, the iSCSI block can be mounted, formatted and consumed by all nodes, but when we add in the CSV it fails with some generic exception. The same happens in VMWare, when we try to use it with VMFS it fails. The iSCSI target used does not support SCSI persistent group reservations when in multipath mode. https://docs.ceph.com/en/quincy/rbd/iscsi-initiators/ AFAIK VMware uses these in VMFS. Regards -- Robert Sander Heinlein Support GmbH Linux: Akademie - Support - Hosting http://www.heinlein-support.de Tel: 030-405051-43 Fax: 030-405051-19 Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Cluster without messenger v1, new MON still binds to port 6789
Hi, a cluster has ms_bind_msgr1 set to false in the config database. Newly created MONs still listen on port 6789 and add themselves as providing messenger v1 into the monmap. How do I change that? Shouldn't the MONs use the config for ms_bind_msgr1? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Encryption per user Howto
On 5/26/23 12:26, Frank Schilder wrote: It may very well not serve any other purpose, but these are requests we get. If I could provide an encryption key to a ceph-fs kernel at mount time, this requirement could be solved very elegantly on a per-user (request) basis and only making users who want it pay with performance penalties. I understand this use case. But this would still mean that the client encrypts the data. In your case the CephFS mount or with S3 the rados-gateway. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Encryption per user Howto
On 23.05.23 08:42, huxia...@horebdata.cn wrote: Indeed, the question is on server-side encryption with keys managed by ceph on a per-user basis What kind of security do you want to achieve with encryption keys stored on the server side? Regards -- Robert Sander Heinlein Support GmbH Linux: Akademie - Support - Hosting http://www.heinlein-support.de Tel: 030-405051-43 Fax: 030-405051-19 Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: OSD_TOO_MANY_REPAIRS on random OSDs causing clients to hang
On 26.04.23 13:24, Thomas Hukkelberg wrote: [WRN] OSD_TOO_MANY_REPAIRS: Too many repaired reads on 1 OSDs osd.34 had 9936 reads repaired Are there any messages in the kernel log that indicate this device has read errors? Have you considered replacing the disk? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: How to replace an HDD in a OSD with shared SSD for DB/WAL
Hi, On 21.04.23 05:44, Tao LIU wrote: I built a Ceph cluster with cephadm. Every Ceph node has 4 OSDs. These 4 OSDs were built with 4 HDDs (block) and 1 SSD (DB). At present, one HDD is broken, and I am trying to replace the HDD and build the OSD with the new HDD and the free space of the SSD. I did the following: #ceph osd stop osd.23 #ceph osd out osd.23 #ceph osd crush remove osd.23 #ceph osd rm osd.23 #ceph orch daemon rm osd.23 --force #lvremove /dev/ceph-ae21e618-601e-4273-9185-99180edb8453/osd-block-96eda371-1a3f-4139-9123-24ec1ba362c4 #wipefs -af /dev/sda #lvremove /dev/ceph-e50203a6-8b8e-480f-965c-790e21515395/osd-db-70f7a032-cf2c-4964-b979-2b90f43f2216 #ceph orch daemon add osd compute11:data_devices=/dev/sda,db_devices=/dev/sdc,osds_per_device=1 The OSD can be built, but is always down. Is there anything that I missed during the build? Assuming /dev/ceph-UUID/osd-db-UUID is the logical volume for the old OSD you could have run this: ceph orch osd rm 23 replace the faulty HDD ceph orch daemon add osd compute11:data_devices=/dev/sda,db_devices=ceph-UUID/osd-db-UUID This will reuse the existing logical volume for the OSD DB. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services
On 18.04.23 06:12, Lokendra Rathour wrote: but if I try mounting from a normal Linux machine with connectivity enabled between Ceph mon nodes, it gives the error as stated before. Have you installed ceph-common on the "normal Linux machine"? Regards -- Robert Sander Heinlein Support GmbH Linux: Akademie - Support - Hosting http://www.heinlein-support.de Tel: 030-405051-43 Fax: 030-405051-19 Zwangsangaben lt. §35a GmbHG: HRB 93818 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: deploying Ceph using FQDN for MON / MDS Services
On 14.04.23 12:17, Lokendra Rathour wrote: *mount: /mnt/image: mount point does not exist.* Have you created the mount point? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin http://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Zwangsangaben lt. §35a GmbHG: HRB 220009 B / Amtsgericht Berlin-Charlottenburg, Geschäftsführer: Peer Heinlein -- Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Adding new server to existing ceph cluster - with separate block.db on NVME
On 29.03.23 01:09, Robert W. Eckert wrote: I did miss seeing the db_devices part. For ceph orch apply - that would have saved a lot of effort. Does the osds_per_device create the partitions on the db device? No, osds_per_device creates multiple OSDs on one data device, which can be useful for NVMe; do not use it on HDDs. The command automatically creates the number of db slots on the db_device based on how many data_devices you pass it. If you want more slots for the RocksDB then pass it the db_slots parameter. Also is there any way to disable --all-available-devices if it was turned on. The ceph orch apply osd --all-available-devices --unmanaged=true command doesn't seem to disable the behavior of adding new drives. You can set the service to unmanaged when exporting the specification. ceph orch ls osd --export > osd.yml Edit osd.yml and add "unmanaged: true" to the specification. After that ceph orch apply -i osd.yml Or you could just remove the specification with "ceph orch rm NAME". The OSD service will be removed but the OSD will remain. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
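The edited specification looks roughly like this; the service_id and the all-devices filter are placeholders for whatever "ceph orch ls osd --export" returns on the actual cluster:

```shell
# Hypothetical exported OSD spec with "unmanaged: true" added:
cat > osd.yml <<'EOF'
service_type: osd
service_id: all-available-devices
placement:
  host_pattern: '*'
unmanaged: true
spec:
  data_devices:
    all: true
EOF

# Re-apply the spec; cephadm stops adding new drives automatically:
ceph orch apply -i osd.yml
```

Existing OSDs keep running either way; unmanaged only stops the orchestrator from creating new ones.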
[ceph-users] Re: Adding new server to existing ceph cluster - with separate block.db on NVME
Hi, On 28.03.23 05:42, Robert W. Eckert wrote: I am trying to add a new server to an existing cluster, but cannot get the OSDs to create correctly When I try Cephadm ceph-volume lvm create, it returns nothing but the container info. You are running a containerized cluster with the cephadm orchestrator? Which version? Have you tried ceph orch daemon add osd host1:data_devices=/dev/sda,/dev/sdb,db_devices=/dev/nvme0 as shown on https://docs.ceph.com/en/quincy/cephadm/services/osd/ ? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph cluster out of balance after adding OSDs
On 27.03.23 23:13, Pat Vaughan wrote: Looking at the pools, there are 2 crush rules. Only one pool has a meaningful amount of data, the charlotte.rgw.buckets.data pool. This is the crush rule for that pool. So that pool uses the device class ssd explicitly where the other pools do not care about the device class. The autoscaler is not able to cope with this situation. charlotte.rgw.buckets.data is an erasure coded pool, correct? And the rule was created automatically when you created the erasure coding profile. You should create an erasure coding rule that does not care about the device class and assign it to the pool charlotte.rgw.buckets.data. After that the autoscaler will be able to work again. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph cluster out of balance after adding OSDs
On 27.03.23 16:34, Pat Vaughan wrote: Yes, all the OSDs are using the SSD device class. Do you have multiple CRUSH rules by chance? Are all pools using the same CRUSH rule? Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph cluster out of balance after adding OSDs
On 27.03.23 16:04, Pat Vaughan wrote: we looked at the number of PGs for that pool, and found that there was only 1 for the rgw.data and rgw.log pools, and "osd pool autoscale-status" doesn't return anything, so it looks like that hasn't been working. If you are in this situation, have a look at the crush rules of your pools. If the cluster has multiple device classes (hdd, ssd) then all pools need to use just one device class each. The autoscaler currently does not work when one pool uses just one device class and another pool uses the default crush rule and therefore multiple device classes. Regards -- Robert Sander Heinlein Consulting GmbH Schwedter Str. 8/9b, 10119 Berlin https://www.heinlein-support.de Tel: 030 / 405051-43 Fax: 030 / 405051-19 Amtsgericht Berlin-Charlottenburg - HRB 220009 B Geschäftsführer: Peer Heinlein - Sitz: Berlin ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io