[ceph-users] Ceph commands hang + no CephFS or RBD access

2022-12-01 Thread Neil Brown
Hi all, I have a Ceph 17.2.5 cluster deployed via cephadm. After a few reboots it has now entered a fairly broken state, as shown below. I am having trouble even beginning to diagnose this because many of the commands just hang. For example, “cephadm ps” and “ceph orch ls” hang forever. Other comm
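A minimal troubleshooting sketch, assuming a cephadm-managed host with root shell access (the time window is a placeholder): "ceph orch ..." commands go through the active mgr, so the first thing to check is whether a mgr is responding at all; cephadm itself can still inspect the local host even when the cluster is unreachable.

    ceph -s --connect-timeout 10    # fail fast instead of hanging if the mons are unreachable
    ceph mgr stat                   # show the active mgr, if any
    ceph mgr fail                   # force a mgr failover if the orchestrator module is stuck

    cephadm ls                      # list daemons deployed on this host, no cluster connection needed
    journalctl -u 'ceph-*' --since '-1h'   # recent logs for the ceph systemd units on this host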

[ceph-users] Re: MDS crashes to damaged metadata

2022-12-01 Thread Stolte, Felix
Had to reduce the debug level back to normal. Debug level 20 generated about 70 GB of logs in one hour. Of course there was no crash in that period. -
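For reference, a short sketch of how the debug level can be raised and dropped again centrally; this assumes the usual debug_mds / debug_ms settings rather than anything specific to Felix's setup:

    # very verbose, as noted above
    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1

    # back to the defaults once done
    ceph config rm mds debug_mds
    ceph config rm mds debug_ms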

[ceph-users] cephx server mgr.a: couldn't find entity name: mgr.a

2022-12-01 Thread Sagara Wijetunga
Hi, I'm trying to enable Cephx on a cluster already running without Cephx. Here is what I did. 1. I shut down the cluster. 2. Enabled Cephx in ceph.conf, Mon and Mgr. 3. Brought the Monitor cluster up. No issue. 4. Tried to bring the first Manager up and I'm getting the following error: === mgr.a === S
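For comparison, a rough sketch of what step 2 plus a key for the mgr usually looks like; the settings, caps and keyring path below are the common defaults for an mgr with id "a", not taken from Sagara's cluster:

    # ceph.conf, [global] section
    #   auth_cluster_required = cephx
    #   auth_service_required = cephx
    #   auth_client_required  = cephx

    # once cephx is on, the mgr needs its own key; create it where the admin key is available
    ceph auth get-or-create mgr.a mon 'allow profile mgr' osd 'allow *' mds 'allow *' \
        -o /var/lib/ceph/mgr/ceph-a/keyring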

[ceph-users] Re: Cache modes libvirt

2022-12-01 Thread Janne Johansson
> In the Ceph/Libvirt docs only the cache modes writethrough and writeback are discussed. > My clients' disks are all set to writeback in the libvirt client > xml-definition. > > For a backup operation, I notice a severe lag on one of my VMs. Such a > backup operation that takes 1 to 2 hours (on a same m
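As a quick sanity check, the effective cache mode can be read straight from the domain XML (the VM name below is a placeholder):

    virsh dumpxml myvm | grep -B1 -A1 'cache='
    # the mode sits in the <driver> element of each <disk>, e.g.
    #   <driver name='qemu' type='raw' cache='writeback' discard='unmap'/>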

[ceph-users] proxmox hyperconverged pg calculations in ceph pacific, pve 7.2

2022-12-01 Thread Rainer Krienke
Hello, I run a hyperconverged pve cluster (V7.2) with 11 nodes. Each node has 8 4TB disks. pve and ceph are installed and running. I wanted to create some ceph pools with 512 PGs each. Since I want to use erasure coding (5+3), when creating a pool one rbd pool for metadata and the data pool

[ceph-users] dashboard version of ceph versions shows N/A

2022-12-01 Thread Simon Oosthoek
Dear list, yesterday we updated our ceph cluster from 15.2.17 to 16.2.10 using packages. Our cluster is a mix of Ubuntu 18 and Ubuntu 20, with ceph coming from packages in the ceph.com repo. All went well and we now have all nodes running Pacific. However, there's something odd in the dashboar
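A quick way to cross-check from the CLI what the dashboard should be showing (standard commands, nothing specific to this cluster):

    ceph versions     # per-daemon-type version summary straight from the mons
    ceph orch status  # the dashboard pulls host/daemon details from the orchestrator,
                      # which a pure package-based install typically does not have configured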

[ceph-users] Troubleshooting tool for Rook based Ceph clusters

2022-12-01 Thread Subham Rai
Hi everyone, in Rook we have come up with a `kubectl`-based krew plugin named `rook-ceph`, which will help users with the most commonly faced issues and answer common questions users have, like whether the cluster has the right configuration or whether the required resources are running, and other things,
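For context, a minimal usage sketch via krew (assuming krew is already installed; subcommand names follow the plugin's README and may change as the tool evolves):

    kubectl krew install rook-ceph
    # run ceph commands against the Rook cluster in the current context
    kubectl rook-ceph ceph status
    kubectl rook-ceph ceph osd tree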

[ceph-users] Re: radosgw-octopus latest - NoSuchKey Error - some buckets lose their rados objects, but not the bucket index

2022-12-01 Thread J. Eric Ivancich
So it seems like a bucket still has objects listed in the bucket index but the underlying data objects are no longer there. Since you made reference to a customer, I’m guessing the customer does not have direct access to the cluster via `rados` commands, so there’s no chance that they could have
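A hedged sketch of how one might confirm that the index entries point at missing rados objects; the bucket name, pool name and object name are placeholders:

    # list the rados objects the bucket index expects to exist
    radosgw-admin bucket radoslist --bucket=mybucket > expected_objects.txt

    # spot-check whether those objects are actually in the data pool
    rados -p default.rgw.buckets.data stat <object-name-from-the-list>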

[ceph-users] Re: Tuning CephFS on NVME for HPC / IO500

2022-12-01 Thread Fox, Kevin M
If it's this: http://www.acmemicro.com/Product/17848/Kioxia-KCD6XLUL15T3---15-36TB-SSD-NVMe-2-5-inch-15mm-CD6-R-Series-SIE-PCIe-4-0-5500-MB-sec-Read-BiCS-FLASH-TLC-1-DWPD it's listed as 1 DWPD with a 5-year warranty, so it should be OK. Thanks, Kevin From: Rob

[ceph-users] Re: proxmox hyperconverged pg calculations in ceph pacific, pve 7.2

2022-12-01 Thread Eugen Block
Hi, you need to take the number of replicas into account as well. With 88 OSDs and the default max PGs per OSD of 250 you get the mentioned 22000 PGs (including replicas): 88 x 250 = 22000. With EC pools each chunk counts as one replica. So you should consider shrinking your pools or let aut
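To make the arithmetic concrete (the 88 OSDs and the 250 limit are from this thread; the per-pool breakdown is only illustrative):

    echo $(( 88 * 250 ))           # 22000 PG replicas allowed in total
    # an EC 5+3 data pool with pg_num 512 costs 512 * 8 = 4096 PG replicas,
    # a size-3 replicated metadata pool with pg_num 512 another 512 * 3 = 1536
    echo $(( 512 * 8 + 512 * 3 ))  # 5632, i.e. 64 additional PG replicas per OSD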

[ceph-users] Re: proxmox hyperconverged pg calculations in ceph pacific, pve 7.2

2022-12-01 Thread Anthony D'Atri
Arguably the error should say something like " TASK ERROR: error with 'osd pool create': mon_command failed - pg_num 512 size 8 would result in 251 cumulative PGs per OSD (22148 total PG replicas on 88 ‘in’ OSDs), which exceeds the mon_max_pg_per_osd value of 250. IMHO this is sort of mixing
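For completeness, the 251 is simply the 22148 PG replicas spread over the 88 'in' OSDs:

    echo $(( 22148 / 88 ))   # 251, just over mon_max_pg_per_osd = 250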

[ceph-users] Re: Tuning CephFS on NVME for HPC / IO500

2022-12-01 Thread Mark Nelson
Hi Manuel, I did the IO500 runs back in 2020 and wrote the cephfs aiori backend for IOR/mdtest.  Not sure about the segfault, it's been a while since I've touched that code.  It was working the last time I used it. :D  Having said that, I don't think that's your issue.   The userland backend
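For anyone trying to reproduce those runs, a rough invocation sketch of IOR with the userland CephFS backend; the option names follow the aiori-CEPHFS driver and may differ between IOR versions, and the paths and client name are placeholders:

    ior -a CEPHFS -w -r -t 4m -b 1g \
        --cephfs.user=admin \
        --cephfs.conf=/etc/ceph/ceph.conf \
        --cephfs.prefix=/ior-test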

[ceph-users] Re: OSD container won't boot up

2022-12-01 Thread J-P Methot
Update on this, I have figured out what happened. I had ceph packages installed on the node, since this was converted from ceph-deploy to cephadm when we tested Octopus. When I upgraded Ubuntu, it updated the packages and triggered a permission issue which is already documented. This whole mes
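If it is the same uid/gid mismatch others have hit (an assumption on my part: the distro packages create a local ceph user whose uid differs from the 167:167 used inside the cephadm containers, so ownership under /var/lib/ceph flips on package upgrade), a hedged recovery sketch would be along these lines, with the fsid as a placeholder:

    # stop the host packages from touching ownership again
    apt purge ceph-osd ceph-mon ceph-mgr

    # hand the daemon data back to the container uid/gid
    chown -R 167:167 /var/lib/ceph/<fsid>/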