[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-31 Thread ceph
Hello Jan, I had the same on two clusters going from Nautilus to Pacific. On both it helped to run ceph tell osd.* compact. If that does not help, I would go for recreating the OSDs... HTH Mehmet On 31 March 2023 10:56:42 CEST, j.kr...@profihost.ag wrote: >Hi, > >we have a very similar
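
A minimal sketch of the compaction step suggested above, assuming admin access to the cluster; osd.12 is only an example ID, and compaction can be I/O-heavy, so one OSD at a time during quiet hours may be preferable:

  ceph tell osd.* compact    # ask every OSD to compact its RocksDB
  ceph tell osd.12 compact   # or compact a single OSD at a time (osd.12 is a placeholder)
  ceph osd perf              # watch per-OSD commit/apply latency afterwards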

[ceph-users] Re: Misplaced objects greater than 100%

2023-03-31 Thread Johan Hattne
Here goes:

# ceph -s
  cluster:
    id:     e1327a10-8b8c-11ed-88b9-3cecef0e3946
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum bcgonen-a,bcgonen-b,bcgonen-c,bcgonen-r0h0,bcgonen-r0h1 (age 16h)
    mgr: bcgonen-b.furndm(active, since 8d), standbys: bcgonen-a.qmmqxj
    mds: 1/1

[ceph-users] Re: Misplaced objects greater than 100%

2023-03-31 Thread ceph
Need to know some more about your cluster... ceph -s, ceph osd df tree, replicated or EC? ... Perhaps this can give us some insight. Mehmet On 31 March 2023 18:08:38 CEST, Johan Hattne wrote: >Dear all; > >Up until a few hours ago, I had a seemingly normally-behaving cluster (Quincy, >17.2.5) with
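
For reference, a sketch of the diagnostics being asked for above (assuming a client with the admin keyring):

  ceph -s                   # overall health and misplaced/degraded object counts
  ceph osd df tree          # per-OSD utilization laid out along the CRUSH tree
  ceph osd pool ls detail   # shows whether each pool is replicated or erasure-coded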

[ceph-users] Failing to create monitor in a working cluster.

2023-03-31 Thread pmestre
Hello, I've been running a 3-node Proxmox cluster with 4 Ceph OSDs for 3 years as a production cluster. As a test in preparation for moving the Ceph cluster network, I destroyed one of the 3 working monitors and tried to recreate it. After destroying it, the new monitor refuses to join the cluster, even in
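
A rough sketch of the destroy/recreate cycle described above on Proxmox, using the pveceph wrapper; the node name is a placeholder and the exact steps depend on the Proxmox VE version:

  # on the node whose monitor was removed ('pve3' is a placeholder):
  pveceph mon destroy pve3   # remove the old monitor entry
  pveceph mon create         # recreate the monitor on the local node
  ceph mon stat              # check whether it joined quorum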

[ceph-users] how to set block.db size

2023-03-31 Thread li . xuehai
Hi, how should I understand "Improvements in releases beginning with Nautilus 14.2.12 and Octopus 15.2.6 enable better utilization of arbitrary DB device sizes, and the Pacific release brings experimental dynamic level support." in the document
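
For context, the block.db size is normally fixed when the OSD is created; a sketch assuming ceph-volume, where the device paths and sizes below are only examples:

  # give the OSD an explicit DB device at creation time:
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
  # some provisioning paths also honor a default DB size (value in bytes; example only):
  ceph config set osd bluestore_block_db_size 64424509440   # ~60 GiB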

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-31 Thread j . kruse
Hi, we have a very similar situation. We updated from Nautilus -> Pacific (16.2.11) and saw a rapid increase in commit_latency and op_w_latency (>10 s on some OSDs) after a few hours. We also have an almost exclusively RBD workload. After deleting old snapshots we saw an improvement, and after
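
A small sketch for watching the latency described above while cleaning up snapshots; the pool and image names are placeholders:

  ceph osd perf              # per-OSD commit/apply latency in ms
  rbd snap ls rbd/myimage    # list snapshots of one image ('rbd/myimage' is a placeholder)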

[ceph-users] Re: Ceph Failure and OSD Node Stuck Incident

2023-03-31 Thread Joachim Kraftmayer
Hi Peter, from my experience I would recommend replacing the Samsung Evo SSDs with datacenter SSDs. Regards, Joachim Clyso GmbH - Ceph Foundation Member On Thu, 30 March 2023, 16:37, wrote: > We encountered a Ceph failure where the system became

[ceph-users] How mClock profile calculation works, and IOPS

2023-03-31 Thread Luis Domingues
Hi, I am reading some documentation about mClock and have two questions. First, about the IOPS: are those disk IOPS or some other kind of IOPS? And what are the assumptions behind them (like block size, sequential or random reads/writes)? And the second question: how does mClock calculate its
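
Not an answer to the questions above, but one way to inspect what mClock is actually working with on a given OSD (osd.0 is a placeholder):

  ceph config show osd.0 osd_mclock_profile                 # active mClock profile
  ceph config show osd.0 osd_mclock_max_capacity_iops_hdd   # IOPS capacity assumed for HDD OSDs
  ceph config show osd.0 osd_mclock_max_capacity_iops_ssd   # IOPS capacity assumed for SSD OSDs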

[ceph-users] Misplaced objects greater than 100%

2023-03-31 Thread Johan Hattne
Dear all; Up until a few hours ago, I had a seemingly normally-behaving cluster (Quincy, 17.2.5) with 36 OSDs, evenly distributed across 3 of its 6 nodes. The cluster is only used for CephFS and the only non-standard configuration I can think of is that I had 2 active MDSs, but only 1

[ceph-users] ./install-deps.sh takes several hours

2023-03-31 Thread Arvid Picciani
Hi again, something is very wrong with my hardware it seems and I'm slowly going insane. I'm trying to debug why Ceph has incredibly poor performance for us. We've got:
- 3 EPYC 7713 dual-CPU systems
- datacenter NVMe drives (3 GB/s top)
- 100G InfiniBand
Ceph does 800 MB/s read max, CPU is
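
A couple of baseline checks that separate raw device speed from cluster throughput, as a sketch; the device path and pool name are placeholders:

  # raw sequential read from one NVMe device (read-only):
  fio --name=readtest --filename=/dev/nvme0n1 --readonly --rw=read --bs=4M --direct=1 --runtime=30 --time_based
  # cluster-level throughput against a test pool ('testbench' is a placeholder):
  rados bench -p testbench 30 write --no-cleanup
  rados bench -p testbench 30 seq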

[ceph-users] Re: ceph orch ps mon, mgr, osd shows for version, image and container id

2023-03-31 Thread Adiga, Anantha
Thank you so much Adam. I will check into the older release being used and update the ticket. Anantha
From: Adam King
Sent: Friday, March 31, 2023 5:46 AM
To: Adiga, Anantha
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] ceph orch ps mon, mgr, osd shows for version, image and container

[ceph-users] Re: ceph orch ps mon, mgr, osd shows for version, image and container id

2023-03-31 Thread Adam King
I can see the JSON output for the OSD you posted doesn't list any version:

{
    "style": "cephadm:v1",
    "name": "osd.61",
    "fsid": "8dbfcd81-fee3-49d2-ac0c-e988c8be7178",
    "systemd_unit": "ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@osd.61",
    "enabled": true,

[ceph-users] Re: Ceph Failure and OSD Node Stuck Incident

2023-03-31 Thread Frank Schilder
I have also seen unresponsive OSDs not being detected as "down". My best bet is that the MON ping probes are handled by a different thread or in a queue different from the disk IO queue, which means that "OSD responding to ping probe" does not mean "OSD is making progress". The latter is what

[ceph-users] Re: RGW can't create bucket

2023-03-31 Thread Boris Behrens
Sounds like all users have the problem? So what I would do in my setup now:
- start a new RGW client with maximum logging (debug_rgw = 20) on a non-public port
- test against this endpoint and check the logs
This might give you more insight. On Fri, 31 March 2023 at 09:36, Kamil Madac <
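
A hedged sketch of the logging part of this suggestion; the client name is a placeholder and how the extra RGW instance gets deployed depends on the setup:

  ceph config set client.rgw.debugtest debug_rgw 20   # maximum RGW logging for that instance
  ceph config set client.rgw.debugtest debug_ms 1     # optionally log messenger traffic too
  # reproduce the bucket creation against that endpoint, read its log, then reset:
  ceph config set client.rgw.debugtest debug_rgw 1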

[ceph-users] Re: Upgrade from 16.2.7. to 16.2.11 failing on OSDs

2023-03-31 Thread Eugen Block
Hi, can you share more logs from the failing OSD? Something like: cephadm logs --name osd.4 Quoting Lo Re Giuseppe: Dear all, On one of our clusters I started the upgrade process from 16.2.7 to 16.2.11. Mon, mgr and crash processes were done easily/quickly, then at the first attempt
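
For reference, a sketch of pulling those logs on the OSD's host; the fsid is a placeholder, and arguments after -- are passed through to journalctl:

  cephadm logs --name osd.4            # wraps journalctl for that daemon's unit
  cephadm logs --name osd.4 -- -n 200  # last 200 lines
  journalctl -u ceph-<fsid>@osd.4      # roughly equivalent, run directly on the host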

[ceph-users] Re: how ceph OSD bench works?

2023-03-31 Thread Luis Domingues
> > OSD bench performs IOs at the objectstore level and the stats are reported based on the response from those transactions. It performs either sequential or random IOs (i.e. a random offset into an object) based on the arguments passed to it. IIRC if number of objects and

[ceph-users] Re: how ceph OSD bench works?

2023-03-31 Thread Marc
> > OSD bench performs IOs at the objectstore level and the stats are reported based on the response from those transactions. It performs either sequential or random IOs (i.e. a random offset into an object) based on the arguments passed to it. IIRC if number of objects and object

[ceph-users] Re: RGW can't create bucket

2023-03-31 Thread Kamil Madac
We checked s3cmd --debug and the endpoint is OK (working with existing buckets works fine with the same s3cmd config). From what I read, "max_buckets": 0 means that there is no quota on the number of buckets. There are also users who have "max_buckets": 1000, and those users have the same access_denied
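
A sketch of commands that might confirm the user's bucket limits on the RGW side; the uid is a placeholder:

  radosgw-admin user info --uid=someuser                        # shows max_buckets, caps, suspended flag, quota blocks
  radosgw-admin user modify --uid=someuser --max-buckets=1000   # raise the bucket limit explicitly if needed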

[ceph-users] Re: how ceph OSD bench works?

2023-03-31 Thread Sridhar Seshasayee
Hi Luis, > The thing that is odd is that, doing some tests with the fio tool, I have similar results on all disks, and the same when running rados bench for 5 minutes. But the OSD bench at startup of the OSD, used by mClock to configure osd_mclock_max_capacity_iops_hdd, gives me a very big difference
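
For comparison, the built-in benchmark can be re-run by hand against one OSD; osd.0 and the argument values are just examples (total bytes, block size, object size, number of objects):

  ceph tell osd.0 bench 12288000 4096 4194304 100           # objectstore-level bench on osd.0
  ceph config show osd.0 osd_mclock_max_capacity_iops_hdd   # value mClock recorded at startup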