[ceph-users] Re: Adding OSD's results in slow ops, inactive PG's

2024-01-17 Thread Torkil Svensgaard
On 18/01/2024 07:48, Eugen Block wrote: Hi,  -3281> 2024-01-17T14:57:54.611+ 7f2c6f7ef540  0 osd.431 2154828 load_pgs opened 750 pgs <--- I'd say that's close enough to what I suspected. ;-) Not sure why the "maybe_wait_for_max_pg" message isn't there but I'd give it a try with a

[ceph-users] Re: Adding OSD's results in slow ops, inactive PG's

2024-01-17 Thread Eugen Block
Hi, -3281> 2024-01-17T14:57:54.611+ 7f2c6f7ef540 0 osd.431 2154828 load_pgs opened 750 pgs <--- I'd say that's close enough to what I suspected. ;-) Not sure why the "maybe_wait_for_max_pg" message isn't there but I'd give it a try with a higher osd_max_pg_per_osd_hard_ratio.
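Raising the hard ratio is a single config change, assuming the cluster uses the central config store; the value 5 below is only illustrative and the setting should be reverted once the expansion has settled:

   # osd_max_pg_per_osd_hard_ratio is a multiplier on mon_max_pg_per_osd
   ceph config get osd osd_max_pg_per_osd_hard_ratio
   # temporarily allow more PGs per OSD while the new OSDs activate (illustrative value)
   ceph config set osd osd_max_pg_per_osd_hard_ratio 5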

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Mark Nelson
Thanks kindly Maged/Bailey!  As always it's a bit of a moving target.  New hardware comes out that reveals bottlenecks in our code.  Doubling up the OSDs sometimes improves things.  We figure out how to make the OSDs faster and the old assumptions stop being correct.  Even newer hardware comes

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Bailey Allison
+1 to this, great article and great research. Something we've been keeping a very close eye on ourselves. Overall we've mostly settled on the old keep-it-simple-stupid methodology, with good results, especially as the benefits get smaller the more recent your Ceph version is, and

[ceph-users] Re: Adding OSD's results in slow ops, inactive PG's

2024-01-17 Thread Torkil Svensgaard
On 17-01-2024 22:20, Eugen Block wrote: Hi, Hi this sounds a bit like a customer issue we had almost two years ago. Basically, it was about mon_max_pg_per_osd (default 250) which was exceeded during the first activating OSD (and the last remaining stopping OSD). You can read all the

[ceph-users] Re: Adding OSD's results in slow ops, inactive PG's

2024-01-17 Thread Eugen Block
Hi, this sounds a bit like a customer issue we had almost two years ago. Basically, it was about mon_max_pg_per_osd (default 250) which was exceeded during the first activating OSD (and the last remaining stopping OSD). You can read all the details in the lengthy thread [1]. But if this
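A hedged sketch of how one might check whether that limit is in play during an expansion and raise it temporarily (500 is only an illustrative value):

   # per-OSD PG counts are in the PGS column
   ceph osd df tree
   # current limit (default 250) and a temporary increase for the rebalance
   ceph config get mon mon_max_pg_per_osd
   ceph config set global mon_max_pg_per_osd 500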

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Maged Mokhtar
Very informative article, Mark. IMHO, if you find yourself with a very high per-OSD core count, it may be logical to just pack/add more NVMes per host; you'd be getting the best price per performance and capacity. /Maged On 17/01/2024 22:00, Mark Nelson wrote: It's a little tricky.  In

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Mark Nelson
It's a little tricky.  In the upstream lab we don't strictly see an IOPS or average latency advantage with heavy parallelism by running multiple OSDs per NVMe drive until per-OSD core counts get very high.  There does seem to be a fairly consistent tail latency advantage even at moderately low
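For anyone wanting to experiment with the multi-OSD-per-NVMe layout being discussed, a minimal sketch with ceph-volume (device paths are placeholders; cephadm users can get the same effect with osds_per_device in an OSD service spec):

   # create two OSDs on each listed NVMe device
   ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1 /dev/nvme1n1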

[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-17 Thread Chris Palmer
On 17/01/2024 16:11, kefu chai wrote: On Tue, Jan 16, 2024 at 12:11 AM Chris Palmer wrote: Updates on both problems: Problem 1 -- The bookworm/reef cephadm package needs updating to accommodate the last change in

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Anthony D'Atri
Conventional wisdom is that with recent Ceph releases there is no longer a clear advantage to this. > On Jan 17, 2024, at 11:56, Peter Sabaini wrote: > > One thing that I've heard people do but haven't done personally with fast > NVMes (not familiar with the IronWolf so not sure if they

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Peter Sabaini
On 17.01.24 11:13, Tino Todino wrote: > Hi folks. > > I had a quick search but found nothing concrete on this so thought I would > ask. > > We currently have a 4-host Ceph cluster with an NVMe pool (1 OSD per host) > and an HDD Pool (1 OSD per host). Both OSDs use a separate NVMe for DB/WAL.

[ceph-users] Re: Debian 12 (bookworm) / Reef 18.2.1 problems

2024-01-17 Thread kefu chai
On Tue, Jan 16, 2024 at 12:11 AM Chris Palmer wrote: > Updates on both problems: > > Problem 1 > -- > > The bookworm/reef cephadm package needs updating to accommodate the last > change in /usr/share/doc/adduser/NEWS.Debian.gz: > > System user home defaults to /nonexistent if
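A quick way to verify what home directory the packaged system user actually ended up with (the user name cephadm is an assumption here; adjust to whatever the package creates):

   # the sixth field is the home directory; /nonexistent means the new adduser default applied
   getent passwd cephadm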

[ceph-users] Adding OSD's results in slow ops, inactive PG's

2024-01-17 Thread Ruben Vestergaard
Hi, we have a cluster which currently looks like so: services: mon: 5 daemons, quorum lazy,jolly,happy,dopey,sleepy (age 13d) mgr: jolly.tpgixt(active, since 25h), standbys: dopey.lxajvk, lazy.xuhetq mds: 1/1 daemons up, 2 standby osd: 449 osds: 425 up (since
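Standard first-look commands for a state like this, with nothing cluster-specific assumed:

   # which PGs are inactive and why, and which OSDs report slow ops
   ceph health detail
   ceph pg dump_stuck inactive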

[ceph-users] minimal permission set for an rbd client

2024-01-17 Thread cek+ceph
I'm following the guide @ https://docs.ceph.com/en/latest/rbd/rados-rbd-cmds/ but I'm not following why an `mgr` permission would be required to have a functioning RBD client. Thanks.
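For reference, the caps the referenced page creates for a block device user look like this (NAME and POOL are placeholders). As far as I understand, the mgr profile is there for rbd features implemented in the rbd_support mgr module (perf statistics, schedules); a client that only maps images and does I/O can usually get by with the mon and osd caps alone:

   ceph auth get-or-create client.NAME \
       mon 'profile rbd' \
       osd 'profile rbd pool=POOL' \
       mgr 'profile rbd pool=POOL'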

[ceph-users] Re: recommendation for barebones server with 8-12 direct attach NVMe?

2024-01-17 Thread Anthony D'Atri
> Also in our favour is that the users of the cluster we are currently > intending for this have established a practice of storing large objects. That definitely is in your favor. > but it remains to be seen how 60x 22TB behaves in practice. Be sure you don't get SMR drives. > and it's
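One hedged way to check whether a delivered drive is SMR, at least for models the kernel can identify (device name is a placeholder; drive-managed SMR may still report "none", so the datasheet remains the authoritative source):

   # "none" = conventional recording, "host-aware"/"host-managed" = SMR
   cat /sys/block/sda/queue/zoned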

[ceph-users] Re: recommendation for barebones server with 8-12 direct attach NVMe?

2024-01-17 Thread Gregory Orange
On 16/1/24 11:39, Anthony D'Atri wrote: by “RBD for cloud”, do you mean VM / container general-purposes volumes on which a filesystem is usually built?  Or large archive / backup volumes that are read and written sequentially without much concern for latency or throughput? General purpose

[ceph-users] Re: Stuck in upgrade process to reef

2024-01-17 Thread Igor Fedotov
Hi Jan, w.r.t. osd.0 - if this is the only occurrence then I'd propose simply redeploying the OSD. This looks like some BlueStore metadata inconsistency which could have occurred long before the upgrade. Likely the upgrade just revealed the issue.  And honestly I can hardly imagine how to investigate
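A sketch of what a redeploy could look like on a cephadm-managed cluster (an assumption; a manually deployed OSD would instead be zapped and recreated with ceph-volume):

   # drain and remove osd.0, wipe the device, and let the orchestrator recreate it
   ceph orch osd rm 0 --replace --zap
   ceph orch osd rm status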

[ceph-users] Re: Stuck in upgrade process to reef

2024-01-17 Thread Jan Marek
Hi Igor, many thanks for the advice! I've tried to start osd.1 and it started; it's now resynchronizing data. I will start the daemons one by one. What about osd.0, which has a problem with bluestore fsck? Is there a way to repair it? Sincerely Jan On Tue, Jan 16, 2024 at
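For reference, the offline fsck/repair cycle looks roughly like this; the OSD must be stopped first, the path assumes the default layout, and on a cephadm cluster the tool is typically run from inside a cephadm shell for that OSD:

   # with osd.0 stopped:
   ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-0
   ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-0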

[ceph-users] Performance impact of Heterogeneous environment

2024-01-17 Thread Tino Todino
Hi folks. I had a quick search but found nothing concrete on this so thought I would ask. We currently have a 4-host Ceph cluster with an NVMe pool (1 OSD per host) and an HDD Pool (1 OSD per host). Both OSDs use a separate NVMe for DB/WAL. These machines are identical (homogeneous) and are

[ceph-users] Re: Upgrading nautilus / centos7 to octopus / ubuntu 20.04. - Suggestions and hints?

2024-01-17 Thread Götz Reinicke
Hi, We went the „long“ way. - first emptied osd node by node (for each pool), purged all OSDs - moved the OS from centos 7 to ubuntu 20 (reinstalled every node) - removed the cache pool and cleaned up some config - installed all OSDs and moved the data back - upgraded ceph nautilus to octopus

[ceph-users] Re: Upgrading nautilus / centos7 to octopus / ubuntu 20.04. - Suggestions and hints?

2024-01-17 Thread Marc
I have compiled nautilus for el9 and am going to test adding an el9 OSD node to the existing el7 cluster. If that is ok, I will upgrade all nodes first to el9. > -Original Message- > From: Szabo, Istvan (Agoda) > Sent: Wednesday, 17 January 2024 08:09 > To: balli...@45drives.com; Eugen

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-17 Thread Xiubo Li
On 1/13/24 07:02, Özkan Göksu wrote: Hello. I have a 5-node Ceph cluster and I'm constantly getting the "clients failing to respond to cache pressure" warning. I have 84 CephFS kernel clients (servers) and my users are accessing their personal subvolumes located on one pool. My users are software

[ceph-users] Re: 1 clients failing to respond to cache pressure (quincy:17.2.6)

2024-01-17 Thread Xiubo Li
On 1/17/24 15:57, Eugen Block wrote: Hi, this is not an easy topic and there is no formula that can be applied to all clusters. From my experience, it is exactly how the discussion went in the thread you mentioned: trial & error. Looking at your session ls output, this reminds me of a debug
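A few of the knobs and checks that tend to come up in that trial & error, with purely illustrative values:

   # per-client session state, including caps held and recall behaviour (rank 0 here; adjust to your MDS)
   ceph tell mds.0 session ls
   # MDS cache size and caps-recall aggressiveness (illustrative values, not recommendations)
   ceph config set mds mds_cache_memory_limit 8589934592
   ceph config set mds mds_recall_max_caps 30000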