So a few questions I have around this.

What is the network you have for this cluster?

Changing bluestore_min_alloc_size would be the last thing I would even 
consider. In fact I wouldn't change it at all, as you would be in untested 
territory.

The challenge with making this sort of workload perform is generating lots of 
parallel streams, so whatever is doing the uploading needs to be doing parallel 
multipart uploads. There is no mention of the uploading code that is being used.
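
As a rough illustration only (assuming an awscli-based client; your tooling, 
endpoint and bucket will differ), these are the sort of settings that make a 
single large object upload run as parallel multipart rather than one serial 
stream:

  # split objects into 64MB parts and push up to 32 parts at once
  aws configure set default.s3.multipart_threshold 64MB
  aws configure set default.s3.multipart_chunksize 64MB
  aws configure set default.s3.max_concurrent_requests 32
  # endpoint and bucket are placeholders, substitute your own
  aws s3 cp ./backup.img s3://backups/ --endpoint-url http://rgw.example.com

The exact numbers matter less than the fact that each large object is split 
into parts that are uploaded concurrently.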

So with 7 nodes, each with 12 disks, and doing large files like this, I would 
be expecting to see 50-70MB/s per usable HDD. By usable I mean: if you are 
doing replicas you would divide the number of disks by the replica count, and 
in your case with EC you would divide the number of disks by the total EC size 
and multiply by the data part. So divide by 6 and multiply by 4.

So, allowing for EC overhead, you could in theory get beyond 2.8GBytes/s. That 
is the theoretical disk limit I would be looking to exceed.
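
To put numbers on that: 7 nodes x 12 disks = 84 HDDs, 84 x 4/6 = 56 usable 
spindles, and 56 x 50-70MB/s works out to roughly 2.8-3.9GBytes/s of aggregate 
throughput.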

So now you have the question of whether you have enough streams running in 
parallel. Have you tried a benchmarking tool such as MinIO's warp to see what 
the cluster can achieve?
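
As an illustration (hostname, credentials, bucket and sizes below are all 
placeholders), a large-object warp run might look something like:

  warp put --host rgw.example.com:8080 --access-key KEY --secret-key SECRET \
      --bucket warp-test --obj.size 1GiB --concurrent 32 --duration 5m

That gives you a baseline of what the cluster can sustain independently of 
your backup software.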

You haven't mentioned the number of PGs you have for each of the pools in 
question. You need to ensure that every pool being used has more PGs than the 
number of disks. If that's not the case then individual disks could be slowing 
things down.
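
You can check, and if needed raise, the PG counts with something like the 
following (the data pool name here is just an example, yours may differ):

  # list pools with their current pg_num
  ceph osd pool ls detail | grep pg_num
  # raise the PG count on the RGW data pool
  ceph osd pool set default.rgw.buckets.data pg_num 256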

You also have the metadata pools used by RGW that ideally need to be on NVMe.
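
Assuming the NVMe OSDs show up as their own device class, pinning those pools 
to them is a CRUSH rule change along these lines (rule and pool names are 
illustrative):

  # create a replicated rule restricted to the nvme device class
  ceph osd crush rule create-replicated rgw-nvme default host nvme
  # move the index pool onto it
  ceph osd pool set default.rgw.buckets.index crush_rule rgw-nvme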

Because you are using EC, there is also the buckets.non-ec pool, which is used 
to manage the OMAPs for the multipart uploads. This is usually down at 8 PGs, 
and that will be limiting things as well.
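
Again purely as an illustration (the exact pool name depends on your zone 
setup), bumping that would be:

  ceph osd pool set default.rgw.buckets.non-ec pg_num 32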



Darren Soothill

Want a meeting with me: https://calendar.app.google/MUdgrLEa7jSba3du9

Looking for help with your Ceph cluster? Contact us at https://croit.io/
 
croit GmbH, Freseniusstr. 31h, 81247 Munich 
CEO: Martin Verges - VAT-ID: DE310638492 
Com. register: Amtsgericht Munich HRB 231263 
Web: https://croit.io/ | YouTube: https://goo.gl/PGE1Bx




> On 25 May 2024, at 14:56, Anthony D'Atri <a...@dreamsnake.net> wrote:
> 
> 
> 
>> Hi Everyone,
>> 
>> I'm putting together an HDD cluster with an EC pool dedicated to the backup
>> environment. Traffic via s3. Version 18.2, 7 OSD nodes, 12 * 12TB HDD +
>> 1 NVMe each,
> 
> QLC, man.  QLC.  That said, I hope you're going to use that single NVMe SSD 
> for at least the index pool.  Is this a chassis with universal slots, or is 
> that NVMe device maybe M.2 or rear-cage?
> 
>> Wondering if there is some general guidance for startup setup/tuning in
>> regards to s3 object size.
> 
> Small objects are the devil of any object storage system.
> 
> 
>> Files are read from fast storage (SSD/NVME) and
>> written to s3. Files sizes are 10MB-1TB, so it's not standard s3. traffic.
> 
> Nothing nonstandard about that, though your 1TB objects presumably are going 
> to be MPU.  Having the .buckets.non-ec pool on HDD with objects that large 
> might make them really slow to assemble; you might need to increase timeouts, 
> but I'm speculating.
> 
> 
>> Backup for big files took hours to complete.
> 
> Spinners gotta spin.  They're a false economy.
> 
>> My first shot would be to increase the default bluestore_min_alloc_size_hdd
>> to reduce the number of stored objects, but I'm not sure if it's a
>> good direction?
> 
> With that workload you *could* increase that to like 64KB, but I don't think 
> it'd gain you much.  
> 
> 
>> Any other parameters worth checking to support such a
>> traffic pattern?
> 
> `ceph df`
> `ceph osd dump | grep pool`
> 
> So we can see what's going on HDD and what's on NVMe.
> 
>> 
>> Thanks!
>> 
>> -- 
>> Łukasz

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
