[ceph-users] Re: CephFS as Offline Storage

2024-05-21 Thread Matt Vandermeulen
I would normally vouch for ZFS for this sort of thing, but the mix of drive sizes will be... an inconvenience, at best. You could get creative with the hierarchy (making raidz{2,3} of mirrors of same-sized drives, or something), but it would be far from ideal. I use ZFS for my own home
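
As a rough sketch of getting creative, one compromise is to give each drive size its own raidz2 vdev within a single pool (device paths and counts below are entirely made up):

    # Hypothetical: six 4TB drives and six 8TB drives, each size in its own raidz2 vdev.
    # ZFS stripes across the vdevs, but they fill and resilver at different rates.
    zpool create tank \
      raidz2 /dev/disk/by-id/4tb-disk1 /dev/disk/by-id/4tb-disk2 /dev/disk/by-id/4tb-disk3 \
             /dev/disk/by-id/4tb-disk4 /dev/disk/by-id/4tb-disk5 /dev/disk/by-id/4tb-disk6 \
      raidz2 /dev/disk/by-id/8tb-disk1 /dev/disk/by-id/8tb-disk2 /dev/disk/by-id/8tb-disk3 \
             /dev/disk/by-id/8tb-disk4 /dev/disk/by-id/8tb-disk5 /dev/disk/by-id/8tb-disk6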

[ceph-users] Ceph Day NYC 2024 Slides

2024-04-27 Thread Matt Vandermeulen
Hi folks! Thanks for a great Ceph Day event in NYC! I wanted to make sure I posted my slides before I forget (and encourage others to do the same). Feel free to reach out in the Ceph Slack: https://ceph.io/en/community/connect/ How we Operate Ceph at Scale (DigitalOcean): -

[ceph-users] Re: ceph cluster extremely unbalanced

2024-03-24 Thread Matt Vandermeulen
Hi, I would expect that almost every PG in the cluster is going to have to move once you start standardizing CRUSH weights, and I wouldn't want to move data twice. My plan would look something like: - Make sure the cluster is healthy (no degraded PGs) - Set nobackfill, norebalance flags to
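
A rough sketch of the first and last steps (OSD IDs and weights are made up; the real plan would cover every OSD):

    # Pause data movement before touching any weights
    ceph osd set nobackfill
    ceph osd set norebalance

    # Standardize CRUSH weights to the drives' raw capacity in TiB (example values only)
    ceph osd crush reweight osd.12 9.09569
    ceph osd crush reweight osd.13 9.09569

    # Once all weights are set, let backfill run at a pace the cluster can handle
    ceph osd unset nobackfill
    ceph osd unset norebalance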

[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-02 Thread Matt Vandermeulen
We've had a specific set of drives that we've had to enable bdev_enable_discard and bdev_async_discard for in order to maintain acceptable performance on block clusters. I wrote the patch that Igor mentioned in order to try and send more parallel discards to the devices, but these ones in
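
For reference, enabling both looks roughly like this (exact option names and whether an OSD restart is needed vary by release, so check your version first):

    # Enable discards on BlueStore's block devices, and issue them asynchronously
    ceph config set osd bdev_enable_discard true
    ceph config set osd bdev_async_discard true

    # Or scope it to just the affected drives, e.g. a single OSD or a device class
    ceph config set osd.12 bdev_enable_discard true
    ceph config set osd/class:ssd bdev_enable_discard true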

[ceph-users] Re: Is there any way to fine tune peering/pg relocation/rebalance?

2023-08-29 Thread Matt Vandermeulen
We have had success using pgremapper[1] for this sort of thing, in both index and data augments. 1. Set nobackfill, norebalance 2. Add OSDs 3. pgremapper cancel-backfill 4. Unset flags 5. Slowly loop `pgremapper undo-upmaps` at our desired rate, or allow the balancer to do this work There's
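
A rough sketch of that flow (the pgremapper arguments shown are illustrative; it dry-runs by default, so check its --help before applying anything):

    # 1-2. Pause movement, then add the new OSDs
    ceph osd set nobackfill
    ceph osd set norebalance

    # 3. Upmap PGs back to where they currently live so nothing moves yet
    pgremapper cancel-backfill --yes

    # 4. Resume normal operation
    ceph osd unset nobackfill
    ceph osd unset norebalance

    # 5. Peel off a few upmaps at a time, at whatever pace the cluster tolerates
    pgremapper undo-upmaps <target OSDs> --yes
    sleep 600   # ...and repeat, or hand the rest over to the balancer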

[ceph-users] Re: Very slow snaptrim operations blocking client I/O

2023-01-29 Thread Matt Vandermeulen
what the difference will be from our case versus a single large volume with a big snapshot. On 2023-01-28 20:45, Victor Rodriguez wrote: On 1/29/23 00:50, Matt Vandermeulen wrote: I've observed a similar horror when upgrading a cluster from Luminous to Nautilus, which had the same effect

[ceph-users] Re: Very slow snaptrim operations blocking client I/O

2023-01-28 Thread Matt Vandermeulen
I've observed a similar horror when upgrading a cluster from Luminous to Nautilus, which had the same effect of an overwhelming amount of snaptrim making the cluster unusable. In our case, we held its hand by setting all OSDs to have zero max trimming PGs, unsetting nosnaptrim, and then
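
A rough sketch of that hand-holding, assuming osd_max_trimming_pgs is the relevant knob on your release:

    # Let snaptrim be "enabled" while every OSD trims zero PGs at a time
    ceph config set osd osd_max_trimming_pgs 0
    ceph osd unset nosnaptrim

    # Then raise the limit slowly (cluster-wide, or a few OSDs at a time) to work
    # through the backlog without starving client I/O
    ceph config set osd osd_max_trimming_pgs 1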

[ceph-users] Re: backups

2022-12-22 Thread Matt Vandermeulen
We have been doing a zfs send piped to s3 uploads for backups. We use awscli for that, since it can take a stream from stdin. We have never considered using cephfs for that. It ultimately ends up looking something like one of the following, depending on full/incremental: zfs send -wv
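
With made-up dataset, snapshot, and bucket names, the two variants look roughly like:

    # Full raw send of a snapshot, streamed into S3 via awscli's stdin support ("-")
    zfs send -wv tank/data@2022-12-22 | \
      aws s3 cp - s3://backup-bucket/tank-data-2022-12-22.zfs

    # Incremental send between two snapshots
    zfs send -wv -i tank/data@2022-12-21 tank/data@2022-12-22 | \
      aws s3 cp - s3://backup-bucket/tank-data-2022-12-21_2022-12-22.zfs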

[ceph-users] Re: Ceph Virtual 2022 Day 5 is starting!

2022-11-08 Thread Matt Vandermeulen
On 11/8/22 15:10, Mike Perez wrote: Hi everyone, Ceph Virtual 2022 is starting! Today's topic is Scale. We will hear from Matt Vandermeulen about how Digital Ocean, a Ceph Foundation Premier member, scales Ceph for their needs. Unfortunately, our other scheduled presentation for today, "

[ceph-users] Re: mgr/prometheus module port 9283 binds only with IPv6 ?

2022-10-10 Thread Matt Vandermeulen
That output suggests that the mgr is configured to only listen on the loopback address. I don't think that's a default... does a `ceph config dump | grep mgr` suggest it's been configured that way? On 2022-10-10 10:56, Ackermann, Christoph wrote: Hello list member after subsequent
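
If it does turn out to be pinned to the loopback, something along these lines would open it back up (the address and module bounce are illustrative):

    # See what the prometheus module is currently bound to
    ceph config get mgr mgr/prometheus/server_addr
    ceph config get mgr mgr/prometheus/server_port

    # Listen on all interfaces (":: " covers v4+v6; or pick a specific management IP)
    ceph config set mgr mgr/prometheus/server_addr ::
    ceph config set mgr mgr/prometheus/server_port 9283

    # Bounce the module so it picks up the new bind address
    ceph mgr module disable prometheus
    ceph mgr module enable prometheus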

[ceph-users] Re: Recommended SSDs for Ceph

2022-09-29 Thread Matt Vandermeulen
I think you're likely to get a lot of mixed opinions and experiences with this question. I might suggest trying to grab a few samples from different vendors, and making sure they meet your needs (throw some workloads at them, qualify them), then make sure your vendors have a reasonable lead
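
As one concrete example of a qualifying workload, the usual single-job sync-write fio run tends to separate drives with proper power-loss protection from the rest (device path is hypothetical, and the test is destructive, so use a scratch drive):

    # 4k synchronous writes at queue depth 1; weak drives collapse to a few hundred IOPS here
    fio --name=qualify --filename=/dev/nvme0n1 --direct=1 --sync=1 \
        --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based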

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Matt Vandermeulen
It sounds like this is from a PG merge, so I'm going to _guess_ that you don't want to straight up cancel the current backfill and instead pause it to catch your breath. You can set `nobackfill` and/or `norebalance` which should pause the backfill. Alternatively, use `ceph config set osd.*
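
A rough sketch of pausing and then checking how far along the merge is (pool name is a placeholder):

    # Pause the movement while you regroup
    ceph osd set nobackfill
    ceph osd set norebalance

    # See how far pg_num still has to go for the merge
    ceph osd pool get mypool pg_num
    ceph osd pool ls detail | grep -E 'pg_num|pgp_num'

    # When ready, unset the flags and let it continue
    ceph osd unset nobackfill
    ceph osd unset norebalance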

[ceph-users] Re: Reducing ceph cluster size in half

2022-02-21 Thread Matt Vandermeulen
This might be easiest to think about in two steps: draining hosts, and doing a PG merge. You can do it in either order (though thinking about it, doing the merge first will give you more cluster-wide resources to do it faster). Draining the hosts can be done in a few ways, too. If you want
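
A rough sketch of both steps (pool name, host name, and PG counts are placeholders):

    # PG merge first: halve the PG count; Nautilus+ walks pg_num/pgp_num down gradually
    ceph osd pool set mypool pg_num 1024

    # Then drain hosts by taking their CRUSH weight to zero (all at once, or in steps)
    ceph osd crush reweight-subtree host-to-remove 0
    # ...or per OSD:
    ceph osd crush reweight osd.42 0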

[ceph-users] Re: Is it possible to change device class of a replicated pool?

2022-02-18 Thread Matt Vandermeulen
Yep, just change the CRUSH rule: ceph osd pool set my_cephfs_metadata_pool crush_rule replicated_nvme If you have a rule set called replicated_nvme, that'll set it on the pool named my_cephfs_metadata_pool. Of course this will cause significant data movement. If you need to add the
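
If the replicated_nvme rule doesn't exist yet, creating one tied to the device class is a one-liner (root and failure domain assumed to be default/host here):

    # Create a replicated rule restricted to nvme OSDs, then point the pool at it
    ceph osd crush rule create-replicated replicated_nvme default host nvme
    ceph osd pool set my_cephfs_metadata_pool crush_rule replicated_nvme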

[ceph-users] Re: Cephalocon 2022 deadline extended?

2021-12-10 Thread Matt Vandermeulen
It appears to have been, and we have an application that's pending an internal review before we can submit... so we're hopeful that it has been! On 2021-12-10 15:21, Bobby wrote: Hi all, Has the CfP deadline for Cephalocon 2022 been extended to 19 December 2022? Please confirm if anyone

[ceph-users] Re: The always welcomed large omap

2021-05-31 Thread Matt Vandermeulen
All the index data will be in OMAP, which you can see a listing of with `ceph osd df tree` Do you have large buckets (many, many objects in a single bucket) with few shards? You may have to reshard one (or some) of your buckets. It'll take some reading if you're using multisite, in order to
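
A quick way to check, and to reshard a single bucket (bucket name and shard count are placeholders; think it through first if multisite is involved):

    # List buckets that are over or near the per-shard object limit
    radosgw-admin bucket limit check

    # Reshard one bucket to a higher shard count
    radosgw-admin bucket reshard --bucket=big-bucket --num-shards=101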

[ceph-users] Re: Multisite recovering shards

2021-01-29 Thread Matt Vandermeulen
Hi Szabo, For what it's worth, I have two clusters in a multisite configuration that have never appeared to be synced either, but I have never found a single object that can't be found in both clusters. There are always at least a few recovering shards, while the "data sync source" is always "syncing"
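
The sort of thing we look at to convince ourselves it's actually fine (zone, bucket, and object names are placeholders):

    # Overall multisite view from either side
    radosgw-admin sync status

    # Per-source detail for the data sync shards
    radosgw-admin data sync status --source-zone=us-east-1

    # Spot-check that a given object exists on both clusters
    radosgw-admin object stat --bucket=some-bucket --object=some/key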