[ceph-users] Re: Accumulation of removed_snaps_queue After Deleting Snapshots in Ceph RBD

2024-02-13 Thread Josh Baergen
> 24 active+clean+snaptrim
I see snaptrimming happening in your status output - do you know if that was happening before restarting those OSDs? This is the mechanism by which OSDs clean up deleted snapshots, and once all OSDs have completed snaptrim for a given snapshot it should be removed

[ceph-users] Re: Accumulation of removed_snaps_queue After Deleting Snapshots in Ceph RBD

2024-02-09 Thread Josh Baergen
Hello, Which version of Ceph are you using? Are all of your OSDs currently up+in? If you're HEALTH_OK and all OSDs are up, snaptrim should work through the removed_snaps_queue and clear it over time, but I have seen cases where this seems to get stuck and restarting OSDs can help. Josh On Wed,
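For illustration, a minimal way to check whether trimming is progressing and to nudge a stuck OSD (field names as seen on Octopus+; osd.12 and the systemd unit are placeholders for whatever deployment you run):

    ceph status                                # look for pgs in active+clean+snaptrim or snaptrim_wait
    ceph osd dump | grep removed_snaps_queue   # per-pool queue of snaps still awaiting trim
    systemctl restart ceph-osd@12              # restart suspect OSDs one at a time if the queue never shrinks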

[ceph-users] Re: How to solve data fixity

2024-02-09 Thread Josh Baergen
MPU etags are an MD5-of-MD5s, FWIW. If the user knows how the parts are uploaded then it can be used to verify contents, both just after upload and then at download time (both need to be validated if you want end-to-end validation - but then you're trusting the system to not change the etag
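As a rough sketch of the MD5-of-MD5s scheme (assuming a 2-part upload split at the original part boundary; part1/part2 are placeholder files and xxd is needed for the hex-to-binary step):

    md5sum part1 part2                                       # per-part digests
    { md5sum part1 | cut -d' ' -f1 | xxd -r -p;
      md5sum part2 | cut -d' ' -f1 | xxd -r -p; } | md5sum   # MD5 over the concatenated binary part digests
    # an S3-style MPU etag for this object would then look like "<that hex digest>-2"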

[ceph-users] Re: XFS on top of RBD, overhead

2024-02-02 Thread Josh Baergen
On Fri, Feb 2, 2024 at 7:44 AM Ruben Vestergaard wrote: > Is the RBD client performing partial object reads? Is that even a thing? Yup! The rados API has both length and offset parameters for reads (https://docs.ceph.com/en/latest/rados/api/librados/#c.rados_aio_read) and writes

[ceph-users] Re: Scrubbing?

2024-01-30 Thread Josh Baergen
Ah, yeah, you hit https://tracker.ceph.com/issues/63389 during the upgrade. Josh On Tue, Jan 30, 2024 at 3:17 AM Jan Marek wrote: > > Hello again, > > I'm sorry, I forgot attach file... :-( > > Sincerely > Jan > > Dne Út, led 30, 2024 at 11:09:44 CET napsal(a) Jan Marek: > > Hello Sridhar, > >

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-01-29 Thread Josh Baergen
; > On Mon, Jan 29, 2024 at 4:47 PM Josh Baergen > wrote: >> >> Make sure you're on a fairly recent version of Ceph before doing this, >> though. >> >> Josh >> >> On Mon, Jan 29, 2024 at 5:05 AM Janne Johansson wrote: >> > >> >

[ceph-users] Re: 6 pgs not deep-scrubbed in time

2024-01-29 Thread Josh Baergen
Make sure you're on a fairly recent version of Ceph before doing this, though. Josh On Mon, Jan 29, 2024 at 5:05 AM Janne Johansson wrote: > > Den mån 29 jan. 2024 kl 12:58 skrev Michel Niyoyita : > > > > Thank you Frank , > > > > All disks are HDDs . Would like to know if I can increase the

[ceph-users] Re: OSD read latency grows over time

2024-01-26 Thread Josh Baergen
> Just curious, can decreasing rocksdb_cf_compact_on_deletion_trigger 16384 -> 4096 hurt performance of HDD OSDs in any way? I have no growing latency on HDD OSDs, where data is stored, but it would be easier to set it in the [osd] section without cherry-picking only SSD/NVMe OSDs, but for all at

[ceph-users] Re: OSD read latency grows over time

2024-01-26 Thread Josh Baergen
> Do you know if rocksdb_cf_compact_on_deletion_trigger and rocksdb_cf_compact_on_deletion_sliding_window can be changed at runtime without an OSD restart?
Unfortunately they cannot. You'll want to set them in centralized conf and then restart OSDs for them to take effect. Josh On Fri, Jan
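A hedged sketch of that workflow - the option names are the ones quoted above, 4096 is just the value from the question, and the restart method (and <id>) depend on how your OSDs are managed:

    ceph config set osd rocksdb_cf_compact_on_deletion_trigger 4096
    ceph config get osd rocksdb_cf_compact_on_deletion_trigger   # confirm what is stored centrally
    systemctl restart ceph-osd@<id>                              # then restart OSDs, e.g. one host at a time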

[ceph-users] Re: Logging control

2023-12-19 Thread Josh Baergen
I would start with "ceph tell osd.1 config diff", as I find that output the easiest to read when trying to understand where various config overrides are coming from. You almost never need to use "ceph daemon" in Octopus+ systems since "ceph tell" should be able to access pretty much all commands

[ceph-users] Re: About ceph osd slow ops

2023-12-01 Thread Josh Baergen
Given that this is s3, are the slow ops on index or data OSDs? (You mentioned HDD but I don't want to assume that meant that the osd you mentioned is data) Josh On Fri, Dec 1, 2023 at 7:05 AM VÔ VI wrote: > > Hi Stefan, > > I am running replicate x3 with a failure domain as host and setting >
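A hedged way to answer that on your own cluster, assuming the health warnings name specific OSDs (osd.12 is a placeholder):

    ceph health detail | grep -i 'slow ops'           # which OSDs are reporting slow ops
    ceph pg ls-by-osd osd.12 | cut -d. -f1 | sort -u   # pool IDs hosted by osd.12, plus the header line
    ceph osd pool ls detail                            # map those pool IDs to the rgw index vs data pools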

[ceph-users] Re: resharding RocksDB after upgrade to Pacific breaks OSDs

2023-11-03 Thread Josh Baergen
The ticket has been updated, but it's probably important enough to state on the list as well: The documentation is currently wrong in a way that running the command as documented will cause this corruption. The correct command to run is: ceph-bluestore-tool \ --path \

[ceph-users] Re: Ceph 16.2.14: how to set mon_rocksdb_options to enable RocksDB compression?

2023-10-16 Thread Josh Baergen
in the monitor >> configuration database. >> >> This works rather well for various Ceph components, including the monitors. >> RocksDB options are also applied to monitors correctly, but for some reason >> are being ignored. >> >> /Z >> >> On Sa

[ceph-users] Re: Ceph 16.2.14: how to set mon_rocksdb_options to enable RocksDB compression?

2023-10-14 Thread Josh Baergen
Apologies if you tried this already and I missed it - have you tried configuring that setting in /etc/ceph/ceph.conf (or wherever your conf file is) instead of via 'ceph config'? I wonder if mon settings like this one won't actually apply the way you want because they're needed before the mon has
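For example, a purely illustrative ceph.conf fragment - the exact option string, and whether your mon's RocksDB build includes LZ4, are assumptions:

    [mon]
    mon_rocksdb_options = write_buffer_size=33554432,compression=kLZ4Compression

followed by a mon restart, since the option would need to be read before the mon joins quorum.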

[ceph-users] Re: ceph osd down doesn't seem to work

2023-10-03 Thread Josh Baergen
Hi Simon, If the OSD is actually up, using 'ceph osd down` will cause it to flap but come back immediately. To prevent this, you would want to 'ceph osd set noup'. However, I don't think this is what you actually want: > I'm thinking (but perhaps incorrectly?) that it would be good to keep the
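For reference, the difference looks roughly like this (osd.7 is a placeholder):

    ceph osd down 7       # a live OSD will immediately re-assert itself and flap back up
    ceph osd set noup     # prevents down OSDs from being marked up again...
    ceph osd unset noup   # ...until the flag is cleared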

[ceph-users] Re: Error EPERM: error setting 'osd_op_queue' to 'wpq': (1) Operation not permitted

2023-09-18 Thread Josh Baergen
My guess is that this is because this setting can't be changed at runtime, though if so that's a new enforcement behaviour in Quincy that didn't exist in prior versions. I think what you want to do is 'config set osd osd_op_queue wpq' (assuming you want this set for all OSDs) and then restart
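A minimal sketch of that, assuming you want the setting for all OSDs and that they're managed by systemd:

    ceph config set osd osd_op_queue wpq
    ceph config get osd osd_op_queue      # confirm the stored value
    systemctl restart ceph-osd.target     # on each OSD host, since the option only applies at startup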

[ceph-users] Re: Backfill Performance for

2023-08-08 Thread Josh Baergen
Hi Jonathan, > - All PGs seem to be backfilling at the same time which seems to be in > violation of osd_max_backfills. I understand that there should be 6 readers > and 6 writers at a time, but I'm seeing a given OSD participate in more > than 6 PG backfills. Is an OSD only considered as

[ceph-users] Re: MON sync time depends on outage duration

2023-07-11 Thread Josh Baergen
Out of curiosity, what is your require_osd_release set to? (ceph osd dump | grep require_osd_release) Josh On Tue, Jul 11, 2023 at 5:11 AM Eugen Block wrote: > > I'm not so sure anymore if that could really help here. The dump-keys > output from the mon contains 42 million osd_snap prefix

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-06-27 Thread Josh Baergen
On Tue, Jun 27, 2023 at 11:50 AM Matthew Booth wrote: > What do you mean by saturated here? FWIW I was using the default cache > size of 1G and each test run only wrote ~100MB of data, so I don't > think I ever filled the cache, even with multiple runs. Ah, my apologies - I saw that fio had been

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-06-27 Thread Josh Baergen
Hi Matthew, We've done a limited amount of work on characterizing the pwl and I think it suffers the classic problem of some writeback caches in that, once the cache is saturated, it's actually worse than just being in writethrough. IIRC the pwl does try to preserve write ordering (unlike the

[ceph-users] Re: 16.2.13: ERROR:ceph-crash:directory /var/lib/ceph/crash/posted does not exist; please create

2023-06-01 Thread Josh Baergen
Hi Zakhar, I'm going to guess that it's a permissions issue arising from https://github.com/ceph/ceph/pull/48804, which was included in 16.2.13. You may need to change the directory permissions, assuming that you manage the directories yourself. If this is managed by cephadm or something like
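If you do manage the directories yourself, the fix is likely along these lines - the ceph:ceph ownership is an assumption, so match whatever user your daemons actually run as:

    mkdir -p /var/lib/ceph/crash/posted
    chown ceph:ceph /var/lib/ceph/crash /var/lib/ceph/crash/posted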

[ceph-users] Re: Best practice for expanding Ceph cluster

2023-05-04 Thread Josh Baergen
Hi Samuel, Both pgremapper and the CERN scripts were developed against Luminous, and in my experience 12.2.13 has all of the upmap patches needed for the scheme that Janne outlined to work. However, if you have a complex CRUSH map sometimes the upmap balancer can struggle, and I think that's true

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-25 Thread Josh Baergen
Hi Samuel, While the second method would probably work fine in the happy path, if something goes wrong I think you'll be happier having a uniform release installed. In general, we've found the backfill experience to be better on Nautilus than Luminous, so my vote would be for the first method.

[ceph-users] Re: pg wait too long when osd restart

2023-03-13 Thread Josh Baergen
hanism? Besides that, if I don't want to upgrade version in > recently, is a good way that adjust osd_pool_default_read_lease_ratio to > lower? For example, 0.4 or 0.2 to reach the user's tolerance time. > > Yite Gu > > Josh Baergen 于2023年3月10日周五 22:09写道: >> >> Hello
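For context on what that ratio controls (a sketch based on the documented defaults, which may differ on a given cluster): the read lease interval is osd_heartbeat_grace * osd_pool_default_read_lease_ratio, so the default 20s grace and 0.8 ratio give a ~16s window during which reads can stall after an unclean OSD failure; the 0.4 suggested above would halve that to ~8s. Changing it would look something like:

    ceph config set osd osd_pool_default_read_lease_ratio 0.4   # value taken from the question above, not a recommendation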

[ceph-users] Re: pg wait too long when osd restart

2023-03-10 Thread Josh Baergen
Hello, When you say "osd restart", what sort of restart are you referring to - planned (e.g. for upgrades or maintenance) or unplanned (OSD hang/crash, host issue, etc.)? If it's the former, then these parameters shouldn't matter provided that you're running a recent enough Ceph with default

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Josh Baergen
caches can improve the IOPS performance of SSDs. Josh On Tue, Feb 28, 2023 at 7:19 AM Boris Behrens wrote: > > Hi Josh, > we upgraded 15.2.17 -> 16.2.11 and we only use rbd workload. > > > > Am Di., 28. Feb. 2023 um 15:00 Uhr schrieb Josh Baergen > : >> >>

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Josh Baergen
Hi Boris, Which version did you upgrade from and to, specifically? And what workload are you running (RBD, etc.)? Josh On Tue, Feb 28, 2023 at 6:51 AM Boris Behrens wrote: > > Hi, > today I did the first update from octopus to pacific, and it looks like the > avg apply latency went up from 1ms

[ceph-users] Re: increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior

2023-02-21 Thread Josh Baergen
Hi Boris, This sounds a bit like https://tracker.ceph.com/issues/53729. https://tracker.ceph.com/issues/53729#note-65 might help you diagnose whether this is the case. Josh On Tue, Feb 21, 2023 at 9:29 AM Boris Behrens wrote: > > Hi, > today I wanted to increase the PGs from 2k -> 4k and

[ceph-users] Re: Adding Labels Section to Perf Counters Output

2023-02-01 Thread Josh Baergen
Do the counters need to be moved under a separate key? That would break anything today that currently tries to parse them. We have quite a bit of internal monitoring that relies on "perf dump" output, but it's mostly not output that I would expect to gain labels in general (e.g. bluestore stats).
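For example, the kind of consumer that would break if counters moved under a new key (jq, the admin-socket access, and osd.0 are assumptions about the monitoring setup):

    ceph daemon osd.0 perf dump | jq 'keys'        # top-level counter groups as emitted today
    ceph daemon osd.0 perf dump | jq '.bluestore'  # one group such monitoring parses directly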

[ceph-users] Re: All pgs unknown

2023-01-29 Thread Josh Baergen
This often indicates that something is up with your mgr process. Based on ceph status, it looks like both the mgr and mon had recently restarted. Is that expected? Josh On Sun, Jan 29, 2023 at 3:36 AM Daniel Brunner wrote: > > Hi, > > my ceph cluster started to show HEALTH_WARN, there are no
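If the mgr is wedged rather than deliberately restarted, a hedged first step is to fail over to a standby and watch whether PG states repopulate:

    ceph mgr fail        # on older releases: ceph mgr fail <active mgr name>
    ceph -s              # the unknown PGs should clear once a mgr is active and receiving reports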

[ceph-users] Re: Very slow snaptrim operations blocking client I/O

2023-01-27 Thread Josh Baergen
This might be due to tombstone accumulation in rocksdb. You can try to issue a compact to all of your OSDs and see if that helps (ceph tell osd.XXX compact). I usually prefer to do this one host at a time just in case it causes issues, though on a reasonably fast RBD cluster you can often get away
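One hedged way to do that host-by-host, assuming your CRUSH host bucket names match your hostnames (HOST is a placeholder):

    HOST=storage-node-01
    for id in $(ceph osd ls-tree "$HOST"); do
        ceph tell osd."$id" compact
    done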

[ceph-users] Re: [SPAM] Ceph upgrade advice - Luminous to Pacific with OS upgrade

2022-12-06 Thread Josh Baergen
> - you will need to move those filestore OSDs to Bluestore before hitting Pacific, might even be part of the Nautilus upgrade. This takes some time if I remember correctly.
I don't think this is necessary. It _is_ necessary to convert all leveldb to rocksdb before upgrading to Pacific, on

[ceph-users] Re: Implications of pglog_hardlimit

2022-11-29 Thread Josh Baergen
It's also possible you're running into large pglog entries - any chance you're running RGW and there's an s3:CopyObject workload hitting an object that was uploaded with MPU? https://tracker.ceph.com/issues/56707 If that's the case, you can inject a much smaller value for osd_min_pg_log_entries
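A sketch of that mitigation - the 500 is purely illustrative, and since injected values don't survive a restart you'd likely also want it in centralized config:

    ceph tell 'osd.*' injectargs '--osd_min_pg_log_entries=500'
    ceph config set osd osd_min_pg_log_entries 500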

[ceph-users] Re: Question about quorum

2022-11-03 Thread Josh Baergen
Hi Murilo, This is briefly referred to by https://docs.ceph.com/en/octopus/rados/deployment/ceph-deploy-mon/, but in order to avoid split brain issues it's common that distributed consensus algorithms require a strict majority in order to maintain quorum. This is why production deployments of
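Concretely: quorum needs a strict majority, i.e. floor(N/2)+1 monitors, so 3 mons tolerate 1 failure and 5 tolerate 2, while an even count buys nothing extra (4 mons still only tolerate 1 down) - which is why odd monitor counts are recommended.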

[ceph-users] Re: 1 pg stale, 1 pg undersized

2022-10-27 Thread Josh Baergen
Hi Alexander, I'd be suspicious that something is up with pool 25. Which pool is that? ('ceph osd pool ls detail') Knowing the pool and the CRUSH rule it's using is a good place to start. Then that can be compared to your CRUSH map (e.g. 'ceph osd tree') to see why Ceph is struggling to map that
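A hedged version of that investigation (25.0 and <rule_name> are placeholders):

    ceph osd pool ls detail | grep '^pool 25 '   # pool name, size/min_size and crush_rule
    ceph osd crush rule dump <rule_name>         # what the rule actually requires
    ceph pg map 25.0                             # where Ceph is trying to place one of its PGs
    ceph osd tree                                # compare against the failure domains actually available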

[ceph-users] Re: Advice on balancing data across OSDs

2022-10-24 Thread Josh Baergen
: > > Hi Josh, > > On Mon, Oct 24, 2022 at 07:20:46AM -0600, Josh Baergen wrote: > > > I've included the osd df output below, along with pool and crush rules. > > > > Looking at these, the balancer module should be taking care of this > > imbalance automaticall

[ceph-users] Re: Advice on balancing data across OSDs

2022-10-24 Thread Josh Baergen
Hi Tim, > I've included the osd df output below, along with pool and crush rules. Looking at these, the balancer module should be taking care of this imbalance automatically. What does "ceph balancer status" say? Josh ___ ceph-users mailing list --

[ceph-users] Re: Iinfinite backfill loop + number of pgp groups stuck at wrong value

2022-10-07 Thread Josh Baergen
As of Nautilus+, when you set pg_num, it actually internally sets pg(p)_num_target, and then slowly increases (or decreases, if you're merging) pg_num and then pgp_num until it reaches the target. The amount of backfill scheduled into the system is controlled by target_max_misplaced_ratio. Josh
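You can watch this converge and tune how aggressively it proceeds; a hedged example (the pool output fields vary a bit by release, and 0.05 is simply the documented default):

    ceph osd pool ls detail | grep -E 'pg_num|pgp_num'    # current vs target pg(p)_num while a change is in flight
    ceph config set mgr target_max_misplaced_ratio 0.05   # cap on the fraction of misplaced PGs at any one time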

[ceph-users] Re: Upgrade from Octopus to Quiny fails on third ceph-mon

2022-09-28 Thread Josh Baergen
FWIW, this is what the Quincy release notes say: LevelDB support has been removed. WITH_LEVELDB is no longer a supported build option. Users should migrate their monitors and OSDs to RocksDB before upgrading to Quincy. Josh On Wed, Sep 28, 2022 at 4:20 AM Eugen Block wrote: > > Hi, > > there

[ceph-users] Re: Balancer Distribution Help

2022-09-23 Thread Josh Baergen
Hey Wyll,
> $ pgremapper cancel-backfill --yes # to stop all pending operations
> $ placementoptimizer.py balance --max-pg-moves 100 | tee upmap-moves
> $ bash upmap-moves
>
> Repeat the above 3 steps until balance is achieved, then re-enable the balancer and unset the "no" flags set

[ceph-users] Re: Question about recovery priority

2022-09-23 Thread Josh Baergen
Hi Fulvio, > leads to a much shorter and less detailed page, and I assumed Nautilus > was far behind Quincy in managing this... The only major change I'm aware of between Nautilus and Quincy is that in Quincy the mClock scheduler is able to automatically tune up/down backfill parameters to

[ceph-users] Re: Question about recovery priority

2022-09-22 Thread Josh Baergen
Hi Fulvio, https://docs.ceph.com/en/quincy/dev/osd_internals/backfill_reservation/ describes the prioritization and reservation mechanism used for recovery and backfill. AIUI, unless a PG is below min_size, all backfills for a given pool will be at the same priority. force-recovery will modify
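For completeness, the knobs referenced above look like this (PG ids are placeholders):

    ceph pg force-recovery 2.1a 2.2f     # raise recovery priority for specific PGs
    ceph pg force-backfill 2.1a          # likewise for backfill
    ceph pg cancel-force-recovery 2.1a   # undo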

[ceph-users] Re: force-create-pg not working

2022-09-20 Thread Josh Baergen
Hi Jesper, Given that the PG is marked recovery_unfound, I think you need to follow https://docs.ceph.com/en/quincy/rados/troubleshooting/troubleshooting-pg/#unfound-objects. Josh On Tue, Sep 20, 2022 at 12:56 AM Jesper Lykkegaard Karlsen wrote: > > Dear all, > > System: latest Octopus, 8+3
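The relevant commands from that procedure, for reference - the PG id is a placeholder, and mark_unfound_lost is destructive, so read up on revert vs delete first:

    ceph pg 2.5 list_unfound               # which objects are unfound, and why
    ceph pg 2.5 query                      # which OSDs were probed or might still hold the data
    ceph pg 2.5 mark_unfound_lost revert   # last resort: revert (replicated pools) or delete the unfound objects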

[ceph-users] Re: config db host filter issue

2021-10-20 Thread Josh Baergen
Hey Richard, On Tue, Oct 19, 2021 at 8:37 PM Richard Bade wrote: > user@cstor01 DEV:~$ sudo ceph config set osd/host:cstor01 osd_max_backfills 2 > user@cstor01 DEV:~$ sudo ceph config get osd.0 osd_max_backfills > 2 > ... > Are others able to reproduce? Yes, we've found the same thing on

[ceph-users] Re: Tool to cancel pending backfills

2021-10-01 Thread Josh Baergen
Hi Peter,
> When I check for circles I found that running the upmap balancer alone never seems to create any kind of circle in the graph
By a circle, do you mean something like this?
pg 1.a: 1->2 (upmap to put a chunk on 2 instead of 1)
pg 1.b: 2->3
pg 1.c: 3->1
If so, then it's not

[ceph-users] Re: Tool to cancel pending backfills

2021-09-27 Thread Josh Baergen
> I have a question regarding the last step. It seems to me that the ceph > balancer is not able to remove the upmaps > created by pgremapper, but instead creates new upmaps to balance the pgs > among osds. The balancer will prefer to remove existing upmaps[1], but it's not guaranteed. The

[ceph-users] Re: Is it normal Ceph reports "Degraded data redundancy" in normal use?

2021-09-16 Thread Josh Baergen
> I assume it's the balancer module. If you write lots of data quickly > into the cluster the distribution can vary and the balancer will try > to even out the placement. The balancer won't cause degradation, only misplaced objects. > Degraded data redundancy: 260/11856050 objects degraded >

[ceph-users] Re: cephfs small files expansion

2021-09-14 Thread Josh Baergen
Hey Seb, > I have a test cluster on which I created pools rbd and cephfs (octopus), when > I copy a directory containing many small files on a pool rbd the USED part of > the ceph df command seems normal on the other hand on cephfs the USED part > seems really abnormal, I tried to change the

[ceph-users] Re: Mon-map inconsistency?

2021-09-06 Thread Josh Baergen
Hi Melanie, On Mon, Sep 6, 2021 at 10:06 AM Desaive, Melanie wrote: > When I execute "ceph mon_status --format json-pretty" from our > ceph-management VM, the correct mon nodes are returned. > > But when I execute "ceph daemon osd.xx config show | grep mon_host" on the > respective storage

[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-09-01 Thread Josh Baergen
y or need to > triger somehow. > > 1 Eyl 2021 Çar 17:07 tarihinde Josh Baergen şunu > yazdı: >> >> Googling for that balancer error message, I came across >> https://tracker.ceph.com/issues/22814, which was closed/wont-fix, and >> some threads that claimed that

[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-09-01 Thread Josh Baergen
1
> type replicated
> step take default class ssd
> step chooseleaf firstn 0 type host
> step emit
> }
>
> pool 54 'rgw.buckets.index' replicated size 3 min_size 1 crush_rule 1 object_hash rjenkins pg_num 128 pgp_num 128 autoscale_mode warn last_change 31607 lf

[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-08-31 Thread Josh Baergen
GiB 4.1 GiB 401 GiB > 55.20 0.91 103 up osd.201 > 217 ssd0.87329 1.0 894 GiB 261 GiB 83 GiB 176 GiB 2.3 GiB 634 GiB > 29.15 0.48 89 up osd.217 > > > When I check the balancer status I saw that: ""optimize_result": "So

[ceph-users] Re: After adding New Osd's, Pool Max Avail did not changed.

2021-08-31 Thread Josh Baergen
Hi there, Could you post the output of "ceph osd df tree"? I would highly suspect that this is a result of imbalance, and that's the easiest way to see if that's the case. It would also confirm that the new disks have taken on PGs. Josh On Tue, Aug 31, 2021 at 10:50 AM mhnx wrote: > > I'm

[ceph-users] Re: Is there any way to obtain the maximum number of node failure in ceph without data loss?

2021-07-26 Thread Josh Baergen
23hdd 0.89893 osd.23 up > > 1.0 1.0 > > -45 2.69679 host jceph-n09 > > 24hdd 0.89893 osd.24 up > > 1.0 1.0 > > 25hdd 0.89893 osd.25

[ceph-users] Re: Is there any way to obtain the maximum number of node failure in ceph without data loss?

2021-07-23 Thread Josh Baergen
Hi Jerry, In general, your CRUSH rules should define the behaviour you're looking for. Based on what you've stated about your configuration, after failing a single node or an OSD on a single node, then you should still be able to tolerate two more failures in the system without losing data (or

[ceph-users] Re: NVME hosts added to the clusters and it made old ssd hosts flapping osds

2021-07-08 Thread Josh Baergen
Have you confirmed that all OSD hosts can see each other (on both the front and back networks if you use split networks)? If there's not full connectivity, then that can lead to the issues you see here. Checking the logs on the mons can be helpful, as it will usually indicate why a given OSD is

[ceph-users] Re: ceph df (octopus) shows USED is 7 times higher than STORED in erasure coded pool

2021-07-06 Thread Josh Baergen
Oh, I just read your message again, and I see that I didn't answer your question. :D I admit I don't know how MAX AVAIL is calculated, and whether it takes things like imbalance into account (it might). Josh On Tue, Jul 6, 2021 at 7:41 AM Josh Baergen wrote: > Hey Wladimir, > > Th

[ceph-users] Re: ceph df (octopus) shows USED is 7 times higher than STORED in erasure coded pool

2021-07-06 Thread Josh Baergen
(RAW STORAGE/RAW USED)-(SUM(POOLS/USED)) = 19-17.5 = 1.5 TiB ?
>
> As it does not seem I would get any more hosts for this setup, I am seriously thinking of bringing down this Ceph and setting up instead a Btrfs storing qcow2 images served over iSC

[ceph-users] Re: ceph df (octopus) shows USED is 7 times higher than STORED in erasure coded pool

2021-07-05 Thread Josh Baergen
led RBD writes to EC data-pool ? > > Josh Baergen wrote: > > Hey Arkadiy, > > > > If the OSDs are on HDDs and were created with the default > > bluestore_min_alloc_size_hdd, which is still 64KiB in Octopus, then in > > effect data will be allocated from the pool in

[ceph-users] Re: ceph df (octopus) shows USED is 7 times higher than STORED in erasure coded pool

2021-06-29 Thread Josh Baergen
Hey Arkadiy, If the OSDs are on HDDs and were created with the default bluestore_min_alloc_size_hdd, which is still 64KiB in Octopus, then in effect data will be allocated from the pool in 640KiB chunks (64KiB * (k+m)). 5.36M objects taking up 501GiB is an average object size of 98KiB which
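Working that through: each ~98KiB object is split into 8 data chunks of ~12.25KiB, and every chunk - including the 2 parity chunks - is rounded up to the 64KiB allocation unit, so the object consumes 10 x 64KiB = 640KiB on disk. 5.36M objects x 640KiB is roughly 3.3TiB USED against ~501GiB STORED, which is about the 7x amplification in the subject line.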

[ceph-users] pgremapper released

2021-05-05 Thread Josh Baergen
Hello all, I just wanted to let you know that DigitalOcean has open-sourced a tool we've developed called pgremapper. Originally inspired by CERN's upmap exception table manipulation scripts, pgremapper is a CLI written in Go which exposes a number of upmap-based algorithms for backfill-related

[ceph-users] Re: EC Backfill Observations

2021-04-21 Thread Josh Baergen
Hey Josh, Thanks for the info! > With respect to reservations, it seems like an oversight that > we don't reserve other shards for backfilling. We reserve all > shards for recovery [0]. Very interesting that there is a reservation difference between backfill and recovery. > On the other hand,

[ceph-users] EC Backfill Observations

2021-04-19 Thread Josh Baergen
Hey all, I wanted to confirm my understanding of some of the mechanics of backfill in EC pools. I've yet to find a document that outlines this in detail; if there is one, please send it my way. :) Some of what I write below is likely in the "well, duh" category, but I tended towards completeness.

[ceph-users] Re: ceph Nautilus lost two disk over night everything hangs

2021-03-30 Thread Josh Baergen
I thought that recovery below min_size for EC pools wasn't expected to work until Octopus. From the Octopus release notes: "Ceph will allow recovery below min_size for Erasure coded pools, wherever possible." Josh On Tue, Mar 30, 2021 at 6:53 AM Frank Schilder wrote: > Dear Rainer, > > hmm,

[ceph-users] Re: Do I need to update ceph.conf and restart each OSD after adding more MONs?

2021-03-29 Thread Josh Baergen
As was mentioned in this thread, all of the mon clients (OSDs included) learn about other mons through monmaps, which are distributed when mon membership and election changes. Thus, your OSDs should already know about the new mons. mon_host indicates the list of mons that mon clients should try

[ceph-users] Re: memory consumption by osd

2021-03-29 Thread Josh Baergen
Linux will automatically make use of all available memory for the buffer cache, freeing buffers when it needs more memory for other things. This is why MemAvailable is more useful than MemFree; the former indicates how much memory could be used between Free, buffer cache, and anything else that

[ceph-users] Re: Where has my capacity gone?

2021-01-28 Thread Josh Baergen
Hi George, > May I ask if enabling pool compression helps for the future space > amplification? If the amplification is indeed due to min_alloc_size, then I don't think that compression will help. My understanding is that compression is applied post-EC (and thus probably won't even activate due

[ceph-users] Re: Where has my capacity gone?

2021-01-27 Thread Josh Baergen
On Wed, Jan 27, 2021 at 12:24 AM George Yil wrote: > May I ask if it can be dynamically changed and any disadvantages should be > expected? Unless there's some magic I'm unaware of, there is no way to dynamically change this. Each OSD must be recreated with the new min_alloc_size setting. In

[ceph-users] Re: Where has my capacity gone?

2021-01-26 Thread Josh Baergen
> I created radosgw pools. secondaryzone.rgw.buckets.data pool is configured as EC 8+2 (jerasure). Did you override the default bluestore_min_alloc_size_hdd (64k in that version IIRC) when creating your hdd OSDs? If not, all of the small objects produced by that EC configuration will be leading