[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2023-02-06 Thread Richard Bade
ous to manage these. > > > > > > For example, would the following work-flow apply the correct settings > > > *permanently* across restarts: > > > > > > 1) Prepare OSD on fresh HDD with ceph-volume lvm batch --prepare ... > > > 2) Assign dm_cac

[ceph-users] Re: Nautilus to Octopus when RGW already on Octopus

2023-02-06 Thread Richard Bade
Hi, We're actually on a very similar setup to you, with 18.04 and Nautilus, and thinking about the 20.04 upgrade process. As for your RGW, I think I would not consider the downgrade. I believe the order is about avoiding issues with newer RGW connecting to older mons and osds. Since you're already in

[ceph-users] Re: Undo "radosgw-admin bi purge"

2023-02-21 Thread Richard Bade
Hi Robert, A colleague and I ran into this a few weeks ago. The way we managed to get access back to delete the bucket properly (using radosgw-admin bucket rm) was to reshard the bucket. This created a new bucket index and therefore it was then possible to delete it. If you are looking to get acces
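A minimal sketch of that recovery path, assuming a hypothetical bucket name and shard count (the reshard writes a fresh index so the removal can then proceed):
  # rebuild a bucket index by resharding (bucket name and shard count are placeholders)
  radosgw-admin bucket reshard --bucket=mybucket --num-shards=11
  # with an index present again the bucket can be removed
  radosgw-admin bucket rm --bucket=mybucket --purge-objects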

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Richard Bade
Hi, I found the documentation for metadata get to be unhelpful for what syntax to use. I eventually found that it's this: radosgw-admin metadata get bucket:{bucket_name} or radosgw-admin metadata get bucket.instance:{bucket_name}:{instance_id} Hopefully that helps you or someone else struggling wi
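For reference, the two forms described above, keeping the same placeholder style used in the message (the instance id is reported in the bucket metadata output):
  # plain bucket metadata
  radosgw-admin metadata get bucket:{bucket_name}
  # per-instance metadata, using the bucket_id from the previous output
  radosgw-admin metadata get bucket.instance:{bucket_name}:{instance_id}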

[ceph-users] Can I delete rgw log entries?

2023-04-16 Thread Richard Bade
Hi Everyone, I've been having trouble finding an answer to this question. Basically I'm wanting to know if stuff in the .log pool is actively used for anything or if it's just logs that can be deleted. In particular I was wondering about sync logs. In my particular situation I have had some tests o

[ceph-users] Re: Can I delete rgw log entries?

2023-04-20 Thread Richard Bade
Ok, cool. Thanks for clarifying that Daniel and Casey. I'll clean up my sync logs now but leave the rest alone. Rich On Fri, 21 Apr 2023, 05:46 Daniel Gryniewicz, wrote: > On 4/20/23 10:38, Casey Bodley wrote: > > On Sun, Apr 16, 2023 at 11:47 PM Richard Bade wrote: > &g

[ceph-users] Re: Rados gateway data-pool replacement.

2023-04-25 Thread Richard Bade
Hi Gaël, I'm actually embarking on a similar project to migrate an EC pool from k=2,m=1 to k=4,m=2 using rgw multisite sync. I just thought I'd check, before you do a lot of work for nothing, that when you say failure domain you mean the crush failure domain, not k and m? If it is failure domain

[ceph-users] Re: [RGW] what is log_meta and log_data config in a multisite config?

2023-06-07 Thread Richard Bade
Hi Gilles, I'm not 100% sure but I believe this is relating to the logs kept for doing incremental sync. When these are false then changes are not tracked and sync doesn't happen. My reference is this Red Hat documentation on configuring zones without replication. https://access.redhat.com/document

[ceph-users] Re: [rgw multisite] Perpetual behind

2023-06-18 Thread Richard Bade
Hi Yixin, One place that I start with trying to figure this out is the sync error logs. You may have already looked here: sudo radosgw-admin sync error list --rgw-zone={zone_name} If there's a lot in there you can trim it to a specific date so you can see if they're still occurring sudo radosgw-adm
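A sketch of the commands being described, with a placeholder zone name and date (the date flags may differ between releases):
  # list the sync errors recorded for one zone
  sudo radosgw-admin sync error list --rgw-zone={zone_name}
  # trim entries up to a given date so it's easier to see whether new errors keep appearing
  sudo radosgw-admin sync error trim --end-date="2023-06-01 00:00:00" --rgw-zone={zone_name}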

[ceph-users] radosgw-admin sync error trim seems to do nothing

2023-08-20 Thread Richard Bade
Hi Matthew, At least for Nautilus (14.2.22) I have discovered through trial and error that you need to specify a beginning or end date. Something like this: radosgw-admin sync error trim --end-date="2023-08-20 23:00:00" --rgw-zone={your_zone_name} I specify the zone as there's an error list for eac

[ceph-users] Re: Is it possible (or meaningful) to revive old OSDs?

2023-09-06 Thread Richard Bade
Yes, I agree with Anthony. If your cluster is healthy and you don't *need* to bring them back in it's going to be less work and time to just deploy them as new. I usually set norebalance, purge the osds in ceph, remove the vg from the disks and re-deploy. Then unset norebalance at the end once eve
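A rough sketch of that procedure, with a placeholder osd id and device (ceph-volume's zap is one way of removing the old vg):
  ceph osd set norebalance
  # remove the old osd from the cluster (id is a placeholder)
  ceph osd purge 42 --yes-i-really-mean-it
  # wipe the old lvm vg/lv from the disk before redeploying
  ceph-volume lvm zap /dev/sdX --destroy
  # redeploy the osd (e.g. ceph-volume lvm batch/create), then resume rebalancing
  ceph osd unset norebalance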

[ceph-users] Re: Manual resharding with multisite

2023-10-08 Thread Richard Bade
Hi Yixin, I am interested in the answers to your questions also, but I think I can provide some useful information for you. We have a multisite setup also, where we sometimes need to reshard as the buckets have grown. However we have bucket sync turned off for these buckets as they only reside on one

[ceph-users] Re: Nautilus: Decommission an OSD Node

2023-11-05 Thread Richard Bade
Hi Dave, It's been a few days and I haven't seen any follow up in the list so I'm wondering if the issue is that there was a typo in your osd list? It appears that you have 16 included again in the destination instead of 26? "24,25,16,27,28" I'm not familiar with the pgremapper script so I may be m

[ceph-users] pg's stuck activating on osd create

2024-06-26 Thread Richard Bade
Hi Everyone, I had an issue last night when I was bringing online some osds that I was rebuilding. When the osds were created and came online, 15 pgs got stuck in activating. The first osd (osd.112) seemed to come online ok, but the second one (osd.113) triggered the issue. All the pgs in activating inclu

[ceph-users] Re: Large omap in index pool even if properly sharded and not "OVER"

2024-07-10 Thread Richard Bade
Hi Casey, Thanks for that info on the bilog. I'm in a similar situation with large omap objects and we have also had to reshard buckets on multisite losing the index on the secondary. We also now have a lot of buckets with sync disable so I wanted to check that it's always safe to trim the bilog on
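For illustration, the sort of trim being asked about, with a placeholder bucket name (only sensible where the bucket is not being replicated or its bilog is otherwise unneeded):
  # inspect the bucket index log first
  radosgw-admin bilog list --bucket={bucket_name}
  # then trim it
  radosgw-admin bilog trim --bucket={bucket_name}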

[ceph-users] Re: PG inconsistent with empty inconsistent objects

2021-01-26 Thread Richard Bade
Hi Everyone, I have also seen this: the pg is inconsistent but the output is empty when you do list-inconsistent-obj. $ sudo ceph health detail HEALTH_ERR 1 scrub errors; Possible data damage: 1 pg inconsistent; 1 pgs not deep-scrubbed in time OSD_SCRUB_ERRORS 1 scrub errors PG_DAMAGED Possible data damage: 1 pg inconsist
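For context, the commands involved, using the pg id quoted later in the thread; the empty output from list-inconsistent-obj is the symptom being discussed:
  sudo ceph health detail
  rados list-inconsistent-obj 17.7ff --format=json-pretty
  # the follow-ups suggested in the thread
  ceph pg deep-scrub 17.7ff
  ceph pg repair 17.7ff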

[ceph-users] Re: PG inconsistent with empty inconsistent objects

2021-01-26 Thread Richard Bade
On Wed, 27 Jan 2021, 12:57 Joe Comeau, wrote: > just issue the commands > > scrub pg deep-scrub 17.1cs > this will deep scrub this pg > > ceph pg repair 17.7ff > repairs the pg > > > > > > >>> Richard Bade 1/26/2021 3:40 PM >>> >

[ceph-users] Re: PG inconsistent with empty inconsistent objects

2021-01-27 Thread Richard Bade
up? Thanks again everyone. Rich On Thu, 28 Jan 2021 at 03:59, Dan van der Ster wrote: > > Usually the ceph.log prints the reason for the inconsistency when it > is first detected by scrubbing. > > -- dan > > On Wed, Jan 27, 2021 at 12:41 AM Richard Bade wrote: > >

[ceph-users] Re: best use of NVMe drives

2021-02-16 Thread Richard Bade
Hi Magnus, I agree with your last suggestion, putting the OSD DB on NVMe would be a good idea. I'm assuming you are referring to the Bluestore DB rather than filestore journal since you mentioned your cluster is Nautilus. We have a cephfs cluster set up in this way and it performs well. We don't ha
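A minimal sketch of an osd created that way, with hypothetical device names:
  # data on the HDD, BlueStore DB (and WAL) on an NVMe partition or LV
  ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1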

[ceph-users] Re: Question about migrating from iSCSI to RBD

2021-03-16 Thread Richard Bade
Hi Justin, I did some testing with iscsi a year or so ago. It was just using standard rbd images in the backend so yes I think your theory of stopping iscsi to release the locks and then providing access to the rbd image would work. Rich On Wed, 17 Mar 2021 at 09:53, Justin Goetz wrote: > > Hell

[ceph-users] Re: any experience on using Bcache on top of HDD OSD

2021-04-19 Thread Richard Bade
Hi, I also have used bcache extensively on filestore with journals on SSD for at least 5 years. This has worked very well in all versions up to luminous. The iops improvement was definitely beneficial for vm disk images in rbd. I am also using it under bluestore with db/wal on nvme on both Luminous

[ceph-users] Re: OSD bootstrap time

2021-06-08 Thread Richard Bade
Hi Jan-Philipp, I've noticed this a couple of times on Nautilus after doing some large backfill operations. It seems the osd maps don't get cleared properly after the cluster returns to HEALTH_OK, and they build up on the mons. I do a "du" on the mon folder, e.g. du -shx /var/lib/ceph/mon/ and this shows seve

[ceph-users] Re: CEPH logs to Graylog

2021-07-04 Thread Richard Bade
Hi Milosz, I don't have any experience with the settings you're using so can't help there, but I do log to graylog via syslog. This is what I do, in case it's helpful as a workaround. In the ceph.conf global section or the config db: log to syslog = true, err to syslog = true. In rsyslog.conf add preserve
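A sketch of that workaround, with a placeholder graylog host and port; it assumes a UDP syslog input on the graylog side (use @@ for TCP):
  # ceph.conf [global] (or the config db)
  log to syslog = true
  err to syslog = true
  # /etc/rsyslog.d/graylog.conf -- forward everything to graylog
  *.* @graylog.example.com:514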

[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim

2021-08-08 Thread Richard Bade
Hi Daniel, I had a similar issue last week after upgrading my test cluster from 14.2.13 to 14.2.22, which included this fix for Global ID reclaim in .20. My issue was a rados gw that I was re-deploying on the latest version. The problem seemed to be related to cephx authentication. It kept display

[ceph-users] Re: Edit crush rule

2021-09-07 Thread Richard Bade
Hi Budai, I agree with Nathan, just switch the crush rule. I've recently done this on one of our clusters. Create a new crush rule the same as your old one except with different failure domain. Then use: ceph osd pool set {pool_name} crush_rule {new_rule_name} Very easy. This may kick off some back
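A sketch of the rule switch, with placeholder names (root, failure domain and device class depend on the cluster):
  # new rule, the same as the old one apart from the failure domain
  ceph osd crush rule create-replicated myrule_host default host hdd
  # point the pool at it; this may start backfill
  ceph osd pool set {pool_name} crush_rule myrule_host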

[ceph-users] Re: Balancer vs. Autoscaler

2021-09-22 Thread Richard Bade
If you look at the current pg_num in that pool ls detail command that Dan mentioned you can set the pool pg_num to what that value currently is, which will effectively pause the pg changes. I did this recently when decreasing the number of pg's in a pool, which took several weeks to complete. This
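Illustrating the pause trick with a placeholder pool name and value:
  # note the current pg_num the pool has actually reached
  ceph osd pool ls detail
  # setting pg_num to that current value effectively pauses the in-flight change
  ceph osd pool set {pool_name} pg_num 512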

[ceph-users] config db host filter issue

2021-10-19 Thread Richard Bade
Hi Everyone, I think this might be a bug so I'm wondering if anyone else has seen this. The issue is that config db filters for host don't seem to work. I was able to reproduce this on both prod and dev clusters that I tried it on with Nautilus 14.2.22. The osd I'm testing (osd.0) is under this tr
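For anyone wanting to reproduce the test, a sketch with a hypothetical hostname and option:
  # set an option with a host mask
  ceph config set osd/host:storage01 osd_max_backfills 2
  # check what an osd on that host actually picks up
  ceph config get osd.0 osd_max_backfills
  ceph config show osd.0 | grep osd_max_backfills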

[ceph-users] Re: Centralized config mask not being applied to host

2021-11-25 Thread Richard Bade
Hi Mark, I have noticed exactly the same thing on Nautilus where host didn't work but chassis did work. I posted to this mailing list a few weeks ago. It's very strange that the host filter is not working. I also could not find any errors logged for this, so it looks like it's just ignoring the set

[ceph-users] Ceph Bluestore tweaks for Bcache

2022-04-04 Thread Richard Bade
Hi Everyone, I just wanted to share a discovery I made about running bluestore on top of Bcache in case anyone else is doing this or considering it. We've run Bcache under Filestore for a long time with good results but recently rebuilt all the osds on bluestore. This caused some degradation in per

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-05 Thread Richard Bade
Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > > From: Richard Bade > Sent: 05 April 2022 00:07:34 > To: Ceph Users > Subject: [ceph-users] Ceph Bluestore tweaks for Bcache > > Hi Everyone, > I just wanted to share a discovery I made about

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-05 Thread Richard Bade
ce HDD-related settings for a BlueStore > > > Thanks, > > Igor > > On 4/5/2022 1:07 AM, Richard Bade wrote: > > Hi Everyone, > > I just wanted to share a discovery I made about running bluestore on > > top of Bcache in case anyone else is doing this or consideri

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-05 Thread Richard Bade
h osd metadata {osd_id} > > On 05.04.2022, 11:49, "Richard Bade" wrote: > > Hi Frank, yes I changed the device class to HDD but there seems to be some > smarts in the background that apply the different settings that are not > based on the class but some other int

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-05 Thread Richard Bade
Just for completeness for anyone that is following this thread. Igor added that setting in Octopus, so unfortunately I am unable to use it as I am still on Nautilus. Thanks, Rich On Wed, 6 Apr 2022 at 10:01, Richard Bade wrote: > > Thanks Igor for the tip. I'll see if I can use thi

[ceph-users] Re: [Warning Possible spam] Re: Ceph Bluestore tweaks for Bcache

2022-04-07 Thread Richard Bade
ce being detected as non-rotational after step 2. Is this > assumption correct? > > Thanks and best regards, > ===== > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > > From: Richard Bade > Sent: 06 April 2022 00:43

[ceph-users] Re: [Warning Possible spam] Re: [Warning Possible spam] Re: Ceph Bluestore tweaks for Bcache

2022-04-07 Thread Richard Bade
Hi Frank, Yes, I think you have got to the crux of the issue. > - some_config_value_hdd is used for "rotational=0" devices and > - osd/class:hdd values are used for "device_class=hdd" OSDs, The class is something that is user defined and you can actually define your own class names. By default the
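To make the distinction concrete, a sketch with a placeholder osd id and value: the crush device class is operator-assigned (and is what osd/class: masks match), while the _hdd/_ssd option variants follow the rotational flag the osd detected:
  # operator-assigned class; it must be removed before it can be changed
  ceph osd crush rm-device-class osd.0
  ceph osd crush set-device-class hdd osd.0
  # config-db values targeted at that class
  ceph config set osd/class:hdd osd_memory_target 4294967296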

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-10 Thread Richard Bade
ly used). > > Best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > > From: Richard Bade > Sent: 08 April 2022 00:08 > To: Frank Schilder > Cc: Igor Fedotov; Ceph Users > Subject

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-10 Thread Richard Bade
bluestore_prefer_deferred_size_hdd value of 32768 works well for me but there may be other values that are better. Rich On Mon, 11 Apr 2022 at 09:23, Richard Bade wrote: > > Hi Frank, > Thanks for your insight on this. I had done a bunch of testing on this > over a year ago and found improvement

[ceph-users] Re: Ceph Bluestore tweaks for Bcache

2022-04-10 Thread Richard Bade
Ok, further testing and thinking... Frank, you mentioned about creating the osd without cache so that it'd be picked up as HDD not SSD. Also back in this thread Aleksandr mentioned that the parameter rotational in sysfs is used for this. So I checked what this parameter is being set to with bcache
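The checks being described, with a placeholder device and osd id:
  # what the kernel reports for the bcache device
  cat /sys/block/bcache0/queue/rotational
  # what the osd recorded for its device at startup
  ceph osd metadata {osd_id} | grep -i rotational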

[ceph-users] Re: replaced osd's get systemd errors

2022-04-21 Thread Richard Bade
Yeah, I've seen this happen when replacing osds. Like Eugen said, there's some services that get created for mounting the volumes. You can disable them like this: systemctl disable ceph-volume@lvm-{osdid}-{fsid}.service list the contents of /etc/systemd/system/multi-user.target.wants/ceph-volume@l
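A sketch of the cleanup, with a placeholder osd id and fsid:
  # list the generated units
  ls /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume
  # disable the one left behind by the replaced osd
  systemctl disable ceph-volume@lvm-12-aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.service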

[ceph-users] Re: Ceph OSD purge doesn't work while rebalancing

2022-04-26 Thread Richard Bade
I agree that it would be better if it was less sensitive to unrelated backfill. I've noticed this recently too, especially if you're purging multiple osds (like a whole host). The first one succeeds but the next one fails even though I have no rebalance set and the osd was already out. I guess if m

[ceph-users] Re: Unbalanced Cluster

2022-05-04 Thread Richard Bade
Hi David, I think that part of the problem with unbalanced osds is that your EC rule k=7,m=2 gives 9 total chunks and you have 9 total servers. This is essentially tying Ceph's hands as it has no choice where to put the pg's. Assuming a failure domain of host, each EC shard needs to be on a diff

[ceph-users] Re: Unbalanced Cluster

2022-05-05 Thread Richard Bade
Hi David, Something else you could try with that other pool, if it contains little or no data, is to reduce the PG number. This does cause some backfill operations as it does a pg merge but this doesn't take long if the pg is virtually empty. The autoscaler has a mode where it can make recommendati

[ceph-users] Re: DM-Cache for spinning OSDs

2022-05-17 Thread Richard Bade
Hey Felix, I run bcache pretty much in the way you're describing, but we have smaller spinning disks (4TB). We mostly share a 1TB NVMe between 6x osd's with 33GB db/wal per osd and the rest shared bcache cache. The performance is definitely improved over not running cache. We run this mostly for rb

[ceph-users] osd_disk_thread_ioprio_class deprecated?

2022-05-17 Thread Richard Bade
Hi Everyone, I've been going through our config trying to remove settings that are no longer relevant or which are now the default setting. The osd_disk_thread_ioprio_class and osd_disk_thread_ioprio_priority settings come up a few times in the mailing list but no longer appear in the ceph documentat

[ceph-users] Re: osd_disk_thread_ioprio_class deprecated?

2022-05-18 Thread Richard Bade
> See this PR > https://github.com/ceph/ceph/pull/19973 > Doing "git log -Sosd_disk_thread_ioprio_class -u > src/common/options.cc" in the Ceph source indicates that they were > removed in commit 3a331c8be28f59e2b9d952e5b5e864256429d9d5 which first > appeared in Mimic. Thanks Matthew and Josh for

[ceph-users] Re: Request for Info: What has been your experience with bluestore_compression_mode?

2022-08-18 Thread Richard Bade
Hi Laura, We have used pool compression in the past and found it to work well. We had it on a 4/2 EC pool and found data ended up near 1:1 pool:raw. We were storing backup data in this cephfs pool; however we changed the backup product, and as the data is now encrypted at rest by the application the b
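For reference, a sketch of enabling pool compression, with a placeholder pool name (mode and algorithm are just examples):
  ceph osd pool set {pool_name} compression_mode aggressive
  ceph osd pool set {pool_name} compression_algorithm snappy
  # compare stored vs. used to see the effect
  ceph df detail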

[ceph-users] Small HDD cluster, switch from Bluestore to Filestore

2019-08-13 Thread Richard Bade
Hi Everyone, There's been a few threads around about small HDD (spinning disk) clusters and performance on Bluestore. One recently from Christian (http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-August/036385.html) was particularly interesting to us as we have a very similar setup to what