[ceph-users] Announcing Ceph Day NYC 2024 - April 26th!

2024-03-07 Thread Dan van der Ster
Hi everyone, Ceph Days are coming to New York City again this year, co-hosted by Bloomberg Engineering and Clyso! We're planning a full day of Ceph content, well timed to learn about the latest and greatest Squid release. https://ceph.io/en/community/events/2024/ceph-days-nyc/ We're opening

[ceph-users] Re: Performance improvement suggestion

2024-02-20 Thread Dan van der Ster
inconsistencies quickly, which would also work against any potential speedup. Cheers, Dan -- Dan van der Ster CTO Clyso GmbH w: https://clyso.com | e: dan.vanders...@clyso.com Try our Ceph Analyzer!: https://analyzer.clyso.com/ We are hiring: https://www.clyso.com/jobs/ On Wed, Jan 31, 2024, 11:49 quag

[ceph-users] Re: Help: Balancing Ceph OSDs with different capacity

2024-02-07 Thread Dan van der Ster
Hi Jasper, I suggest disabling all the crush-compat and reweighting approaches. They rarely work out. The state of the art is: ceph balancer on ceph balancer mode upmap ceph config set mgr mgr/balancer/upmap_max_deviation 1 Cheers, Dan -- Dan van der Ster CTO Clyso GmbH p: +49 89 215252722
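For reference, the balancer commands quoted in that excerpt, spelled out one per line:

    ceph balancer on
    ceph balancer mode upmap
    ceph config set mgr mgr/balancer/upmap_max_deviation 1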

[ceph-users] Re: Ceph Nautilous 14.2.22 slow OSD memory leak?

2024-01-10 Thread Dan van der Ster
Hi Samuel, It can be a few things. A good place to start is to dump_mempools of one of those bloated OSDs: `ceph daemon osd.123 dump_mempools` Cheers, Dan -- Dan van der Ster CTO Clyso GmbH p: +49 89 215252722 | a: Vancouver, Canada w: https://clyso.com | e: dan.vanders...@clyso.com We
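The diagnostic from the excerpt, run on the node hosting the bloated OSD (the OSD id 123 is just an example):

    # dump per-pool memory usage via the OSD admin socket
    ceph daemon osd.123 dump_mempools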

[ceph-users] Re: How balancer module balance data

2023-11-27 Thread Dan van der Ster
Hi, For the reason you observed, I normally set upmap_max_deviation = 1 on all clusters I get my hands on. Cheers, Dan -- Dan van der Ster CTO Clyso GmbH p: +49 89 215252722 | a: Vancouver, Canada w: https://clyso.com | e: dan.vanders...@clyso.com We are hiring: https://www.clyso.com/jobs

[ceph-users] Re: Issue with CephFS (mds stuck in clientreplay status) since upgrade to 18.2.0.

2023-11-27 Thread Dan van der Ster
Hi Giuseppe, There are likely one or two clients whose op is blocking the reconnect/replay. If you increase debug_mds perhaps you can find the guilty client and disconnect it / block it from mounting. Or for a more disruptive recovery you can try this "Deny all reconnect to clients " option:

[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files

2023-11-24 Thread Dan van der Ster
Hi Sebastian, You can find some more discussion and fixes for this type of fs corruption here: https://www.spinics.net/lists/ceph-users/msg76952.html -- Dan van der Ster CTO Clyso GmbH p: +49 89 215252722 | a: Vancouver, Canada w: https://clyso.com | e: dan.vanders...@clyso.com We are hiring

[ceph-users] Re: RGW access logs with bucket name

2023-11-02 Thread Dan van der Ster
all RGWs. Thanks! Dan -- Dan van der Ster CTO Clyso GmbH p: +49 89 215252722 | a: Vancouver, Canada w: https://clyso.com | e: dan.vanders...@clyso.com We are hiring: https://www.clyso.com/jobs/ On Mon, Oct 30, 2023 at 7:19 AM Casey Bodley wrote: > > another option is to enable the rgw o

[ceph-users] Re: RGW access logs with bucket name

2023-10-28 Thread Dan van der Ster
t name at level 1. Cheers, Dan -- Dan van der Ster CTO Clyso GmbH p: +49 89 215252722 | a: Vancouver, Canada w: https://clyso.com | e: dan.vanders...@clyso.com Try our Ceph Analyzer: https://analyzer.clyso.com On Thu, Mar 30, 2023 at 4:15 AM Boris Behrens wrote: > > Sadly not. > I only s

[ceph-users] Ceph Leadership Team notes 10/25

2023-10-25 Thread Dan van der Ster
Hi all, Here are this week's notes from the CLT: * Collective review of the Reef/Squid "State of Cephalopod" slides. * Smoke test suite was unscheduled but it's back on now. * Releases: * 17.2.7: about to start building last week, delayed by a few issues

[ceph-users] Re: index object in shard begins with hex 80

2023-07-18 Thread Dan van der Ster
Hi Chris, Those objects are in the so called "ugly namespace" of the rgw, used to prefix special bucket index entries. // No UTF-8 character can begin with 0x80, so this is a safe indicator // of a special bucket-index entry for the first byte. Note: although // it has no impact, the 2nd, 3rd,

[ceph-users] Re: Cluster down after network outage

2023-07-12 Thread Dan van der Ster
On Wed, Jul 12, 2023 at 1:26 AM Frank Schilder wrote: Hi all, > > one problem solved, another coming up. For everyone ending up in the same > situation, the trick seems to be to get all OSDs marked up and then allow > recovery. Steps to take: > > - set noout, nodown, norebalance, norecover > -
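A sketch of the flag handling described in the quoted steps; the flag names come from the message, and they should be unset again once recovery is allowed to proceed:

    ceph osd set noout
    ceph osd set nodown
    ceph osd set norebalance
    ceph osd set norecover
    # ... once all OSDs are marked up, re-enable recovery ...
    ceph osd unset norecover
    ceph osd unset norebalance
    ceph osd unset nodown
    ceph osd unset noout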

[ceph-users] Re: MON sync time depends on outage duration

2023-07-10 Thread Dan van der Ster
> Believe me, I know... but there's not much they can currently do > > about it, quite a long story... But I have been telling them that > > for months now. Anyway, I will make some suggestions and report back > > if it worked in this case as well. > > > > Th

[ceph-users] Re: Planning cluster

2023-07-10 Thread Dan van der Ster
Hi Jan, On Sun, Jul 9, 2023 at 11:17 PM Jan Marek wrote: > Hello, > > I have a cluster, which have this configuration: > > osd pool default size = 3 > osd pool default min size = 1 > Don't use min_size = 1 during regular stable operations. Instead, use min_size = 2 to ensure data safety, and

[ceph-users] Re: CephFS snapshots: impact of moving data

2023-07-06 Thread Dan van der Ster
Hi Mathias, Provided that both subdirs are within the same snap context (subdirs below where the .snap is created), I would assume that in the mv case, the space usage is not doubled: the snapshots point at the same inode and it is just linked at different places in the filesystem. However, if

[ceph-users] Re: Ceph Quarterly (CQ) - Issue #1

2023-07-06 Thread Dan van der Ster
Thanks Zac! I only see the txt attachment here. Where can we get the PDF A4 and letter renderings? Cheers, Dan __ Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com On Mon, Jul 3, 2023 at 10:29 AM Zac Dover wrote: > The

[ceph-users] Re: Cannot get backfill speed up

2023-07-06 Thread Dan van der Ster
Hi Jesper, Indeed many users reported slow backfilling and recovery with the mclock scheduler. This is supposed to be fixed in the latest quincy but clearly something is still slowing things down. Some clusters have better luck reverting to osd_op_queue = wpq. (I'm hoping by proposing this
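Reverting to wpq, as suggested, is a config change followed by an OSD restart for the scheduler to change (the restart method depends on your deployment):

    ceph config set osd osd_op_queue wpq
    # then restart the OSDs, e.g. systemctl restart ceph-osd.target on each host (non-cephadm)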

[ceph-users] Re: pg_num != pgp_num - and unable to change.

2023-07-06 Thread Dan van der Ster
Hi Jesper, > In earlier versions of ceph (without autoscaler) I have only experienced > that setting pg_num and pgp_num took immediate effect? That's correct -- in recent Ceph (since nautilus) you cannot manipulate pgp_num directly anymore. There is a backdoor setting (set pgp_num_actual ...)
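A hedged example of the backdoor setting mentioned above; the pool name and value are placeholders, and in Nautilus and later the mgr normally drives pgp_num for you:

    ceph osd pool set <pool> pgp_num_actual 512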

[ceph-users] Re: MON sync time depends on outage duration

2023-07-06 Thread Dan van der Ster
Hi Eugen! Yes that sounds familiar from the luminous and mimic days. Check this old thread: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/F3W2HXMYNF52E7LPIQEJFUTAD3I7QE25/ (that thread is truncated but I can tell you that it worked for Frank). Also the even older referenced

[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-05-31 Thread Dan van der Ster
Hi Janek, A few questions and suggestions: - Do you have multi-active MDS? In my experience back in nautilus if something went wrong with mds export between mds's, the mds log/journal could grow unbounded like you observed until that export work was done. Static pinning could help if you are not

[ceph-users] Re: Newer linux kernel cephfs clients is more trouble?

2023-05-29 Thread Dan van der Ster
Hi, Sorry for poking this old thread, but does this issue still persist in the 6.3 kernels? Cheers, Dan __ Clyso GmbH | https://www.clyso.com On Wed, Dec 7, 2022 at 3:42 AM William Edwards wrote: > > > > Op 7 dec. 2022 om 11:59 heeft Stefan Kooman het volgende >

[ceph-users] Re: Pacific - MDS behind on trimming

2023-05-26 Thread Dan van der Ster
Hi Emmanuel, In my experience MDS getting behind on trimming normally happens for one of two reasons. Either your client workload is simply too expensive for your metadata pool OSDs to keep up (and btw some ops are known to be quite expensive such as setting xattrs or deleting files). Or I've

[ceph-users] Re: pg upmap primary

2023-05-04 Thread Dan van der Ster
Hello, After you delete the OSD, the now "invalid" upmap rule will be automatically removed. Cheers, Dan __ Clyso GmbH | https://www.clyso.com On Wed, May 3, 2023 at 10:13 PM Nguetchouang Ngongang Kevin wrote: > > Hello, I have a question, what happened when I

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-02 Thread Dan van der Ster
Hi Janek, That assert is part of a new corruption check added in 16.2.12 -- see https://github.com/ceph/ceph/commit/1771aae8e79b577acde749a292d9965264f20202 The abort is controlled by a new option: +Option("mds_abort_on_newly_corrupt_dentry", Option::TYPE_BOOL, Option::LEVEL_ADVANCED) +
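The new option shown in the excerpt can be toggled like any MDS config; whether to disable the abort is the judgment call discussed in the thread, so treat this as an illustration only:

    ceph config set mds mds_abort_on_newly_corrupt_dentry false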

[ceph-users] Re: How to find the bucket name from Radosgw log?

2023-04-26 Thread Dan van der Ster
Hi, Your cluster probably has dns-style buckets enabled. .. In that case the path does not include the bucket name, and neither does the rgw log. Do you have a frontend lb like haproxy? You'll find the bucket names there. -- Dan __ Clyso GmbH | https://www.clyso.com

[ceph-users] Re: How to control omap capacity?

2023-04-26 Thread Dan van der Ster
Hi, Simplest solution would be to add a few OSDs. -- dan __ Clyso GmbH | https://www.clyso.com On Tue, Apr 25, 2023 at 2:58 PM WeiGuo Ren wrote: > > I have two osds. these osd are used to rgw index pool. After a lot of > stress tests, these two osds were written

[ceph-users] Re: For suggestions and best practices on expanding Ceph cluster and removing old nodes

2023-04-26 Thread Dan van der Ster
Thanks Tom, this is a very useful post! I've added our docs guy Zac in cc: IMHO this would be useful in a "Tips & Tricks" section of the docs. -- dan __ Clyso GmbH | https://www.clyso.com On Wed, Apr 26, 2023 at 7:46 AM Thomas Bennett wrote: > > I would second

[ceph-users] Re: Massive OMAP remediation

2023-04-26 Thread Dan van der Ster
Hi Ben, Are you compacting the relevant osds periodically? ceph tell osd.x compact (for the three osds holding the bilog) would help reshape the rocksdb levels to at least perform better for a little while until the next round of bilog trims. Otherwise, I have experience deleting ~50M object
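The periodic compaction suggested above, with placeholder OSD ids standing in for the three OSDs holding the bucket index log:

    ceph tell osd.11 compact
    ceph tell osd.12 compact
    ceph tell osd.13 compact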

[ceph-users] Re: mons excessive writes to local disk and SSD wearout

2023-02-24 Thread Dan van der Ster
Hi Andrej, That doesn't sound right -- I checked a couple of our clusters just now and the mon filesystem is writing at just a few 100kBps. debug_mon = 10 should clarify the root cause. Perhaps it's logm from some persistent slow ops? Cheers, Dan On Fri, Feb 24, 2023 at 7:36 AM Andrej

[ceph-users] Re: ceph noout vs ceph norebalance, which is better for minor maintenance

2023-02-15 Thread Dan van der Ster
down. This can be useful to pause backfilling e.g. when you are adding or removing hosts to a cluster. -- dan On Wed, Feb 15, 2023 at 2:58 PM Dan van der Ster wrote: > > Hi Will, > > There are some misconceptions in your mail. > > 1. "noout" is a flag used to prevent the

[ceph-users] Re: ceph noout vs ceph norebalance, which is better for minor maintenance

2023-02-15 Thread Dan van der Ster
Hi Will, There are some misconceptions in your mail. 1. "noout" is a flag used to prevent the down -> out transition after an osd is down for several minutes. (Default 5 minutes). 2. "norebalance" is a flag used to prevent objects from being backfilling to a different OSD *if the PG is not
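For a typical minor maintenance window, the noout flag described in point 1 is used like this:

    ceph osd set noout      # keep down OSDs from being marked out during the maintenance
    # ... reboot or service the host ...
    ceph osd unset noout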

[ceph-users] Re: Frequent calling monitor election

2023-02-09 Thread Dan van der Ster
Hi Frank, Check the mon logs with some increased debug levels to find out what the leader is busy with. We have a similar issue (though, daily) and it turned out to be related to the mon leader timing out doing a SMART check. See https://tracker.ceph.com/issues/54313 for how I debugged that.

[ceph-users] Re: mon scrub error (scrub mismatch)

2023-01-03 Thread Dan van der Ster
Hi Frank, Can you work backwards in the logs to when this first appeared? The scrub error is showing that mon.0 has 78 auth keys and the other two have 77. So you'd have to query the auth keys of each mon to see if you get a different response each time (e.g. ceph auth list), and compare with what

[ceph-users] Re: cephfs ceph.dir.rctime decrease

2022-12-19 Thread Dan van der Ster
Hi, Yes this is a known issue -- an mtime can be in the future, and an rctime won't go backwards. There was an earlier attempt to allow fixing the rctimes but this got stuck and needs effort to bring it up to date: https://github.com/ceph/ceph/pull/37938 Cheers, dan On Sun, Dec 18, 2022 at

[ceph-users] Re: osd set-require-min-compat-client

2022-11-30 Thread Dan van der Ster
zender), Prof. Dr.-Ing. Harald Bolt, > Dr. Astrid Lambrecht, Prof. Dr. Frauke Melchior > > Am 30.11.

[ceph-users] Re: osd set-require-min-compat-client

2022-11-30 Thread Dan van der Ster
Hi Felix, This change won't trigger any rebalancing. It will prevent older clients from connecting, but since this isn't a crush tunable it won't directly affect data placement. Best, Dan On Wed, Nov 30, 2022, 12:33 Stolte, Felix wrote: > Hey guys, > > our ceph cluster is on pacific, but

[ceph-users] Re: PGs stuck down

2022-11-30 Thread Dan van der Ster
Hi all, It's difficult to say exactly what happened here without cluster logs. Dale, would you be able to share the ceph.log showing the start of the incident? Cheers, dan On Wed, Nov 30, 2022 at 10:30 AM Frank Schilder wrote: > > Hi Eugen, > > power outage is one thing, a cable cut is

[ceph-users] Re: LVM osds loose connection to disk

2022-11-18 Thread Dan van der Ster
Hi Frank, bfq was definitely broken, deadlocking io for a few CentOS Stream 8 kernels between EL 8.5 and 8.6 -- we also hit that in production and switched over to `none`. I don't recall exactly when the upstream kernel was also broken but apparently this was the fix:

[ceph-users] Re: OSDs down after reweight

2022-11-15 Thread Dan van der Ster
Hi Frank, Just a guess, but I wonder if for small values rounding/precision start to impact the placement like you observed. Do you see the same issue if you reweight to 2x the original? -- Dan On Tue, Nov 15, 2022 at 10:09 AM Frank Schilder wrote: > > Hi all, > > I re-weighted all OSDs in a

[ceph-users] Re: Temporary shutdown of subcluster and cephfs

2022-10-19 Thread Dan van der Ster
Hi Frank, fs fail isn't ideal -- there's an 'fs down' command for this. Here's a procedure we used, last used in the nautilus days: 1. If possible, umount fs from all the clients, so that all dirty pages are flushed. 2. Prepare the ceph cluster: ceph osd set noout/noin 3. Wait until there is
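The 'fs down' command referenced above is, to the best of my knowledge, the filesystem down flag (filesystem name is a placeholder):

    ceph fs set <fs_name> down true      # cleanly bring the MDS cluster down
    # later, to bring it back up:
    ceph fs set <fs_name> down false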

[ceph-users] Re: Status of Quincy 17.2.5 ?

2022-10-19 Thread Dan van der Ster
There was a mail on d...@ceph.io that 17.2.4 missed a few backports, so I presume 17.2.5 is a hotfix -- it's what 17.2.4 was supposed to be. (And clearly the announcement is pending) https://github.com/ceph/ceph/commits/v17.2.5 -- dan On Wed, Oct 19, 2022 at 11:46 AM Christian Rohmann wrote: >

[ceph-users] Re: 1 OSD laggy: log_latency_fn slow; heartbeat_map is_healthy had timed out after 15

2022-10-16 Thread Dan van der Ster
Hi Michel, Are you sure there isn't a hardware problem with the disk? E.g. maybe you have SCSI timeouts in dmesg or high ioutil with iostat? Anyway I don't think there's a big risk related to draining and stopping the osd. Just consider this a disk failure, which can happen at any time anyway.

[ceph-users] Re: crush hierarchy backwards and upmaps ...

2022-10-14 Thread Dan van der Ster
d it, and your observations seem to confirm that. I suggest you post to that ticket with your info. Cheers, Dan > It would be great if I could use the first rule, except for this bug. Perhaps > the second rule is best at this point. > > Any other thoughts would be appreciated. >

[ceph-users] Re: crush hierarchy backwards and upmaps ...

2022-10-11 Thread Dan van der Ster
a according to a diff before and > after of --test-pg-upmap-entries. > In this scenario, I dont see any unexpected errors with --upmap-cleanup > and I do not want to get stuck > > rule mypoolname { > id -5 > type erasure > step take myroot > step choose ind

[ceph-users] Re: crush hierarchy backwards and upmaps ...

2022-10-10 Thread Dan van der Ster
Hi, Here's a similar bug: https://tracker.ceph.com/issues/47361 Back then, upmap would generate mappings that invalidate the crush rule. I don't know if that is still the case, but indeed you'll want to correct your rule. Something else you can do before applying the new crush map is use

[ceph-users] Re: recurring stat mismatch on PG

2022-10-08 Thread Dan van der Ster
isø Campus > Bygning 109, rum S14 > > From: Dan van der Ster > Sent: 08 October 2022 11:18:37 > To: Frank Schilder > Cc: Ceph Users > Subject: Re: [ceph-users] recurring stat mismatch on PG > > Is that the log from the primary OSD? > > About the restart,

[ceph-users] Re: recurring stat mismatch on PG

2022-10-08 Thread Dan van der Ster
est regards, > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > > > From: Dan van der Ster > Sent: 08 October 2022 11:03:05 > To: Frank Schilder > Cc: Ceph Users > Subject: Re: [ceph-users] recurring s

[ceph-users] Re: recurring stat mismatch on PG

2022-10-08 Thread Dan van der Ster
Hi, Is that 15.2.17? It reminds me of this bug - https://tracker.ceph.com/issues/52705 - where an object with a particular name would hash to and cause a stat mismatch during scrub. But 15.2.17 should have the fix for that. Can you find the relevant osd log for more info? .. Dan On

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Dan van der Ster
Hi Zakhar, I can back up what Konstantin has reported -- we occasionally have HDDs performing very slowly even though all smart tests come back clean. Besides ceph osd perf showing a high latency, you could see high ioutil% with iostat. We normally replace those HDDs -- usually by draining and

[ceph-users] Re: Stuck in upgrade

2022-10-07 Thread Dan van der Ster
Hi Jan, It looks like you got into this situation by not setting require-osd-release to pacific while you were running 16.2.7. The code has that expectation, and unluckily for you if you had upgraded to 16.2.8 you would have had a HEALTH_WARN that pointed out the mismatch between

[ceph-users] Re: osd_memory_target for low-memory machines

2022-10-03 Thread Dan van der Ster
Hi, 384MB is far too low for a Ceph OSD. The warning is telling you that it's below the min. Cheers, Dan On Sun, Oct 2, 2022 at 11:10 AM Nicola Mori wrote: > > Dear Ceph users, > > I put together a cluster by reusing some (very) old machines with low > amounts of RAM, as low as 4 GB for the
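To check and raise the target (the 4 GiB value below is only the commonly used default, shown as an example rather than a recommendation from the thread):

    ceph config get osd osd_memory_target
    ceph config set osd osd_memory_target 4294967296   # 4 GiB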

[ceph-users] Re: Almost there - trying to recover cephfs from power outage

2022-09-21 Thread Dan van der Ster
Hi Jorge, There was an older procedure before the --recover flag. You can find that here: https://github.com/ceph/ceph/pull/42295/files It was the result of this tracker: https://tracker.ceph.com/issues/51341 Also, here was the change which added the --recover flag:

[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Dan van der Ster
Another common config to workaround this pg num limit is: ceph config set osd osd_max_pg_per_osd_hard_ratio 10 (Then possibly the repeer step on each activating pg) .. Dan On Thu, Sept 15, 2022, 17:47 Josh Baergen wrote: > Hi Fulvio, > > I've seen this in the past when a CRUSH change
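The workaround from the excerpt, with a hypothetical PG id used for the repeer step:

    ceph config set osd osd_max_pg_per_osd_hard_ratio 10
    ceph pg repeer 1.2f      # repeat for each PG stuck activating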

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
ndep osd". We should open a tracker for this. Either "choose indep osd" and "chooseleaf indep osd" should be give the same result, or the pool creation should use "chooseleaf indep osd" in this case. -- dan On Tue, Aug 30, 2022 at 1:43 PM Dan van der Ster wrote: >

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
eported above. I'm quite > confident that this is unintended if not dangerous behaviour and should be > corrected. I'm willing to file a tracker item with the data above. I'm > actually wondering if this might be related to > https://tracker.ceph.com/issues/56995 . > > Thanks for

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
up ([6,1,4,5,3,8], p6) acting ([6,1,4,5,3,1], p6) -- dan On Tue, Aug 30, 2022 at 11:50 AM Dan van der Ster wrote: > > BTW, I vaguely recalled seeing this before. Yup, found it: > https://tracker.ceph.com/issues/55169 > > On Tue, Aug 30, 2022 at 11:46 AM Dan van der Ster wrote:

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
BTW, I vaguely recalled seeing this before. Yup, found it: https://tracker.ceph.com/issues/55169 On Tue, Aug 30, 2022 at 11:46 AM Dan van der Ster wrote: > > > 2. osd.7 is destroyed but still "up" in the osdmap. > > Oops, you can ignore this point -- this was an observat

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
Aug 30, 2022 at 11:41 AM Dan van der Ster wrote: > > Hi Frank, > > I suspect this is a combination of issues. > 1. You have "choose" instead of "chooseleaf" in rule 1. > 2. osd.7 is destroyed but still "up" in the osdmap. > 3. The _tries settings in

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-30 Thread Dan van der Ster
he second rule seems to be almost as good or bad as the > default one (step choose indep 0 type osd), except that it does produce valid > mappings where the default rule fails. > > I will wait with changing the rule in the hope that you find a more elegant > solution to this riddle

[ceph-users] Re: Bug in crush algorithm? 1 PG with the same OSD twice.

2022-08-29 Thread Dan van der Ster
Hi Frank, CRUSH can only find 5 OSDs, given your current tree, rule, and reweights. This is why there is a NONE in the UP set for shard 6. But in ACTING we see that it is refusing to remove shard 6 from osd.1 -- that is the only copy of that shard, so in this case it's helping you rather than

[ceph-users] Re: Some odd results while testing disk performance related to write caching

2022-08-15 Thread Dan van der Ster
Hi, We have some docs about this in the Ceph hardware recommendations: https://docs.ceph.com/en/latest/start/hardware-recommendations/#write-caches I added some responses inline.. On Fri, Aug 5, 2022 at 7:23 PM Torbjörn Jansson wrote: > > Hello > > i got a small 3 node ceph cluster and i'm

[ceph-users] Re: mgr service restarted by package install?

2022-07-18 Thread Dan van der Ster
Hi, It probably wasn't restarted by the package, but the mgr itself respawned because the set of enabled modules changed. E.g. this happens when upgrading from octopus to pacific, just after the pacific mons get a quorum: 2022-07-13T11:43:41.308+0200 7f71c0c86700 1 mgr handle_mgr_map respawning

[ceph-users] Re: Slow osdmaptool upmap performance

2022-07-18 Thread Dan van der Ster
Hi, Can you try with the fix for this? https://tracker.ceph.com/issues/54180 (https://github.com/ceph/ceph/pull/44925) It hasn't been backported to any releases, but we could request that if it looks useful. Cheers, Dan On Mon, Jul 18, 2022 at 12:44 AM stuart.anderson wrote: > > I am seeing

[ceph-users] Re: pacific doesn't defer small writes for pre-pacific hdd osds

2022-07-14 Thread Dan van der Ster
ptions in > their defaults)? > > > Thanks, > k > > On 14 Jul 2022, at 08:43, Dan van der Ster wrote: > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: pacific doesn't defer small writes for pre-pacific hdd osds

2022-07-13 Thread Dan van der Ster
pped or rebuilt? > > Best regards, > Zakhar > > On Tue, 12 Jul 2022 at 14:46, Dan van der Ster wrote: > >> Hi Igor, >> >> Thank you for the reply and information. >> I confirm that `ceph config set osd bluestore_prefer_deferred_size_hdd >> 65537` correc

[ceph-users] Re: pacific doesn't defer small writes for pre-pacific hdd osds

2022-07-12 Thread Dan van der Ster
se it that high as 128K to avoid too many writes being deferred > (and hence DB overburden). > > IMO setting the parameter to 64K+1 should be fine. > > > Thanks, > > Igor > > On 7/7/2022 12:43 AM, Dan van der Ster wrote: > > Hi Igor and others, > > (apolog

[ceph-users] Re: pacific doesn't defer small writes for pre-pacific hdd osds

2022-07-07 Thread Dan van der Ster
Hi, On Thu, Jul 7, 2022 at 2:37 PM Konstantin Shalygin wrote: > > Hi, > > On 7 Jul 2022, at 13:04, Dan van der Ster wrote: > > I'm not sure the html mail made it to the lists -- resending in plain text. > I've also opened https://tracker.ceph.com/issues/56488 > > >

[ceph-users] Re: pacific doesn't defer small writes for pre-pacific hdd osds

2022-07-07 Thread Dan van der Ster
Hi again, I'm not sure the html mail made it to the lists -- resending in plain text. I've also opened https://tracker.ceph.com/issues/56488 Cheers, Dan On Wed, Jul 6, 2022 at 11:43 PM Dan van der Ster wrote: > > Hi Igor and others, > > (apologies for html, but i want to

[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-24 Thread Dan van der Ster
$SNAPNAME" - which I think is the same internally? > > And yes, our CephFS has numerous snapshots itself for backup purposes. > > > Cheers, > Pascal > > > > Dan van der Ster wrote on 24.06.22 11:06: > > Hi Pascal, > > > > I'm not sure why you

[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-24 Thread Dan van der Ster
> > Do you have any ideas how I could anyway remove the broken snapshot objects? > > > Cheers, > > Pascal > > > Dan van der Ster wrote on 24.06.22 09:27: > > Hi, > > > > It's trivial to reproduce. Running 16.2.9 with max_mds=2, take a pool > > snapshot o

[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-24 Thread Dan van der Ster
>> > list-inconsistent-obj $i | jq -er .inconsistents[].object.name| awk > >> > -F'.' '{print $2}'; done > >> > > >> > we then found inconsistent snaps on the Object: > >> > > >> > rados list-inconsistent-snapset $PG --format=json-p

[ceph-users] Re: Inconsistent PGs after upgrade to Pacific

2022-06-23 Thread Dan van der Ster
Hi Pascal, It's not clear to me how the upgrade procedure you described would lead to inconsistent PGs. Even if you didn't record every step, do you have the ceph.log, the mds logs, perhaps some osd logs from this time? And which versions did you upgrade from / to ? Cheers, Dan On Wed, Jun 22,

[ceph-users] Re: Tuning for cephfs backup client?

2022-06-23 Thread Dan van der Ster
Hi, If the single backup client is iterating through the entire fs, its local dentry cache will probably be thrashing, rendering it quite useless. And that dentry cache will be constantly hitting the mds caps per client limit, so the mds will be busy asking it to release caps (to invalidate

[ceph-users] Re: Drained OSDs are still ACTIVE_PRIMARY - casuing high IO latency on clients

2022-05-20 Thread Dan van der Ster
Hi, Just a curiosity... It looks like you're comparing an EC pool in octopus to a replicated pool in nautilus. Does primary affinity work for you in octopus on a replicated pool? And does a nautilus EC pool work? .. Dan On Fri., May 20, 2022, 13:53 Denis Polom, wrote: > Hi > > I observed

[ceph-users] Re: No rebalance after ceph osd crush unlink

2022-05-18 Thread Dan van der Ster
> > From: Dan van der Ster > Sent: 18 May 2022 12:04:07 > To: Frank Schilder > Cc: ceph-users > Subject: Re: [ceph-users] No rebalance after ceph osd crush unlink > > Hi Frank, > > Did you check the shadow tree (the one

[ceph-users] Re: No rebalance after ceph osd crush unlink

2022-05-18 Thread Dan van der Ster
Hi Frank, Did you check the shadow tree (the one with tilde's in the name, seen with `ceph osd crush tree --show-shadow`)? Maybe the host was removed in the outer tree, but not the one used for device-type selection. There were bugs in this area before, e.g. https://tracker.ceph.com/issues/48065

[ceph-users] Re: v16.2.8 Pacific released

2022-05-17 Thread Dan van der Ster
On Tue, May 17, 2022 at 1:14 PM Cory Snyder wrote: > > Hi all, > > Unfortunately, we experienced some issues with the upgrade to 16.2.8 > on one of our larger clusters. Within a few hours of the upgrade, all > 5 of our managers had become unavailable. We found that they were all > deadlocked due

[ceph-users] Re: Reasonable MDS rejoin time?

2022-05-17 Thread Dan van der Ster
Hi Felix, "rejoin" took awhile in the past because the MDS needs to reload all inodes for all the open directories at boot time. In our experience this can take ~10 minutes on the most active clusters. In your case, I wonder if the MDS was going OOM in a loop while recovering? This was happening

[ceph-users] Re: Stop Rebalancing

2022-04-13 Thread Dan van der Ster
On Wed, Apr 13, 2022 at 7:07 PM Gregory Farnum wrote: > > On Wed, Apr 13, 2022 at 10:01 AM Dan van der Ster wrote: > > > > I would set the pg_num, not pgp_num. In older versions of ceph you could > > manipulate these things separately, but in pacific I'm not confiden

[ceph-users] Re: Stop Rebalancing

2022-04-13 Thread Dan van der Ster
just start > a whole new load of misplaced object rebalancing. Won't it? > > Thank you, > Ray > > > -Original Message- > From: Dan van der Ster > Sent: Wednesday, April 13, 2022 11:11 AM > To: Ray Cunningham > Cc: ceph-users@ceph.io > Subject: Re: [ceph-users] Stop

[ceph-users] Re: Stop Rebalancing

2022-04-13 Thread Dan van der Ster
off for all > Profile is scale-up > > > We have set norebalance and nobackfill and are watching to see what happens. > > Thank you, > Ray > > -Original Message- > From: Dan van der Ster > Sent: Wednesday, April 13, 2022 10:00 AM > To: Ray Cunningham > Cc: ceph-user

[ceph-users] Re: Stop Rebalancing

2022-04-13 Thread Dan van der Ster
orrow. We > definitely want to get this under control. > > Thank you, > Ray > > > -Original Message- > From: Dan van der Ster > Sent: Tuesday, April 12, 2022 2:46 PM > To: Ray Cunningham > Cc: ceph-users@ceph.io > Subject: Re: [ceph-users] Stop Rebalan

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Dan van der Ster
tomorrow. We > definitely want to get this under control. > > Thank you, > Ray > > > -Original Message- > From: Dan van der Ster > Sent: Tuesday, April 12, 2022 2:46 PM > To: Ray Cunningham > Cc: ceph-users@ceph.io > Subject: Re: [ceph-users] Stop Rebalan

[ceph-users] Re: Stop Rebalancing

2022-04-12 Thread Dan van der Ster
Hi Ray, Disabling the autoscaler on all pools is probably a good idea. At least until https://tracker.ceph.com/issues/53729 is fixed. (You are likely not susceptible to that -- but better safe than sorry). To pause the ongoing PG merges, you can indeed set the pg_num to the current value. This
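A sketch of the two steps described above, with the pool name and current pg_num as placeholders:

    ceph osd pool set <pool> pg_autoscale_mode off
    ceph osd pool get <pool> pg_num                     # note the current value
    ceph osd pool set <pool> pg_num <current_value>     # pin it to pause the ongoing merge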

[ceph-users] Re: Successful Upgrade from 14.2.18 to 15.2.16

2022-04-12 Thread Dan van der Ster
Hi Stefan, Thanks for the report. 9 hours fsck is the longest I've heard about yet -- and on NVMe, that's quite surprising! Which firmware are you running on those Samsung's? For a different reason Mark and we have been comparing performance of that drive between what's in his lab vs what we

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-04 Thread Dan van der Ster
BTW -- i've created https://tracker.ceph.com/issues/55169 to ask that we add some input validation. Injecting such a crush map would ideally not be possible. -- dan On Mon, Apr 4, 2022 at 11:02 AM Dan van der Ster wrote: > > Excellent news! > After everything is back to active+cle

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-04 Thread Dan van der Ster
s of lessons learned from my > side, I'm really grateful. > >All PGs are now active, will let Ceph rebalance. > >Ciao ciao > > Fulvio > > On 4/4/22 10:50, Dan van der Ster wrote: > > Hi Fulvio, > > > > Yes -- that choose/cho

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-04 Thread Dan van der Ster
wrote: > > Hi again Dan! > Things are improving, all OSDs are up, but still that one PG is down. > More info below. > > On 4/1/22 19:26, Dan van der Ster wrote: > >>>> Here is the output of "pg 85.12 query": > >>>> https://pastebi

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-04 Thread Dan van der Ster
> Hi again Dan! > Things are improving, all OSDs are up, but still that one PG is down. > More info below. > > On 4/1/22 19:26, Dan van der Ster wrote: > >>>> Here is the output of "pg 85.12 query": > >>>> https://pastebin.ubuntu.com/p/ww3J

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Dan van der Ster
We're on the right track! On Fri, Apr 1, 2022 at 6:57 PM Fulvio Galeazzi wrote: > > Ciao Dan, thanks for your messages! > > On 4/1/22 11:25, Dan van der Ster wrote: > > The PGs are stale, down, inactive *because* the OSDs don't start. > > Your main efforts should be t

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Dan van der Ster
> At the time I changed the rule, there was no 'down' PG, all PGs in the > cluster were 'active' plus possibly some other state (remapped, > degraded, whatever) as I had added some new disk servers few days before. Never make crush rule changes when any PG is degraded, remapped, or wha

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Dan van der Ster
<--- this was "osd", before > step emit > } > > At the time I changed the rule, there was no 'down' PG, all PGs in the > cluster were 'active' plus possibly some other state (remapped, > degraded, whatever) as I had added some new disk servers few days before. >

[ceph-users] Re: PG down, due to 3 OSD failing

2022-03-30 Thread Dan van der Ster
ps://pastebin.ubuntu.com/p/dTfPkMb7mD/ > a few hundreds of lines from one of the failed OSDs upon "activate --all". > >Thanks > > Fulvio > > On 29/03/2022 10:53, Dan van der Ster wrote: > > Hi Fulvio, > > > > I don't

[ceph-users] Re: PG down, due to 3 OSD failing

2022-03-29 Thread Dan van der Ster
aleazzi ha scritto: > > Thanks a lot, Dan! > > > > > The EC pgs have a naming convention like 85.25s1 etc.. for the various > > > k/m EC shards. > > > > That was the bit of information I was missing... I was looking for the > > wrong object. > > I

[ceph-users] Re: PG down, due to 3 OSD failing

2022-03-28 Thread Dan van der Ster
Hi Fulvio, You can check (offline) which PGs are on an OSD with the list-pgs op, e.g. ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158/ --op list-pgs The EC pgs have a naming convention like 85.25s1 etc.. for the various k/m EC shards. -- dan On Mon, Mar 28, 2022 at 2:29 PM
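The offline listing from the excerpt; the OSD must be stopped before running the tool against its data path:

    ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158/ --op list-pgs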

[ceph-users] Re: ceph mon failing to start

2022-03-28 Thread Dan van der Ster
Are the two running mons also running 14.2.9 ? --- dan On Mon, Mar 28, 2022 at 8:27 AM Tomáš Hodek wrote: > > Hi, I have 3 node ceph cluster (managed via proxmox). Got single node > fatal failure and replaced it. Os boots correctly, however monitor on > failed node did not start successfully;

[ceph-users] Re: OSD(s) reporting legacy (not per-pool) BlueStore omap usage stats

2022-03-10 Thread Dan van der Ster
Hi, After Nautilus there were two omap usage stats upgrades: Octopus (v15) fsck (on by default) enables per-pool omap usage stats. Pacific (v16) fsck (off by default) enables per-pg omap usage stats. (fsck is off by default in pacific because it takes quite some time to update the on-disk

[ceph-users] Re: octopus (15.2.16) OSDs crash or don't answer heathbeats (and get marked as down)

2022-03-08 Thread Dan van der Ster
Here's the reason they exit: 7f1605dc9700 -1 osd.97 486896 _committed_osd_maps marked down 6 > osd_max_markdown_count 5 in last 600.00 seconds, shutting down If an osd flaps (marked down, then up) 6 times in 10 minutes, it exits. (This is a safety measure). It's normally caused by a network
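The two settings behind that shutdown message (the values shown are the defaults implied by the quoted log line); raising them only masks flapping that, per the post, is usually a network problem:

    ceph config set osd osd_max_markdown_count 5
    ceph config set osd osd_max_markdown_period 600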

[ceph-users] Re: Retrieving cephx key from ceph-fuse

2022-03-07 Thread Dan van der Ster
On Fri, Mar 4, 2022 at 2:07 PM Robert Vasek wrote: > > Is there a way for an attacker with sufficient privileges to retrieve the > key by somehow mining it off of the process memory of ceph-fuse which is > now maintaining the volume mount? Yes, one should assume that if they can gcore dump the

[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays

2022-03-01 Thread Dan van der Ster
g", > "id": 3776355973, > "ino": 1099567262916, > "frag": "*", > "path": "~mds0/stray3/1000350ecc4" > }, > > Again, thanks for your help, that is really appreciated > > All
