filesystems?
Major considerations: the cost of running multiple MDS daemons (more
memory/CPU used), the inability to move files between the two hierarchies
without full copies, and the straightforward scaling you get with separate
file systems.
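As a very rough sketch of the two options (file system name and rank count
are placeholders):
ceph fs volume create fs2        # a separate file system, with its own MDS ranks
ceph fs set cephfs max_mds 2     # or: add a second active rank to one file system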
Active-active file systems can often function in a similar way with
s
Hi Jimmy,
On Fri, Apr 22, 2022 at 11:02 AM Jimmy Spets wrote:
>
> Does cephadm automatically reduce ranks to 1 or does that have to be done
> manually?
Automatically.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat,
which is already the first thing you do when upgrading a
cluster).
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
om Octopus
> (and or pacific < 16.2.7) to Quincy? It's not in the release notes, but
> just double checking here [1].
Yes it is necessary.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
:00 tstmon01 ceph-mon[960]: 9: main()
> 2021-12-09T14:56:40.103+00:00 tstmon01 ceph-mon[960]: 10:
> __libc_start_main()
> 2021-12-09T14:56:40.103+00:00 tstmon01 ceph-mon[960]: 11: _start()
I just want to follow up that this is indeed a new bug (not an existing
bug as I or
ou have snapshots? If you've deleted the directories
that have been snapshotted then they stick around in the stray
directory until the snapshot is deleted. There's no way to force
purging until the snapshot is also deleted. For this reason, the stray
directory size can grow
https://github.com/ceph/ceph/pull/44514
which is significantly less disruptive than `ls -lR` or `find`.
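If you want to keep an eye on stray growth in the meantime, the MDS exports
a counter for it (daemon name is a placeholder):
ceph tell mds.<name> perf dump | grep num_strays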
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
incy might not be available for a while and I was hoping to avoid a
> bunch of dependencies on the current FS name in use.
>
> > On Jan 4, 2022, at 8:16 AM, Patrick Donnelly wrote:
> >
> > "ceph fs rename" will be available in Quincy. see also:
> > https
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
> [mds.ceph1{-1:448321175} state up:standby seq 1 addr [v2:
> 10.10.0.32:6800/881362738,v1:10.10.0.32:6801/881362738] compat
> {c=[1],r=[1],i=[7ff]}]
The "i=[77f]" indicates to me this may be an MDS older than 16.2.7.
This should not otherwise be possible.
In any case, I'm n
> service stopped auto-restarting.
Please disable mon_mds_skip_sanity in the mons ceph.conf:
[mon]
mon_mds_skip_sanity = false
The cephadm upgrade sequence is already doing this but I forgot
(sorry!) to mention this is required for manual upgrades in the
rel
Is there a better way to do this, i.e. to
> export multiple subdirectories from cephfs (to different NFS clients
> perhaps) but share the libcephfs connection?
Export blocks do not share libcephfs connections. I don't think there
are any plans to change that.
[1] https://docs.ceph.com/en
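For illustration only, a minimal ganesha.conf sketch (IDs, paths, and the
cephx user are made-up placeholders); each EXPORT block's FSAL CEPH section
is backed by its own libcephfs client:
EXPORT {
    Export_ID = 1;
    Path = "/clientA";
    Pseudo = "/clientA";
    Access_Type = RW;
    FSAL { Name = CEPH; User_Id = "nfs.example"; Filesystem = "cephfs"; }
}
# a second EXPORT block for /clientB opens a second libcephfs connection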
"0.00",
> "last_scrub_version": 0,
> "symlink": "",
> "xattrs": [],
> "dirfragtree": {
> "splits": []
> },
> "old_inodes": [],
> "oldest_snap": 184
arning, or how can we get rid of it?
This reminds me of https://tracker.ceph.com/issues/46830
Suggest monitoring the client session information from the MDS as Dan
suggested. You can also try increasing mds_min_caps_working_set to see
if that helps.
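A sketch of raising that option (the value is made up; check the default for
your release first):
ceph config get mds mds_min_caps_working_set
ceph config set mds mds_min_caps_working_set 20000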
--
Pat
> >> [ceph]
> >> Nov 5 02:19:14 popeye-mgr-0-10 kernel: dispatch+0xb2/0x1e0 [ceph]
> >> Nov 5 02:19:14 popeye-mgr-0-10 kernel:
> >> process_message+0x7b/0x130 [libceph]
> >> Nov 5 02:19:14 popeye-mgr-0-10 kernel: try_read+0x340/0x5e0
specifically scales by adding more
objects/PGs. This should naturally spread load on primary OSDs across
the cluster. Replicas are for redundancy not for improving (read)
throughput.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
fn_anonymous' is
> 100% busy. Pointing the debugger to it and getting a stack trace at
> random times always shows a similar picture:
Thanks for the report and useful stack trace. This is probably
corrected by the new use of a "fair" mutex in the MDS:
https://tracker.ceph.
nd the project today. I have no doubt that Ceph will
> continue to thrive.
I want to echo what others have already said about your establishing a
welcoming FOSS project. Being part of the Ceph team is a highlight of
my life. We all owe you a debt of gratitude.
Thanks and good luck on your n
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
ventually ended up taking
> eveything down and rebuilding the monstore using
> monstore-tool. Perhaps a longer and less pleasant path than necessary
> but it was effective.
>
> -Jon
>
> On Thu, Oct 07, 2021 at 09:11:21PM -0400, Patrick Donnelly wrote:
> :Hello Jonathan,
> :
ring if there are any plans to fix this limitation soon?
Probably Quincy, no promises.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
nger
> understand old encoding version v < 7: Malformed input
You upgraded from v16.2.5 and not Octopus? I would expect your cluster
to crash when upgrading to any version of Pacific:
https://tracker.ceph.com/issues/51673
Only the crash error has changed from an assertion to an excepti
and). I read the release notes and there did seem to be some
> related fixes between 14.2.2 and 14.2.9 but nothing after 14.2.9.
>
> I can't seem to find any references to a problem like this anywhere.
> Does anyone have any ideas?
You're probably hitt
On Fri, Sep 17, 2021 at 11:30 AM Robert Sander
wrote:
>
> On 17.09.21 16:40, Patrick Donnelly wrote:
>
> > Stopping NFS should not have been necessary. But, yes, reducing
> > max_mds to 1 and disabling allow_standby_replay is required. See:
> > https://docs.ceph.com/
max_mds 1
> in 0,1
> up {}
> failed 0,1
Run:
ceph fs compat add_incompat cephfs 7 "mds uses inline data"
It's interesting you're in the same situation (two ranks). Are you
using cephadm? If not, were you not aware of the MDS upgrade procedure
[1]?
[1] http
ific/9a1ccf41c32446e1b31328e7d01ea8e4aaea8cbb/
for the monitors (only), and then run:
for i in 0 1; do ceph mds addfailed :$i --yes-i-really-mean-it ; done
it should fix it for you.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
last resort as it's not well tested with multiple ranks (you have rank
0 and 1). It's likely you'd lose metadata.
I will compile an addfailed command in a branch but you'll need to
download the packages and run it. Please be careful running
hidden/debugging commands.
--
Patrick
,6=dirfrag is stored in omap,8=no anchor table,9=file
> layout v2,10=snaprealm v2}
Please run:
ceph fs compat add_incompat cephfs 7 "mds uses inline data"
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E
ceph fs compat add_incompat 8 "no anchor table"
ceph fs compat add_incompat 9 "file layout v2"
ceph fs compat add_incompat 10 "snaprealm v2"
> I began to feel more and more that the issue was related to a damaged
> cephfs, from a recent set of server malfuncti
7f8105cf1700 1 mds.ceph3 Monitors have
> assigned me to become a standby.
>
> setting add_incompat 1 does also not work:
> # ceph fs compat cephfs add_incompat 1
> Error EINVAL: adding a feature requires a feature string
>
> Any ideas?
Please share `ceph fs dump`.
--
Patr
pping NFS should not have been necessary. But, yes, reducing
max_mds to 1 and disabling allow_standby_replay is required. See:
https://docs.ceph.com/en/pacific/cephfs/upgrading/#upgrading-the-mds-cluster
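For reference, the pre-upgrade steps from that page look roughly like this
(substitute your file system name, and restore your original max_mds afterwards):
ceph fs set <fs_name> max_mds 1
ceph fs set <fs_name> allow_standby_replay false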
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engi
"allow rw path=/kio" mon "allow r" osd
> "allow rw tag cephfs data=filesystem"
>
> Ist hat a bug or is that an intended behaviour change?
I'm not seeing a difference between the caps produced by `fs
authorize` and your `auth get-or-c
g born by an MDS. What is it?
Please try the resolutions suggested in: https://tracker.ceph.com/issues/45333
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
Hi Prayank,
Jan has a fix in progress here: https://github.com/ceph/ceph/pull/42893
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
in the "fs subvolume" interface did it but we've
not heard any other reports of this problem. Otherwise, nothing else
in Ceph internally uses it.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
On Sat, Aug 21, 2021 at 12:48 AM David Prude wrote:
>
> It appears my previous message may have been malformed. Attached is the
> mds log from the time period mentioned.
>
> -David
>
> On 8/20/21 8:59 PM, Patrick Donnelly wrote:
> > Hello David,
> >
> > On F
ent and
> then moved on to testing with our client.admin (with auth listed above).
I cannot reproduce the problem...
> We have tried explicitly setting "ceph fs set dncephfs allow_new_snaps
> true" which had no effect. We have search the mds logs and no entries
> appear on the
probably what's preventing promotion of
standbys. That's a new change in master (which is also being
backported to Pacific). Did you downgrade back to Pacific?
Try:
for i in $(seq 1 10); do ceph fs compat add_incompat $i; done
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Softwa
on how to troubleshot or
> resolve this would be most welcome.
Looks like: https://tracker.ceph.com/issues/49132
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
ployed until now that has not used CephFS?
Any cluster created at Jewel or later won't be affected. You only need
to consider clusters that were built pre-Jewel.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F
ceph::buffer::v15_2_0::list,
> unsigned long)+0x54) [0x55b3a5e6ed84]
> 4: main()
> 5: __libc_start_main()
> 6: _start()
> Aborted (core dumped)
>
> Basically, the above steps have the same workflow regarding to how monitor
> load the mdsmap from DB and
an easy way to check the release a cluster started
as. And unfortunately, there is no way to check for legacy data
structures. If your cluster has used CephFS at all since Jewel, it's
very unlikely there will be any in the mon stores. If you're not sure,
best to upgrade through v15.2.14
-released Octopus v15.2.14 before continuing on to
Pacific/Quincy. After a day's time, the Monitors will have cleared out
the old structures.
[1] https://tracker.ceph.com/issues/51673
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, C
": 803
},
{
"dirino": 1099859111936,
"dname": "6275606",
"version": 132338
},
{
"dirino": 1099511627776,
"dname":
"teuthology-2021-07-16_0
file_name && mv -f .hidden_file_name
> original_file_name
>
> -Patrick
> ________
> From: Patrick Donnelly
> Sent: Thursday, July 22, 2021 5:03 PM
> To: huxia...@horebdata.cn
> Cc: ceph-users
> Subject: [ceph-users] Re: How to make CephFS a ti
about the old files that already exist in FOLDER before
> executing the above command?
Correct.
> Should i mannually migrate those old files, and how?
Copy them.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunn
nts are highly appreciated,
We have an outstanding ticket for this but no one has yet taken it up:
https://tracker.ceph.com/issues/40285
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
is point (late reply, my fault), I'm not
sure it's worth the trouble.
> 3) I was somehow surprised by this, because I had thought that the new
> `ceph -s` multi-mds scrub status implied that multi-mds scrubbing was
> now working:
>
> task status:
> scrub status:
t possible to accurately account for the quota
usage prior to doing the rename. Rather than allow a quota to
potentially be massively overrun, we fell back to the old behavior of
not allowing it.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engin
rites promptly",
> from this sentence, it seems that "MDS_SLOW_REQUEST" also contains OSD
> operations by the MDS?
Yes. If you have slow metadata IO warnings you will likely also have
slow request warnings.
--
Patrick Donnelly, Ph.D.
He / Him /
> 30 secs, oldest
> blocked for 51123 secs MDS_SLOW_REQUEST 1 MDSs report slow requests
MDS_SLOW_REQUEST: RPCs from the client to the MDS are "slow", i.e. they
have not completed within 30 seconds.
MDS_SLOW_METADATA_IO: OSD operations by the MDS are not yet complete
after 30 seconds.
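To see which client RPCs are stuck, you can dump the operations currently in
flight on the active MDS (daemon name is a placeholder):
ceph tell mds.<name> dump_ops_in_flight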
--
nel client. You can
try the same command on ceph-fuse or maybe initiate a recursive scrub
on the MDS.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
s going on with the client requests and write operations.
I suggest you look at the "perf dump" statistics from the MDS (via
ceph tell or admin socket) over a period of time to get an idea what
operations it's performing. It's probable your workload changed
somehow and that i
w Monitor election
> 1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 calling
> monitor election
In the process of killing the active MDS, are you also killing a monitor?
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software E
; nautilus (stable)": 12
>},
>"mds": {
>"ceph version 14.2.19 (bb796b9b5bab9463106022eef406373182465d11)
> nautilus (stable)": 3
>},
>"rgw": {
>"ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9)
> nautilus (stable)": 9
It's a bug: https://tracker.ceph.com/issues/50060
On Wed, Dec 23, 2020 at 5:53 PM Alex Taylor wrote:
>
> Hi Patrick,
>
> Any updates? Looking forward to your reply :D
>
>
> On Thu, Dec 17, 2020 at 11:39 AM Patrick Donnelly wrote:
> >
> > On Wed, Dec
On Mon, Mar 15, 2021 at 10:42 AM Jeff Layton wrote:
> The question is, does the MDS you're using return an inode structure
> version >=2 ?
Yes, he needs to upgrade to at least nautilus. Mimic is missing commit
8469a81625180668a9dec840293013be019236b8.
--
Patrick Donnelly, Ph.D.
that pinning is in effect as intended?
IIRC, getfattr support was recently added. What client are you using?
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
specific use case I require snapshots on the subvolume
> group layer. It therefore seems better to just forego the abstraction as
> a whole and work on bare CephFS.
subvolumegroup snapshots will come back, probably in a minor release of Pacific.
--
Patrick Donnel
does this sort of damage mean? Is there anything
> I can do to recover these files?
Scrubbing should correct it. Try "recursive repair force" to see if
that helps. "force" will cause the MDS to revisit metadata that has
been scrubbed previously but unchanged since then.
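On Nautilus and later the scrub is started through ceph tell; a sketch,
assuming rank 0 and the root path:
ceph tell mds.<fs_name>:0 scrub start / recursive,repair,force
ceph tell mds.<fs_name>:0 scrub status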
--
> After looking at our monitoring history, it seems the mds cache is
> actually used more fully, but most of our servers are getting a weekly
> reboot by default. This clears the mds cache obviously. I wonder if
> that's a smart idea for an MDS node...? ;-)
No, it's not.
ume API. Note:
subvolume group snapshots are currently disabled (though they may not be in
your version of Octopus); we expect to bring them back soon.
> - What is the current recommendation regarding CephFS and max number of
> snapshots?
A given directory should have less than ~100 snapshots in
maybe I
> should ask a different question: Does a (ceph-fuse / kernel) client use
> the *cephfs flags* bit at all? If not than we don't have to focus on
> this, and we can conclude we cannot reproduce the issue on our test
> environment.
ceph-fuse/kernel client don't use th
On Thu, Dec 17, 2020 at 11:35 AM Stefan Kooman wrote:
>
> On 12/17/20 7:45 PM, Patrick Donnelly wrote:
>
> >
> > When a file system is newly created, it's assumed you want all the
> > stable features on, including multiple MDS, directory fragmentation,
> > s
> lifetime of a cluster? What exactly is the purpose of the filesystem
> flags bit?
When a file system is newly created, it's assumed you want all the
stable features on, including multiple MDS, directory fragmentation,
snapshots, etc. That's what those flags are for. If you
tly. These correspond to reserved feature bits
for unspecified older Ceph releases. Suggest you just set the
min_compat_client to jewel.
In any case, I think what you're asking is about the file system flags
and not the required_client_features.
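Assuming the OSD map setting is what is meant, the command would be roughly:
ceph osd set-require-min-compat-client jewel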
--
Patrick Donnelly, Ph.D.
He / Him / His
Princi
ducing it with the master branch and could not. It might
be due to an older fuse/ceph. I suggest you upgrade!
> 2. It works again with fuse_default_permissions=true, any drawbacks if
> this option is set?
Correctness (ironically, for you) and performance.
--
Patrick Donnelly, Ph.D.
He / Him
ax_decay_threshold 98304
> mds advanced mds_recall_warning_threshold 196608
> globaladvanced mon_compact_on_start true
>
> I haven't had any noticeable slow downs or crashes in a while with 3
> active MDS and 3 hot standbys.
Thanks for sharing the settings that worked fo
On Mon, Dec 7, 2020 at 12:06 PM Patrick Donnelly wrote:
>
> Hi Dan & Janek,
>
> On Sat, Dec 5, 2020 at 6:26 AM Dan van der Ster wrote:
> > My understanding is that the recall thresholds (see my list below)
> > should be scaled proportionally. OTOH, I haven't pla
at is the risk of
> setting that value to say 10TiB?
There is no known downside. Let us know how it goes!
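A sketch of what that would look like (the option takes a byte count; 10 TiB
expressed in bytes):
ceph config set mds mds_cache_memory_limit 10995116277760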
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
you have too large of a directory, things could
get ugly!)
> > I'm hopeful your problems will be addressed by:
> > https://tracker.ceph.com/issues/47307
> That does indeed sound a bit like it might fix these kind of issues.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principa
ow if it's missing information or if something could
be more clear.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
s for the MDS have you made?
I'm hopeful your problems will be addressed by:
https://tracker.ceph.com/issues/47307
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
ible that the clients dropped the caps
> already before the MDS request was handled/received.
Can you share any config changes you've made on the MDS?
Also, Mimic is EOL as you probably know. Please upgrade :)
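A quick way to list every non-default setting on the daemon is the admin
socket (daemon name is a placeholder):
ceph daemon mds.<name> config diff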
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal S
mon.ceph03 config set "mon_health_log_update_period" 30
> ceph tell mon.ceph03 config set "debug_mgr" "0/0"
>
> which made it better, but i really cant remember it all and would like
> to have the default values.
>
> Is there a way to reset those Log Va
he most stable but don't know
> if that's still the case.
You need to first upgrade to Nautilus in any case. n+2 releases is the
max delta between upgrades.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E
should I believe - the presentation or the official docs?
We expect to make multi-fs stable in Pacific.
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
older thread
> on the topic in the users-list and also a fix/workaround.
This is likely to be the problem. Please add the application tag to
your CephFS data pools:
https://docs.ceph.com/en/latest/rados/operations/pools/#associate-pool-to-application
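A sketch of tagging an existing data pool (pool name is a placeholder):
ceph osd pool application enable <data_pool> cephfs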
--
Patrick Donnelly, Ph.D.
He / Him
force MDS to change
> status to active and run all of the required directory checks in the
> background? How can I localise the root cause?
Link to a tracker issue where some discussion has taken place:
https://tracker.ceph.com/issues/47582
--
Patrick Donnelly, Ph.D.
He / Him / His
Princip
s fine with
> the MDS automatically deployed but there is no provision for using EC with
> the data pool
See "Using EC pools with CephFS" in
https://ceph.io/community/new-luminous-erasure-coding-rbd-cephfs/
I will make a note to improve the ceph documentation on this.
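In the meantime, a rough sketch of adding an EC data pool to an existing file
system (pool names, PG count, and mount path are placeholders):
ceph osd pool create cephfs_ec 64 erasure
ceph osd pool set cephfs_ec allow_ec_overwrites true
ceph osd pool application enable cephfs_ec cephfs
ceph fs add_data_pool <fs_name> cephfs_ec
setfattr -n ceph.dir.layout.pool -v cephfs_ec /mnt/cephfs/ecdir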
--
Patri
eeds updated for some reason as part of an upgrade (e.g.
Mimic and snapshot formats). It's not considered necessary to do it on
a routine basis. RADOS PG scrubbing is sufficient for ensuring that
the backing data is routinely checked for correctness/redundancy.
--
Patrick Donnelly, Ph.D.
He / H
ate.
>
> Gr. Stefan
>
> P.s. I think our only option was to get the active restarted at that
> point, but still.
Yes, there should be a note in the docs about that. It seems a new PR
is up to respond to this issue:
https://github.com/ceph/ceph/pull/36823
--
Patrick Donnelly, Ph.D.
He /
ke if
> osd.121 goes down, you can start it on some random node.
Why not?
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
ng. Feedback is welcome.
[0] https://github.com/batrick/ceph-linode
[1] https://www.linode.com/
[2] https://docs.ceph.com/docs/master/cephadm/
[3] https://github.com/batrick/ceph-linode/blob/master/cephadm.yml
Full disclosure: I have no relationship with Linode except as a customer.
--
Patrick Don
e increase MDS debugging:
ceph config set mds debug_mds 20
for the time it takes to reproduce.
Then the core dump may also be helpful:
https://docs.ceph.com/docs/master/man/8/ceph-post-file/
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
R
s.data'. It must
> be a valid data pool
Did you forget to add the data pool to the volume (file system)?
https://docs.ceph.com/docs/master/cephfs/administration/#file-systems
See "add_data_pool".
--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale
messages might appear there?
Any kind of cluster task. Right now (for CephFS), we just use it for
on-going scrubs. There's a bug where idle scrub is continually
reported. It will be resolved in Nautilus with this backport:
https://tracker.ceph.com/issues/46480
--
Patrick Don
ster/cephfs/administration/#taking-the-cluster-down-rapidly-for-deletion-or-disaster-recovery
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
he_memory_limit
> I’m concerned that reducing mds_cache_memory_limit even in very small
> increments may trigger a large recall of caps and overwhelm the MDS.
That used to be the case in older versions of Luminous but not any longer.
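If you do lower it and want to watch the effect, the MDS reports its current
cache usage (daemon name is a placeholder):
ceph tell mds.<name> cache status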
--
Patrick Don
d to
> cache pressure
> MDS_CACHE_OVERSIZED 1 MDSs report oversized cache
> mdsceph-mds1(mds.0): MDS cache is too large (91GB/32GB); 34400070
> inodes in use by clients, 3293 stray files
Can you share the client list? Use the `ceph tell mds.foo session ls` command.
--
Patrick Donnelly, Ph
there a way to change the default pool or some other kind of
> migration without having to recreate the FS?
Not until something like [1] is implemented.
If it's not broken for you, don't fix it.
[1] https://tracker.ceph.com/issues/40285
--
Patrick Donnelly, Ph.D.
He / Him / His
3,180,115]
>
> Last time, it seemed to just recover after about an hour all by it's self.
> Any way to speed this up?
We need more cluster information, error messages, client
versions/types, etc. to help.
--
Patrick Donnelly, Ph.D.
He / Him / Hi
--
>
> Does anyone have similar problems? Or if this behavior is by purpose, can you
> explain to me why this is the case?
> Thank you in advance for your time and thoughts.
Here's what Jeff Layton had to say (he didn't get the mail posting somehow):
"Yes that
osterity, a tracker was opened for this bug:
https://tracker.ceph.com/issues/44546
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
ome kind of bug as others have reported which is causing the
cache size / anonymous memory to continually increase. You will need
to post more information about the client type/version, cache usage,
perf dumps, and workload to help diagnose.
--
Patrick Donnelly, Ph.D.
He / Him / His
Se
> mds_recall_max_decay_rate = 2.5
It looks like your setting for mds_recall_max_caps is larger than
mds_recall_max_decay_threshold. Are you changing these configurations?
If so, why?
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301
ariables are removed.
Instead, follow this procedure:
https://docs.ceph.com/docs/octopus/cephfs/standby/#configuring-mds-file-system-affinity
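With that procedure the affinity is expressed per daemon via mds_join_fs,
e.g. (daemon and file system names are placeholders):
ceph config set mds.a mds_join_fs cephfs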
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
ason to
change anything. "If it's not broken, don't fix it."
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
space
utilization. Small files will still create at least one object in the
EC pool.
> Also, is it possible to insert a replicated data pool as the default on
> an already deployed CephFS, or will I need to create a new FS and copy
> the data over?
You must create a new file system at th
at inodes added/removed (to
identify churn):
"mds_mem": {
"ino": 2740340,
"ino+": 19461742,
"ino-": 16721402,
--
Patrick Donnelly, Ph.D.
He / Him / His
Senior Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD