[ceph-users] Re: Single vs multiple cephfs file systems pros and cons

2022-07-19 Thread Patrick Donnelly
filesystems? Major consideration points: cost of having multiple MDS running (more memory/cpu used), inability to move files between the two hierarchies without full copies, and straightforward scaling w/ different file systems. Active-active file systems can often function in a similar way with s

[ceph-users] Re: MDS upgrade to Quincy

2022-05-18 Thread Patrick Donnelly
Hi Jimmy, On Fri, Apr 22, 2022 at 11:02 AM Jimmy Spets wrote: > > Does cephadm automatically reduce ranks to 1 or does that have to be done > manually? Automatically. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat,

[ceph-users] Re: MDS upgrade to Quincy

2022-04-21 Thread Patrick Donnelly
which is already the first thing you do when upgrading a cluster). -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: v17.2.0 Quincy released

2022-04-20 Thread Patrick Donnelly
from Octopus > (and/or Pacific < 16.2.7) to Quincy? It's not in the release notes, but > just double checking here [1]. Yes it is necessary. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Upgrade 16.2.6 -> 16.2.7 - MON assertion failure

2022-01-31 Thread Patrick Donnelly
:00 tstmon01 ceph-mon[960]: 9: main() > 2021-12-09T14:56:40.103+00:00 tstmon01 ceph-mon[960]: 10: > __libc_start_main() > 2021-12-09T14:56:40.103+00:00 tstmon01 ceph-mon[960]: 11: _start() I just want to followup that this is indeed a new bug (not an existing bug as I or

[ceph-users] Re: cephfs: [ERR] loaded dup inode

2022-01-20 Thread Patrick Donnelly
Do you have snapshots? If you've deleted the directories that have been snapshotted then they stick around in the stray directory until the snapshot is deleted. There's no way to force purging until the snapshot is also deleted. For this reason, the stray directory size can grow

[ceph-users] Re: cephfs: [ERR] loaded dup inode

2022-01-16 Thread Patrick Donnelly
https://github.com/ceph/ceph/pull/44514 which is significantly less disruptive than `ls -lR` or `find`. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: fs rename: invalid command

2022-01-07 Thread Patrick Donnelly
Quincy might not be available for a while and I was hoping to avoid a > bunch of dependencies on the current FS name in use. > > > On Jan 4, 2022, at 8:16 AM, Patrick Donnelly wrote: > > > > "ceph fs rename" will be available in Quincy. see also: > > https
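Once on Quincy, the rename is a single command. A minimal sketch, assuming the confirmation flag on your release matches (file system names are placeholders):
  ceph fs rename <old_fs_name> <new_fs_name> --yes-i-really-mean-it
  ceph fs ls   # confirm the new name is in place before updating client mounts/caps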

[ceph-users] Re: fs rename: invalid command

2022-01-04 Thread Patrick Donnelly
-- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Issue Upgrading to 16.2.7 related to mon_mds_skip_sanity.

2021-12-23 Thread Patrick Donnelly
> [mds.ceph1{-1:448321175} state up:standby seq 1 addr [v2: > 10.10.0.32:6800/881362738,v1:10.10.0.32:6801/881362738] compat > {c=[1],r=[1],i=[7ff]}] The "i=[77f]" indicates to me this may be an MDS older than 16.2.7. This should not otherwise be possible. In any case, I'm n

[ceph-users] Re: Upgrade 16.2.6 -> 16.2.7 - MON assertion failure

2021-12-09 Thread Patrick Donnelly
> service stopped auto-restarting. Please disable mon_mds_skip_sanity in the mons ceph.conf: [mon] mon_mds_skip_sanity = false The cephadm upgrade sequence is already doing this but I forgot (sorry!) to mention this is required for manual upgrades in the rel

[ceph-users] Re: Ganesha + cephfs - multiple exports

2021-12-07 Thread Patrick Donnelly
Is there a better way to do this, i.e. to > export multiple subdirectories from cephfs (to different NFS clients > perhaps) but share the libcephfs connection? Export blocks do not share libcephfs connections. I don't think there are any plans to change that. [1] https://docs.ceph.com/en

[ceph-users] Re: Annoying MDS_CLIENT_RECALL Warning

2021-11-19 Thread Patrick Donnelly
ot;0.00", > "last_scrub_version": 0, > "symlink": "", > "xattrs": [], > "dirfragtree": { > "splits": [] > }, > "old_inodes": [], > "oldest_snap": 184

[ceph-users] Re: Annoying MDS_CLIENT_RECALL Warning

2021-11-18 Thread Patrick Donnelly
arning, or how can we get rid of it? This reminds me of https://tracker.ceph.com/issues/46830 Suggest monitoring the client session information from the MDS as Dan suggested. You can also try increasing mds_min_caps_working_set to see if that helps. -- Pat

[ceph-users] Re: High cephfs MDS latency and CPU load

2021-11-11 Thread Patrick Donnelly
; >> [ceph] > >> Nov 5 02:19:14 popeye-mgr-0-10 kernel: dispatch+0xb2/0x1e0 [ceph] > >> Nov 5 02:19:14 popeye-mgr-0-10 kernel: > >> process_message+0x7b/0x130 [libceph] > >> Nov 5 02:19:14 popeye-mgr-0-10 kernel: try_read+0x340/0x5e0

[ceph-users] Re: Pacific: parallel PG reads?

2021-11-11 Thread Patrick Donnelly
specifically scales by adding more objects/PGs. This should naturally spread load on primary OSDs across the cluster. Replicas are for redundancy not for improving (read) throughput. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc.

[ceph-users] Re: High cephfs MDS latency and CPU load

2021-11-04 Thread Patrick Donnelly
fn_anonymous' is > 100% busy. Pointing the debugger to it and getting a stack trace at > random times always shows a similar picture: Thanks for the report and useful stack trace. This is probably corrected by the new use of a "fair" mutex in the MDS: https://tracker.ceph.

[ceph-users] Re: A change in Ceph leadership...

2021-10-18 Thread Patrick Donnelly
nd the project today. I have no doubt that Ceph will > continue to thrive. I want to echo what others have already said about your establishing a welcoming FOSS project. Being part of the Ceph team is a highlight of my life. We all owe you a debt of gratitude. Thanks and good luck on your n

[ceph-users] Re: ceph fs status output

2021-10-14 Thread Patrick Donnelly
-- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA

[ceph-users] Re: Broken mon state after (attempted) 16.2.5 -> 16.2.6 upgrade

2021-10-12 Thread Patrick Donnelly
ventually ended up taking > eveything down and rebuilding the monstore using > monstore-tool. Perhaps a longer and less pleasant path than necessary > but it was effective. > > -Jon > > On Thu, Oct 07, 2021 at 09:11:21PM -0400, Patrick Donnelly wrote: > :Hello Jonathan, > :

[ceph-users] Re: Multi-MDS CephFS upgrades limitation

2021-10-07 Thread Patrick Donnelly
wondering if there are any plans to fix this limitation soon? Probably Quincy, no promises. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Broken mon state after (attempted) 16.2.5 -> 16.2.6 upgrade

2021-10-07 Thread Patrick Donnelly
nger > understand old encoding version v < 7: Malformed input You upgraded from v16.2.5 and not Octopus? I would expect your cluster to crash when upgrading to any version of Pacific: https://tracker.ceph.com/issues/51673 Only the crash error has changed from an assertion to an excepti

[ceph-users] Re: Corruption on cluster

2021-09-21 Thread Patrick Donnelly
and). I read the release notes and there did seem to be some > related fixes between 14.2.2 and 14.2.9 but nothing after 14.2.9. > > I can't seem to find any references to a problem like this anywhere. > Does anyone have any ideas? You're probably hitt

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
On Fri, Sep 17, 2021 at 11:30 AM Robert Sander wrote: > > On 17.09.21 16:40, Patrick Donnelly wrote: > > > Stopping NFS should not have been necessary. But, yes, reducing > > max_mds to 1 and disabling allow_standby_replay is required. See: > > https://docs.ceph.com/

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
max_mds 1 > in 0,1 > up {} > failed 0,1 Run: ceph fs compat add_incompat cephfs 7 "mds uses inline data" It's interesting you're in the same situation (two ranks). Are you using cephadm? If not, were you not aware of the MDS upgrade procedure [1]? [1] http

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
ific/9a1ccf41c32446e1b31328e7d01ea8e4aaea8cbb/ for the monitors (only), and then run: for i in 0 1; do ceph mds addfailed :$i --yes-i-really-mean-it ; done it should fix it for you. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
last resort as it's not well tested with multiple ranks (you have rank 0 and 1). It's likely you'd lose metadata. I will compile an addfailed command in a branch but you'll need to download the packages and run it. Please be careful running hidden/debugging commands. -- Patrick

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
,6=dirfrag is stored in omap,8=no anchor table,9=file > layout v2,10=snaprealm v2} Please run: ceph fs compat add_incompat cephfs 7 "mds uses inline data" -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
ph fs compat add_incompat 8 "no anchor table" ceph fs compat add_incompat 9 "file layout v2" ceph fs compat add_incompat 10 "snaprealm v2" > I began to feel more and more that the issue was related to a damaged > cephfs, from a recent set of server malfuncti

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
7f8105cf1700 1 mds.ceph3 Monitors have > assigned me to become a standby. > > setting add_incompat 1 does also not work: > # ceph fs compat cephfs add_incompat 1 > Error EINVAL: adding a feature requires a feature string > > Any ideas? Please share `ceph fs dump`. -- Patr

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-09-17 Thread Patrick Donnelly
pping NFS should not have been necessary. But, yes, reducing max_mds to 1 and disabling allow_standby_replay is required. See: https://docs.ceph.com/en/pacific/cephfs/upgrading/#upgrading-the-mds-cluster -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engi
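A minimal sketch of that documented sequence (file system name and original max_mds value are placeholders):
  ceph fs set <fs_name> allow_standby_replay false
  ceph fs set <fs_name> max_mds 1
  # wait for the extra ranks to stop, upgrade/restart the MDS daemons, then restore:
  ceph fs set <fs_name> max_mds <previous_value>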

[ceph-users] Re: ceph fs authorization changed?

2021-09-03 Thread Patrick Donnelly
"allow rw path=/kio" mon "allow r" osd > "allow rw tag cephfs data=filesystem" > > Ist hat a bug or is that an intended behaviour change? I'm not seeing a difference between the caps produced by `fs authorize` and your `auth get-or-c

[ceph-users] Re: LARGE_OMAP_OBJECTS: any proper action possible?

2021-08-27 Thread Patrick Donnelly
g born by an MDS. What is it? Please try the resolutions suggested in: https://tracker.ceph.com/issues/45333 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Re: ceph snap-schedule retention is not properly being implemented

2021-08-23 Thread Patrick Donnelly
Hi Prayank, Jan has a fix in progress here: https://github.com/ceph/ceph/pull/42893 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Cephfs cannot create snapshots in subdirs of / with mds = "allow *"

2021-08-23 Thread Patrick Donnelly
in the "fs subvolume" interface did it but we've not heard any other reports of this problem. Otherwise, nothing else in Ceph internally uses it. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Cephfs cannot create snapshots in subdirs of / with mds = "allow *"

2021-08-21 Thread Patrick Donnelly
On Sat, Aug 21, 2021 at 12:48 AM David Prude wrote: > > It appears my previous message may have been malformed. Attached is the > mds log from the time period mentioned. > > -David > > On 8/20/21 8:59 PM, Patrick Donnelly wrote: > > Hello David, > > > > On F

[ceph-users] Re: Cephfs cannot create snapshots in subdirs of / with mds = "allow *"

2021-08-20 Thread Patrick Donnelly
ent and > then moved on to testing with our client.admin (with auth listed above). I cannot reproduce the problem... > We have tried explicitly setting "ceph fs set dncephfs allow_new_snaps > true" which had no effect. We have search the mds logs and no entries > appear on the

[ceph-users] Re: Cephfs - MDS all up:standby, not becoming up:active

2021-08-20 Thread Patrick Donnelly
probably what's preventing promotion of standbys. That's a new change in master (which is also being backported to Pacific). Did you downgrade back to Pacific? Try: for i in $(seq 1 10); do ceph fs compat add_incompat $i; done -- Patrick Donnelly, Ph.D. He / Him / His Principal Softwa

[ceph-users] Re: Multiple cephfs MDS crashes with same assert_condition: state == LOCK_XLOCK || state == LOCK_XLOCKDONE

2021-08-20 Thread Patrick Donnelly
on how to troubleshoot or > resolve this would be most welcome. Looks like: https://tracker.ceph.com/issues/49132 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: PSA: upgrading older clusters without CephFS

2021-08-16 Thread Patrick Donnelly
ployed until now that has not used CephFS? Any cluster created at Jewel or later won't be affected. You only need to consider clusters that were built pre-Jewel. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F

[ceph-users] Re: PSA: upgrading older clusters without CephFS

2021-08-16 Thread Patrick Donnelly
ceph::buffer::v15_2_0::list, > unsigned long)+0x54) [0x55b3a5e6ed84] > 4: main() > 5: __libc_start_main() > 6: _start() > Aborted (core dumped) > > Basically, the above steps have the same workflow regarding to how monitor > load the mdsmap from DB and

[ceph-users] Re: PSA: upgrading older clusters without CephFS

2021-08-06 Thread Patrick Donnelly
an easy way to check the release a cluster started as. And unfortunately, there is no way to check for legacy data structures. If your cluster has used CephFS at all since Jewel, it's very unlikely there will be any in the mon stores. If you're not sure, best to upgrade through v15.2.14

[ceph-users] PSA: upgrading older clusters without CephFS

2021-08-05 Thread Patrick Donnelly
-released Octopus v15.2.14 before continuing on to Pacific/Quincy. After a day's time, the Monitors will have cleared out the old structures. [1] https://tracker.ceph.com/issues/51673 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, C

[ceph-users] Re: Locating files on pool

2021-07-27 Thread Patrick Donnelly
": 803 }, { "dirino": 1099859111936, "dname": "6275606", "version": 132338 }, { "dirino": 1099511627776, "dname": "teuthology-2021-07-16_0

[ceph-users] Re: How to make CephFS a tiered file system?

2021-07-22 Thread Patrick Donnelly
file_name && mv -f .hidden_file_name > original_file_name > > -Patrick > ________ > From: Patrick Donnelly > Sent: Thursday, July 22, 2021 5:03 PM > To: huxia...@horebdata.cn > Cc: ceph-users > Subject: [ceph-users] Re: How to make CephFS a ti

[ceph-users] Re: How to make CephFS a tiered file system?

2021-07-22 Thread Patrick Donnelly
about the old files that already exist in FOLDER before > executing the above command? Correct. > Should i mannually migrate those old files, and how? Copy them. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunn
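A rough sketch of the approach discussed in this thread, assuming the new pool has already been added to the file system with `ceph fs add_data_pool` (pool and paths are placeholders):
  setfattr -n ceph.dir.layout.pool -v <new_data_pool> /mnt/cephfs/FOLDER   # newly created files land in the new pool
  # existing files keep their old layout; rewrite each one to migrate it, e.g.:
  cp -a original_file_name .hidden_file_name && mv -f .hidden_file_name original_file_name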

[ceph-users] Re: How to make CephFS a tiered file system?

2021-07-21 Thread Patrick Donnelly
nts are highly appreciated, We have an outstanding ticket for this but no one has yet taken it up: https://tracker.ceph.com/issues/40285 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: cephfs forward scrubbing docs

2021-06-30 Thread Patrick Donnelly
is point (late reply, my fault), I'm not sure it's worth the trouble. > 3) I was somehow surprised by this, because I had thought that the new > `ceph -s` multi-mds scrub status implied that multi-mds scrubbing was > now working: > > task status: > scrub status:

[ceph-users] Re: ceph fs mv does copy, not move

2021-06-23 Thread Patrick Donnelly
t possible to accurately account for the quota usage prior to doing the rename. Rather than allow a quota to potentially be massively overrun, we fell back to the old behavior of not allowing it. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engin

[ceph-users] Re: In "ceph health detail", what's the diff between MDS_SLOW_METADATA_IO and MDS_SLOW_REQUEST?

2021-06-23 Thread Patrick Donnelly
rites promptly", > from this sentence, it seems that "MDS_SLOW_REQUEST" also contains OSD > operations by the MDS? Yes. If you have slow metadata IO warnings you will likely also have slow request warnings. -- Patrick Donnelly, Ph.D. He / Him /

[ceph-users] Re: In "ceph health detail", what's the diff between MDS_SLOW_METADATA_IO and MDS_SLOW_REQUEST?

2021-06-21 Thread Patrick Donnelly
> 30 secs, oldest > blocked for 51123 secs MDS_SLOW_REQUEST 1 MDSs report slow requests MDS_SLOW_REQUEST: RPCs from the client to the MDS are "slow", i.e. not complete in less than 30 seconds. MDS_SLOW_METADATA_IO: OSD operations by the MDS are not yet complete after 30 seconds. --

[ceph-users] Re: Force processing of num_strays in mds

2021-05-21 Thread Patrick Donnelly
nel client. You can try the same command on ceph-fuse or maybe initiate a recursive scrub on the MDS. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Re: Write Ops on CephFS Increasing exponentially

2021-05-10 Thread Patrick Donnelly
s going on with the client requests and write operations. I suggest you look at the "perf dump" statistics from the MDS (via ceph tell or admin socket) over a period of time to get an idea what operations it's performing. It's probable your workload changed somehow and that i
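A simple way to sample that over time (MDS name is a placeholder; `ceph daemon mds.<name> perf dump` on the MDS host works as well):
  ceph tell mds.<name> perf dump > perf-before.json
  sleep 60
  ceph tell mds.<name> perf dump > perf-after.json
  # diff the request/operation counters between the two samples to see what grew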

[ceph-users] Re: [ Ceph MDS MON Config Variables ] Failover Delay issue

2021-05-03 Thread Patrick Donnelly
w Monitor election > 1. [log] 0 log_channel(cluster) log [INF] : mon.cephnode1 calling > monitor election In the process of killing the active MDS, are you also killing a monitor? -- Patrick Donnelly, Ph.D. He / Him / His Principal Software E

[ceph-users] Re: MDS replay takes forever and cephfs is down

2021-04-21 Thread Patrick Donnelly
; nautilus (stable)": 12 >}, >"mds": { >"ceph version 14.2.19 (bb796b9b5bab9463106022eef406373182465d11) > nautilus (stable)": 3 >}, >"rgw": { >"ceph version 14.2.18 (befbc92f3c11eedd8626487211d200c0b44786d9) > nautilus (stable)": 9

[ceph-users] Re: ceph-fuse false passed X_OK check

2021-03-30 Thread Patrick Donnelly
It's a bug: https://tracker.ceph.com/issues/50060 On Wed, Dec 23, 2020 at 5:53 PM Alex Taylor wrote: > > Hi Patrick, > > Any updates? Looking forward to your reply :D > > > On Thu, Dec 17, 2020 at 11:39 AM Patrick Donnelly wrote: > > > > On Wed, Dec

[ceph-users] Re: MDS pinning: ceph.dir.pin: No such attribute

2021-03-15 Thread Patrick Donnelly
On Mon, Mar 15, 2021 at 10:42 AM Jeff Layton wrote: > The question is, does the MDS you're using return an inode structure > version >=2 ? Yes, he needs to upgrade to at least nautilus. Mimic is missing commit 8469a81625180668a9dec840293013be019236b8. -- Patrick Donnelly, Ph.D.

[ceph-users] Re: MDS pinning: ceph.dir.pin: No such attribute

2021-03-15 Thread Patrick Donnelly
that pinning is in effect as intended? IIRC, getfattr support was recently added. What client are you using? -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___
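For reference, a sketch of setting and reading the pin from a mounted client (path and rank are placeholders):
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/some/dir   # pin the subtree to MDS rank 1
  getfattr -n ceph.dir.pin /mnt/cephfs/some/dir        # reading it back needs a recent enough client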

[ceph-users] Re: CephFS: side effects of not using ceph-mgr volumes / subvolumes

2021-03-03 Thread Patrick Donnelly
specific use case I require snapshots on the subvolume > group layer. It therefore seems better to just forego the abstraction as > a whole and work on bare CephFS. subvolumegroup snapshots will come back, probably in a minor release of Pacific. -- Patrick Donnel

[ceph-users] Re: MDSs report damaged metadata

2021-02-25 Thread Patrick Donnelly
does this sort of damage mean? Is there anything > I can do to recover these files? Scrubbing should correct it. Try "recursive repair force" to see if that helps. "force" will cause the MDS to revisit metadata that has been scrubbed previously but unchanged since then. --
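On Nautilus and later, the forward scrub can be kicked off with something like the following (fs name is a placeholder):
  ceph tell mds.<fs_name>:0 scrub start / recursive,repair,force
  ceph tell mds.<fs_name>:0 scrub status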

[ceph-users] Re: ceph slow at 80% full, mds nodes lots of unused memory

2021-02-24 Thread Patrick Donnelly
> After looking at our monitoring history, it seems the mds cache is > actually used more fully, but most of our servers are getting a weekly > reboot by default. This clears the mds cache obviously. I wonder if > that's a smart idea for an MDS node...? ;-) No, it's not.

[ceph-users] Re: CephFS Octopus snapshots / kworker at 100% / kernel vs. fuse client

2021-02-15 Thread Patrick Donnelly
ume API. Note: subvolume group snapshots are currently disabled (but may not be for your version of Octopus) but we expect to bring it back soon. > - What is the current recommendation regarding CephFS and max number of > snapshots? A given directory should have less than ~100 snapshots in

[ceph-users] Re: cephfs flags question

2020-12-19 Thread Patrick Donnelly
maybe I > should ask a different question: Does a (ceph-fuse / kernel) client use > the *cephfs flags* bit at all? If not than we don't have to focus on > this, and we can conclude we cannot reproduce the issue on our test > environment. ceph-fuse/kernel client don't use th

[ceph-users] Re: cephfs flags question

2020-12-17 Thread Patrick Donnelly
On Thu, Dec 17, 2020 at 11:35 AM Stefan Kooman wrote: > > On 12/17/20 7:45 PM, Patrick Donnelly wrote: > > > > > When a file system is newly created, it's assumed you want all the > > stable features on, including multiple MDS, directory fragmentation, > > s

[ceph-users] Re: cephfs flags question

2020-12-17 Thread Patrick Donnelly
> lifetime of a cluster? What exactly is the purpose of the filesystem > flags bit? When a file system is newly created, it's assumed you want all the stable features on, including multiple MDS, directory fragmentation, snapshots, etc. That's what those flags are for. If you

[ceph-users] Re: cephfs flags question

2020-12-17 Thread Patrick Donnelly
tly. These correspond to reserved feature bits for unspecified older Ceph releases. Suggest you just set the min_compat_client to jewel. In any case, I think what you're asking is about the file system flags and not the required_client_features. -- Patrick Donnelly, Ph.D. He / Him / His Princi

[ceph-users] Re: ceph-fuse false passed X_OK check

2020-12-16 Thread Patrick Donnelly
ducing it with the master branch and could not. It might be due to an older fuse/ceph. I suggest you upgrade! > 2. It works again with fuse_default_permissions=true, any drawbacks if > this option is set? Correctness (ironically, for you) and performance. -- Patrick Donnelly, Ph.D. He / Him

[ceph-users] Re: Provide more documentation for MDS performance tuning on large file systems

2020-12-15 Thread Patrick Donnelly
ax_decay_threshold 98304 > mds advanced mds_recall_warning_threshold 196608 > globaladvanced mon_compact_on_start true > > I haven't had any noticeable slow downs or crashes in a while with 3 > active MDS and 3 hot standbys. Thanks for sharing the settings that worked fo

[ceph-users] Re: Provide more documentation for MDS performance tuning on large file systems

2020-12-14 Thread Patrick Donnelly
On Mon, Dec 7, 2020 at 12:06 PM Patrick Donnelly wrote: > > Hi Dan & Janek, > > On Sat, Dec 5, 2020 at 6:26 AM Dan van der Ster wrote: > > My understanding is that the recall thresholds (see my list below) > > should be scaled proportionally. OTOH, I haven't pla

[ceph-users] Re: CephFS max_file_size

2020-12-11 Thread Patrick Donnelly
What is the risk of > setting that value to say 10TiB? There is no known downside. Let us know how it goes! -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
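As a sketch, raising the limit to 10 TiB would look like this (fs name is a placeholder; the value is in bytes):
  ceph fs set <fs_name> max_file_size 10995116277760   # 10 * 2^40 bytes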

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-07 Thread Patrick Donnelly
you have too large of a directory, things could get ugly!) > > I'm hopeful your problems will be addressed by: > > https://tracker.ceph.com/issues/47307 > That does indeed sound a bit like it might fix these kind of issues. -- Patrick Donnelly, Ph.D. He / Him / His Principa

[ceph-users] Re: Provide more documentation for MDS performance tuning on large file systems

2020-12-07 Thread Patrick Donnelly
ow if it's missing information or if something could be more clear. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: MDS lost, Filesystem degraded and wont mount

2020-12-07 Thread Patrick Donnelly
s for the MDS have you made? I'm hopeful your problems will be addressed by: https://tracker.ceph.com/issues/47307 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D _

[ceph-users] Re: MDS_CLIENT_LATE_RELEASE: 3 clients failing to respond to capability release

2020-10-30 Thread Patrick Donnelly
ible that the clients dropped the caps > already before the MDS request was handled/received. Can you share any config changes you've made on the MDS? Also, Mimic is EOL as you probably know. Please upgrade :) -- Patrick Donnelly, Ph.D. He / Him / His Principal S

[ceph-users] Re: How to reset Log Levels

2020-10-29 Thread Patrick Donnelly
mon.ceph03 config set "mon_health_log_update_period" 30 > ceph tell mon.ceph03 config set "debug_mgr" "0/0" > > which made it better, but i really cant remember it all and would like > to have the default values. > > Is there a way to reset those Log Va

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-23 Thread Patrick Donnelly
he most stable but don't know > if that's still the case. You need to first upgrade to Nautilus in any case. n+2 releases is the max delta between upgrades. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E

[ceph-users] Re: Is cephfs multi-volume support stable?

2020-10-10 Thread Patrick Donnelly
should I believe - the presentation or the official docs? We expect to make multi-fs stable in Pacific. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: cephfs tag not working

2020-10-01 Thread Patrick Donnelly
older thread > on the topic in the users-list and also a fix/workaround. This is likely to be the problem. Please add the application tag to your CephFS data pools: https://docs.ceph.com/en/latest/rados/operations/pools/#associate-pool-to-application -- Patrick Donnelly, Ph.D. He / Him
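A sketch of tagging a data pool per the linked doc (pool and fs names are placeholders; check the doc for the exact key/value your release expects):
  ceph osd pool application enable <data_pool> cephfs
  ceph osd pool application set <data_pool> cephfs data <fs_name>
  ceph osd pool application get <data_pool>   # verify the tag is present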

[ceph-users] Re: Ceph MDS stays in "up:replay" for hours. MDS failover takes 10-15 hours.

2020-09-22 Thread Patrick Donnelly
force MDS to change > status to active and run all of the required directory checks in the > background? How can I localise the root cause? Link to a tracker issue where some discussion has taken place: https://tracker.ceph.com/issues/47582 -- Patrick Donnelly, Ph.D. He / Him / His Princip

[ceph-users] Re: Unable to start mds when creating cephfs volume with erasure encoding data pool

2020-09-14 Thread Patrick Donnelly
s fine with > the MDS automatically deployed but there is no provision for using EC with > the data pool See "Using EC pools with CephFS" in https://ceph.io/community/new-luminous-erasure-coding-rbd-cephfs/ I will make a note to improve the ceph documentation on this. -- Patri

[ceph-users] Re: damaged cephfs

2020-09-04 Thread Patrick Donnelly
eeds updated for some reason as part of an upgrade (e.g. Mimic and snapshot formats). It's not considered necessary to do it on a routine basis. RADOS PG scrubbing is sufficient for ensuring that the backing data is routinely checked for correctness/redundancy. -- Patrick Donnelly, Ph.D. He / H

[ceph-users] Re: MDS troubleshooting documentation: ceph daemon mds. dump cache

2020-08-31 Thread Patrick Donnelly
ate. > > Gr. Stefan > > P.s. I think our only option was to get the active restarted at that > point, but still. Yes, there should be a note in the docs about that. It seems a new PR is up to respond to this issue: https://github.com/ceph/ceph/pull/36823 -- Patrick Donnelly, Ph.D. He /

[ceph-users] Re: [ANN] A framework for deploying Octopus using cephadm in the cloud

2020-08-01 Thread Patrick Donnelly
ke if > osd.121 goes down, you can start it on some random node. Why not? -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] [ANN] A framework for deploying Octopus using cephadm in the cloud

2020-07-31 Thread Patrick Donnelly
ng. Feedback is welcome. [0] https://github.com/batrick/ceph-linode [1] https://www.linode.com/ [2] https://docs.ceph.com/docs/master/cephadm/ [3] https://github.com/batrick/ceph-linode/blob/master/cephadm.yml Full disclosure: I have no relationship with Linode except as a customer. -- Patrick Don

[ceph-users] Re: [Ceph Octopus 15.2.3 ] MDS crashed suddenly

2020-07-20 Thread Patrick Donnelly
e increase MDS debugging: ceph config set mds debug_mds 20 for the time it takes to reproduce. Then the core dump may also be helpful: https://docs.ceph.com/docs/master/man/8/ceph-post-file/ -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer R

[ceph-users] Re: cephfs: creating two subvolumegroups with dedicated data pool...

2020-07-13 Thread Patrick Donnelly
s.data'. It must > be a valid data pool Did you forget to add the data pool to the volume (file system)? https://docs.ceph.com/docs/master/cephfs/administration/#file-systems See "add_data_pool". -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat Sunnyvale
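A sketch of the order of operations, with hypothetical pool/volume/group names:
  ceph osd pool create cephfs.mygroup.data
  ceph fs add_data_pool <vol_name> cephfs.mygroup.data        # register the pool with the file system first
  ceph fs subvolumegroup create <vol_name> mygroup --pool_layout cephfs.mygroup.data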

[ceph-users] Re: "task status" section in ceph -s output new?

2020-07-13 Thread Patrick Donnelly
messages might appear there? Any kind of cluster task. Right now (for CephFS), we just use it for on-going scrubs. There's a bug where idle scrub is continually reported. It will be resolved in Nautilus with this backport: https://tracker.ceph.com/issues/46480 -- Patrick Don

[ceph-users] Re: How to remove one of two filesystems

2020-06-22 Thread Patrick Donnelly
ster/cephfs/administration/#taking-the-cluster-down-rapidly-for-deletion-or-disaster-recovery -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Reducing RAM usage on production MDS

2020-05-28 Thread Patrick Donnelly
he_memory_limit > I’m concerned that reducing mds_cache_memory_limit even in very small > increments may trigger a large recall of caps and overwhelm the MDS. That used to be the case in older versions of Luminous but not any longer. -- Patrick Don

[ceph-users] Re: MDS_CACHE_OVERSIZED warning

2020-05-11 Thread Patrick Donnelly
d to > cache pressure > MDS_CACHE_OVERSIZED 1 MDSs report oversized cache > mdsceph-mds1(mds.0): MDS cache is too large (91GB/32GB); 34400070 > inodes in use by clients, 3293 stray files Can you share the client list? Use the `ceph tell mds.foo session ls` command. -- Patrick Donnelly, Ph
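A sketch of that, assuming rank 0 is the affected MDS:
  ceph tell mds.0 session ls > sessions.json
  # the per-session "num_caps" field shows which clients are holding most of the ~34M inodes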

[ceph-users] Re: cephfs change/migrate default data pool

2020-05-07 Thread Patrick Donnelly
there a way to change the default pool or some other kind of > migration without having to recreate the FS? Not until something like [1] is implemented. If it's not broken for you, don't fix it. [1] https://tracker.ceph.com/issues/40285 -- Patrick Donnelly, Ph.D. He / Him / His

[ceph-users] Re: Cluster blacklists MDS, can't start

2020-05-06 Thread Patrick Donnelly
3,180,115] > > Last time, it seemed to just recover after about an hour all by it's self. > Any way to speed this up? We need more cluster information, error messages, client versions/types, etc. to help. -- Patrick Donnelly, Ph.D. He / Him / Hi

[ceph-users] Re: CephFS with active-active NFS Ganesha

2020-05-06 Thread Patrick Donnelly
-- > > Does anyone have similar problems? Or if this behavior is by purpose, can you > explain to me why this is the case? > Thank you in advance for your time and thoughts. Here's what Jeff Layton had to say (he didn't get the mail posting somehow): "Yes that

[ceph-users] Re: ceph: Can't lookup inode 1 (err: -13)

2020-05-06 Thread Patrick Donnelly
For posterity, a tracker was opened for this bug: https://tracker.ceph.com/issues/44546 -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: How many MDS servers

2020-05-06 Thread Patrick Donnelly
ome kind of bug as others have reported which is causing the cache size / anonymous memory to continually increase. You will need to post more information about the client type/version, cache usage, perf dumps, and workload to help diagnose. -- Patrick Donnelly, Ph.D. He / Him / His Se

[ceph-users] Re: ceph mds can't recall client caps anymore

2020-04-07 Thread Patrick Donnelly
>mds_recall_max_decay_rate = 2.5 It looks like your setting for mds_recall_max_caps is larger than mds_recall_max_decay_threshold. Are you changing these configurations? If so, why? -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301
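One way to check what has been overridden, assuming a centralized config and a running daemon (daemon and option names are placeholders):
  ceph config show mds.<id> | grep mds_recall   # shows values and where they come from
  ceph config rm mds <option>                   # reverts a single option to its default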

[ceph-users] Re: Multiple CephFS creation

2020-03-31 Thread Patrick Donnelly
ariables are removed. Instead, follow this procedure.: https://docs.ceph.com/docs/octopus/cephfs/standby/#configuring-mds-file-system-affinity -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Re: Need clarification on CephFS, EC Pools, and File Layouts

2020-03-04 Thread Patrick Donnelly
ason to change anything. "If it's not broken, don't fix it." -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Need clarification on CephFS, EC Pools, and File Layouts

2020-03-04 Thread Patrick Donnelly
space utilization. Small files will still create at least one object in the EC pool. > Also, is it possible to insert a replicated data pool as the default on > an already deployed CephFS, or will I need to create a new FS and copy > the data over? You must create a new file system at th

[ceph-users] Re: Frequest LARGE_OMAP_OBJECTS in cephfs metadata pool

2020-02-25 Thread Patrick Donnelly
at inodes added/removed (to identify churn): "mds_mem": { "ino": 2740340, "ino+": 19461742, "ino-": 16721402, -- Patrick Donnelly, Ph.D. He / Him / His Senior Software Engineer Red Hat Sunnyvale, CA GPG: 19F28A586F808C2402351B93C3301A3E258DD
