[ceph-users] Re: Question about PR merge

2024-04-17 Thread Patrick Donnelly
as it's not even merged to main yet. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: Spam in log file

2024-03-25 Thread Patrick Donnelly
Nope. On Mon, Mar 25, 2024 at 8:33 AM Albert Shih wrote: > > On 25/03/2024 at 08:28:54-0400, Patrick Donnelly wrote > Hi, > > > > > The fix is in one of the next releases. Check the tracker ticket: > > https://tracker.ceph.com/issues/63166 > > O

[ceph-users] Re: Spam in log file

2024-03-25 Thread Patrick Donnelly
t ready > for session (expect reconnect) > Mar 25 13:18:39 cthulhu2 ceph-mgr[2843]: mgr.server handle_open ignoring open > from mds.cephfs.cthulhu2.dqahyt v2:145.238.187.185:6800/2763465960; not ready > for session (expect reconnect) > Mar 25 13:18:39 cthulhu2 ceph-

[ceph-users] Re: Cephfs error state with one bad file

2024-03-15 Thread Patrick Donnelly
Hi Sake, On Tue, Jan 2, 2024 at 4:02 AM Sake Ceph wrote: > > Hi again, hopefully for the last time with problems. > > We had a MDS crash earlier with the MDS staying in failed state and used a > command to reset the filesystem (this was wrong, I know now, thanks Patrick > Don

[ceph-users] Re: MDS subtree pinning

2024-03-15 Thread Patrick Donnelly
pp3 > /app4 > > Can I pin /app1 to MDS rank 0 and 1, This will be possible with: https://github.com/ceph/ceph/pull/52373 -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Re: reef 18.2.2 (hot-fix) QE validation status

2024-03-06 Thread Patrick Donnelly
On Wed, Mar 6, 2024 at 2:55 AM Venky Shankar wrote: > > +Patrick Donnelly > > On Tue, Mar 5, 2024 at 9:18 PM Yuri Weinstein wrote: > > > > Details of this release are summarized here: > > > > https://tracker.ceph.com/issues/64721#note-1 > >

[ceph-users] Ceph Leadership Team Meeting, 2024-02-28 Minutes

2024-02-28 Thread Patrick Donnelly
1.0 milestone: https://github.com/ceph/ceph/milestone/21 -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an

[ceph-users] Re: Direct ceph mount on desktops

2024-02-06 Thread Patrick Donnelly
e "recover_session" in the mount.ceph man page). Any dirty data would be lost. As for whether you should have clients that hibernate, it's not ideal. It could conceivably create problems if client machines hibernate longer than the blocklist duration (after eviction by the MDS). -- Patr

[ceph-users] Re: cephfs inode backtrace information

2024-01-31 Thread Patrick Donnelly
y this question is asked now when the file system already has a significant amount of data? Are you thinking about recreating the fs?) -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Re: mds crashes after up:replay state

2024-01-05 Thread Patrick Donnelly
failing in up:replay but shortly after reaching one of the later states. Check the mon logs to see what the FSMap changes were. Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D _

[ceph-users] Re: FS down - mds degraded

2023-12-21 Thread Patrick Donnelly
k the rank repaired. See end of: https://docs.ceph.com/en/latest/cephfs/administration/#daemons (ceph mds repaired <role>) I admit that is not easy to find. I will add a ticket to improve the documentation: https://tracker.ceph.com/issues/63885 -- Patrick Donnelly, Ph.D. He / Him / His
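
A minimal sketch of the repair step described above, assuming the damaged rank is rank 0 of a file system named cephfs:

  ceph health detail            # identify which rank is marked damaged
  ceph mds repaired cephfs:0    # clear the damaged flag so a standby can claim the rank
  ceph fs status cephfs         # confirm the rank comes back up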

[ceph-users] Re: FS down - mds degraded

2023-12-21 Thread Patrick Donnelly
On Thu, Dec 21, 2023 at 2:49 AM David C. wrote: > I would start by decrementing max_mds by 1: > ceph fs set atlassian-prod max_mds 2 This will have no positive effect. The monitors will not alter the number of ranks (i.e. stop a rank) if the cluster is degraded. -- Patrick Donnelly, Ph

[ceph-users] Re: FS down - mds degraded

2023-12-21 Thread Patrick Donnelly
h assumes that whatever caused the MDS to become damaged will reoccur when it restarts). -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.i

[ceph-users] Re: mds.0.journaler.pq(ro) _finish_read got error -2

2023-12-12 Thread Patrick Donnelly
ects lost? It would be helpful to know more about the circumstances of this "broken CephFS"? What Ceph version? -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files

2023-11-29 Thread Patrick Donnelly
Hi Sebastian, On Wed, Nov 29, 2023 at 3:11 PM Sebastian Knust wrote: > > Hello Patrick, > > On 27.11.23 19:05, Patrick Donnelly wrote: > > > > I would **really** love to see the debug logs from the MDS. Please > > upload them using ceph-post-file [1]. If you ca

[ceph-users] Re: MDS_DAMAGE in 17.2.7 / Cannot delete affected files

2023-11-27 Thread Patrick Donnelly
50f) [0x7f3fe5c9531f] > >15: > > (DispatchQueue::DispatchThread::entry()+0x11) [0x7f3fe5d5f381] > >16: > > /lib64/libpthread.so.0(+0x81ca) [0x7f3fe4a0b1ca] > >

[ceph-users] Re: Does cephfs ensure close-to-open consistency after enabling lazyio?

2023-11-27 Thread Patrick Donnelly
No. You must call lazyio_synchronize. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph

[ceph-users] Ceph Leadership Team Weekly Meeting Minutes 2023-11-08

2023-11-08 Thread Patrick Donnelly
? - User + Dev meeting next week - Topics include migration between EC profiles and challenges related to RGW zone replication - Casey can attend end of meeting - open nebula folks planning to do webinar; looking for speakers -- Patrick Donnelly, Ph.D. He / Him / His Red Hat

[ceph-users] Re: list cephfs dirfrags

2023-11-08 Thread Patrick Donnelly
ey exactly > If the dirfrag is not in cache on any rank then the dirfrag is "nowhere". It's only pinned to a rank if it's in cache. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D _

[ceph-users] Re: Specify priority for active MGR and MDS

2023-10-19 Thread Patrick Donnelly
will simply not function (go OOM). -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: No snap_schedule module in Octopus

2023-09-19 Thread Patrick Donnelly
I think that if "ceph mgr module enable snap_schedule" was not working > without the "--force" option, it was because something was wrong in my > Ceph install. > > Patrick > > Le 19/09/2023 à 14:29, Patrick Donnelly a écrit : > > https://docs.ceph.com/en/quincy

[ceph-users] Re: No snap_schedule module in Octopus

2023-09-19 Thread Patrick Donnelly
r help. > >> > >> Patrick > >> _______ > >> ceph-users mailing list -- ceph-users@ceph.io > >> To unsubscribe send an email to ceph-users-le...@ceph.io > > > > _

[ceph-users] Re: MDS daemons don't report any more

2023-09-11 Thread Patrick Donnelly
1k objects/s > > My first thought is that the status module failed. However, I don't manage to > restart it (always on). An MGR fail-over did not help. > > Any ideas what is going on here? > > Thanks and best regards, > = > Frank Schilder > AIT Risø Campus > Bygning 109, rum S14 > _

[ceph-users] Re: Client failing to respond to capability release

2023-09-01 Thread Patrick Donnelly
n oversized cache. * The session listing shows the session is quiet (the "session_cache_liveness" is near 0). However, the MDS should respect mds_min_caps_per_client by (a) not recalling more caps than mds_min_caps_per_client and (b) not complaining the client has caps < mds_min_caps
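
For context, a sketch of inspecting the knob and session fields named in this thread (the fs name is a placeholder; the jq selection assumes the session ls output contains the fields mentioned above):

  ceph config get mds mds_min_caps_per_client
  ceph tell mds.<fsname>:0 session ls | jq '.[] | {id, num_caps, session_cache_liveness}'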

[ceph-users] Re: When to use the auth profiles simple-rados-client and profile simple-rados-client-with-blocklist?

2023-09-01 Thread Patrick Donnelly
r the question in $SUBJECT: https://docs.ceph.com/en/reef/rados/api/libcephsqlite/#user You would want to use the simple-rados-client-with-blocklist profile for a libcephsqlite application. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B
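
A hedged sketch of minting a cap with that profile for a libcephsqlite client (the client name and pool are placeholders; the exact recommended cap string is in the docs linked above):

  ceph auth get-or-create client.libsql \
      mon 'profile simple-rados-client-with-blocklist' \
      osd 'allow rwx pool=sqlite-pool'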

[ceph-users] Re: 16.2.14 pacific QE validation status

2023-08-24 Thread Patrick Donnelly
ébastien Han > cephadm - Adam K > dashboard - Ernesto > > rgw - Casey > rbd - Ilya > krbd - Ilya > fs - Venky, Patrick approved https://tracker.ceph.com/projects/cephfs/wiki/Pacific#2023-August-22 -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer I

[ceph-users] Re: help, ceph fs status stuck with no response

2023-08-14 Thread Patrick Donnelly
information to help figure the cause. Such as: `ceph tell mds.X perf dump`, `ceph tell mds.X status`, `ceph fs dump`, and `ceph tell mgr.X perf dump` while this is occurring. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F

[ceph-users] Ceph Leadership Team Meeting: 2023-08-09 Minutes

2023-08-09 Thread Patrick Donnelly
uing his investigation. - Case is updating contributors to generate accurate credits for the new reef release: https://github.com/ceph/ceph/pull/52868 -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: help, ceph fs status stuck with no response

2023-08-07 Thread Patrick Donnelly
hat. You can use `ceph fs dump` to get most of the same information from the mons directly. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: snaptrim number of objects

2023-08-07 Thread Patrick Donnelly
ytes that need snaptrimming? Unfortunately, no. > Perhaps I can graph that > and see where the differences are. > > That won't explain why my cluster bogs down, but at least it gives > some visibility. Running 17.2.6 everywhere by the way. Please let us know how configuring snaptr

[ceph-users] Re: cephfs - unable to create new subvolume

2023-07-21 Thread Patrick Donnelly
/volumes/csi Then check if it works. If still not: setfattr -n ceph.dir.subvolume -v 0 /volumes/ try again, if still not: setfattr -n ceph.dir.subvolume -v 0 / Please let us know which directory fixed the issue for you. -- Patrick Donnelly, Ph.D. He / H

[ceph-users] Re: MDS cache is too large and crashes

2023-07-21 Thread Patrick Donnelly
> think it would be good to get some help. Where should I look next? It's this issue: https://tracker.ceph.com/issues/48673 Sorry I'm still evaluating the fix for it before merging. Hope to be done with it soon. -- Patrick Donnelly, Ph.D. He / Him / His Red H

[ceph-users] Re: Ceph Leadership Team Meeting, 2023-07-19 Minutes

2023-07-19 Thread Patrick Donnelly
Forgot the link: On Wed, Jul 19, 2023 at 2:20 PM Patrick Donnelly wrote: > > Hi folks, > > Today we discussed: > > - Reef is almost ready! The remaining issues are tracked in [1]. In > particular, an epel9 package is holding back the release. [1] https://pad.ceph.com/

[ceph-users] Ceph Leadership Team Meeting, 2023-07-19 Minutes

2023-07-19 Thread Patrick Donnelly
on this will be forthcoming once the proposal is finalized. The monthly user <-> dev meeting will be reevaluated in light of this, possibly continuing on as usual. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E25

[ceph-users] Re: immutable bit

2023-07-07 Thread Patrick Donnelly
see https://tracker.ceph.com/issues/10679 > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer I

[ceph-users] Re: MDSs report slow metadata IOs

2023-07-07 Thread Patrick Donnelly
:58 PM > [INF] > MDS health message cleared (mds.?): 100+ slow metadata IOs are blocked > 30 > secs, oldest blocked for 565 secs > > 7/7/23 4:44:58 PM > [WRN] > Health check update: 7 MDSs report slow metadata IOs (MDS_SLOW_METADATA_IO) > __

[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-06-16 Thread Patrick Donnelly
tead so you can have a look? Yes, please. > Janek > > > On 10/06/2023 15:23, Patrick Donnelly wrote: > > On Fri, Jun 9, 2023 at 3:27 AM Janek Bevendorff > > wrote: > >> Hi Patrick, > >> > >>> I'm afraid your ceph-post-file logs were lost to the neth

[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-06-10 Thread Patrick Donnelly
lease > > disable it since you're using ephemeral pinning: > > > > ceph config set mds mds_bal_interval 0 > > Done. > > Thanks for your help! > Janek > > > -- > > Bauhaus-Universität Weimar > Bauhausstr. 9a, R308 > 99423 Weimar, Germany > > P

[ceph-users] Re: CephFS metadata pool grows by two orders of magnitude while trimming (?) snapshots

2023-06-08 Thread Patrick Donnelly
tached script when the problem is occurring so I can investigate. I'll need a tarball of the outputs. Also, in the off-chance this is related to the MDS balancer, please disable it since you're using ephemeral pinning: ceph config set mds mds_bal_interval 0 -- Patrick Donnelly, Ph.D. He / Him / H
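
A quick way to confirm the balancer stays disabled after that change (a sketch):

  ceph config get mds mds_bal_interval    # 0 means the automatic balancer is off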

[ceph-users] Re: slow mds requests with random read test

2023-06-07 Thread Patrick Donnelly
00MB/s of write). especially the logs of slow requests are > irrelevant with testing ops. I am thinking it is something with cephfs kernel > client? > > Any other thoughts? > > Patrick Donnelly wrote on Wed, May 31, 2023 at 00:58: >> >> On Tue, May 30, 2023 at 8:42 AM Ben wrote: >

[ceph-users] Re: slow mds requests with random read test

2023-05-30 Thread Patrick Donnelly
blem? Your random read workload is too extreme for your cluster of OSDs. It's causing slow metadata ops for the MDS. To resolve this we would normally suggest allocating a set of OSDs on SSDs for use by the CephFS metadata pool to isolate the worklaods. -- Patrick Donnelly, Ph.D. He / Him / His

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-24 Thread Patrick Donnelly
On Wed, May 24, 2023 at 4:26 AM Stefan Kooman wrote: > > On 5/22/23 20:24, Patrick Donnelly wrote: > > > > > The original script is here: > > https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py > > > "# Suggested recovery sequence (fo

[ceph-users] Re: [Help appreciated] ceph mds damaged

2023-05-24 Thread Patrick Donnelly
mmediately and advise > the sender by return email or telephone. > > Deakin University does not warrant that this email and any attachments are > error or virus free. > > -Original Message- > From: Justin Li > Sent: Wednesday, May 24, 2023 8:25 AM > To: Patr

[ceph-users] Re: [Help appreciated] ceph mds damaged

2023-05-23 Thread Patrick Donnelly
r it. > [ERR] MDS_DAMAGE: 1 mds daemon damaged > fs cephfs mds.0 is damaged Do you have a complete log you can share? Try: https://docs.ceph.com/en/quincy/man/8/ceph-post-file/ To get your upgrade to complete, you may set: ceph config set mds mds_go_bad_corrupt_dentry false -- Patrick
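
A hedged sketch of the two steps above (the description string and the log path are placeholders):

  ceph-post-file -d "mds.0 damaged after upgrade" /var/log/ceph/ceph-mds.<host>.log
  ceph config set mds mds_go_bad_corrupt_dentry false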

[ceph-users] Re: Deleting a CephFS volume

2023-05-22 Thread Patrick Donnelly
circumstance will `ceph fs volume rm` even work if > it fails to delete a volume I just created? `fs rm` just removes the file system from the monitor maps. You still have the data pools lying around which is what the `volume rm` command is complaining about. Try: ceph config set globa

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-22 Thread Patrick Donnelly
e you willing to share this script? I would like to use it to scan our > CephFS before upgrading to 16.2.13. Do you run this script when the > filesystem is online / active? The original script is here: https://github.com/ceph/ceph/blob/main/src/tools/cephfs/first-damage.py -- Patri

[ceph-users] Re: MDS crashes to damaged metadata

2023-05-22 Thread Patrick Donnelly
is expected. Once the dentries are marked damaged, the MDS won't allow operations on those files (like those triggering tracker #38452). > I noticed "mds: catch damage to CDentry’s first member before persisting > (issue#58482, pr#50781, Patrick Donnelly)“ in the change logs for 16

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-10 Thread Patrick Donnelly
f in position 8: > invalid start byte > > Does that mean that the last inode listed in the output file is corrupt? > Any way I can fix it? > > The output file has 14 million lines. We have about 24.5 million objects > in the metadata pool. > > Janek > > > On 0

[ceph-users] Re: CephFS Scrub Questions

2023-05-04 Thread Patrick Donnelly
due regard to the amount of time they may > take...) cephfs-data-scan should only be employed for disaster recovery. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-us

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-03 Thread Patrick Donnelly
.1 No need to do any of the other steps if you just want a read-only check. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.

[ceph-users] Re: MDS "newly corrupt dentry" after patch version upgrade

2023-05-02 Thread Patrick Donnelly
tadata corruption. [1] https://docs.ceph.com/en/quincy/man/8/ceph-post-file/ [2] https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/ -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Ceph Leadership Team Meeting, 2023-04-12 Minutes

2023-04-12 Thread Patrick Donnelly
-minutes -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: ceph.v17 multi-mds ephemeral directory pinning: cannot set or retrieve extended attribute

2023-04-10 Thread Patrick Donnelly
tories. Confirm /home looks something like: ceph tell mds.<fsname>:0 dump tree /home 0 | jq '.[0].dirfrags[] | .dir_auth' "0" "0" "1" "1" "1" "1" "0" "0" Which tells you the dirfrags for /home are distributed across the ranks (in
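
For reference, a sketch of how the distributed ephemeral pin discussed in this thread is set, assuming the file system is mounted at /mnt/cephfs and mds_export_ephemeral_distributed is enabled:

  setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home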

[ceph-users] Re: MDS stuck in "up:replay"

2023-02-23 Thread Patrick Donnelly
. My filesystems are > still completely down. > > Cheers, > Thomas > > On 22.02.23 18:36, Patrick Donnelly wrote: > > On Wed, Feb 22, 2023 at 12:10 PM Thomas Widhalm > > wrote: > >> > >> Hi, > >> > >> Thanks for the idea

[ceph-users] Re: MDS stuck in "up:replay"

2023-02-22 Thread Patrick Donnelly
s I gave were for producing hopefully useful debug logs. Not intended to fix the problem for you. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-u

[ceph-users] Re: [Quincy] Module 'devicehealth' has failed: disk I/O error

2023-02-22 Thread Patrick Donnelly
"os_version": "8", > "os_version_id": "8", > "process_name": "ceph-mgr", > "stack_sig": > "7e506cc2729d5a18403f0373447bb825b42aafa2405fb0e5cfffc2896b093ed8", > "timestamp": &quo

[ceph-users] Re: MDS stuck in "up:replay"

2023-02-22 Thread Patrick Donnelly
:replay. Use: ceph config set mds debug_ms 5 ceph config set mds debug_mds 10 and ceph fs fail X ceph fs set X joinable true to get fresh logs from the MDS to see what's going with the messages to the OSDs. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Enginee
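
Once fresh logs are captured, the debug overrides would normally be dropped again (a sketch):

  ceph config rm mds debug_ms
  ceph config rm mds debug_mds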

[ceph-users] Re: Problem with IO after renaming File System .data pool

2023-02-22 Thread Patrick Donnelly
the credential changed as expected. FWIW, I tested that data pool renames do not break client I/O for a cap generated with `ceph fs authorize...`. It works fine. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___

[ceph-users] Re: 17.2.5 ceph fs status: AssertionError

2023-02-22 Thread Patrick Donnelly
rt - Hosting > http://www.heinlein-support.de > > Tel: 030-405051-43 > Fax: 030-405051-19 > > Zwangsangaben lt. §35a GmbHG: > HRB 93818 B / Amtsgericht Berlin-Charlottenburg, > Geschäftsführer: Peer Heinlein -- Sitz: Berlin >

[ceph-users] Re: Retrieve number of read/write operations for a particular file in Cephfs

2023-01-20 Thread Patrick Donnelly
the "perf dump" from ceph-fuse to measure. The osd write/reads are not fine-grained though. -- Patrick Donnelly, Ph.D. He / Him / His Red Hat Partner Engineer IBM, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-us

[ceph-users] Re: MDS crashes to damaged metadata

2023-01-08 Thread Patrick Donnelly
atch? We fixed the MTU Issue going back to > 1500 on all nodes in the ceph public network on the weekend also. I doubt it. > If you need a debug level 20 log of the ScatterLock for further analysis, i > could schedule snapshots at the end of our workdays and increase the debug > level 5

[ceph-users] Re: mds stuck in standby, not one active

2022-12-15 Thread Patrick Donnelly
e8 at 2022-12-15T02:32:18.407978Z > > As suggested I was going to upgrade the ceph cluster to 16.2.7 to fix > the mds issue, but it seems none of the running standby daemons is > responding. Suggest also looking at the cephadm logs which may expl
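
A hedged sketch of pulling recent cephadm entries from the cluster log (the count is arbitrary):

  ceph log last 100 info cephadm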

[ceph-users] Re: mds stuck in standby, not one active

2022-12-15 Thread Patrick Donnelly
> running daemons. One daemon should be stopped and removed. > > Do you suggest to force remove these daemons or what could be the > preferred workaround? Hard to say without more information. Please share: ceph fs dump ceph status ceph health detail -- Patrick Donnelly, Ph.D. He

[ceph-users] Re: mds stuck in standby, not one active

2022-12-13 Thread Patrick Donnelly
standby seq 1 > join_fscid=1 addr > [v2:192.168.50.134:1a90/b8b1f33c,v1:192.168.50.134:1a91/b8b1f33c] compat > {c=[1],r=[1],i=[1]}] > [mds.ceph_fs.store3.vcnwzh{ffff:916aff7} state up:standby seq 1 > join_fscid=1 addr > [v2:192.168.50.133:1a90/49cb4e4,v1:192.168.50.133:1a91/4

[ceph-users] Re: mds stuck in standby, not one active

2022-12-13 Thread Patrick Donnelly
The Ceph FS pool is green and clean. Please share: ceph status ceph fs dump -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.

[ceph-users] Re: MDS crashes to damaged metadata

2022-11-30 Thread Patrick Donnelly
> exist. Basically our MDS Daemons are crashing each time, when someone tries to > delete a file which does not exist in the data pool but metadata says > otherwise. > > Any suggestions how to fix this problem? > > > Is this it? > > https://

[ceph-users] Re: MDS crashes to damaged metadata

2022-11-30 Thread Patrick Donnelly
oes not exist in the data pool but metadata says > otherwise. > > Any suggestions how to fix this problem? Is this it? https://tracker.ceph.com/issues/38452 Are you running postgres on CephFS by chance? -- Patrick Donnelly, Ph.D. He / Him / His Principa

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-29 Thread Patrick Donnelly
The features may be enabled via the mds_export_ephemeral_random and mds_export_ephemeral_distributed configuration options." Otherwise, maybe you found a bug. I would suggest keeping your round-robin script until you can upgrade to Pacific or Quincy. -- Patrick Donnelly, Ph.D. He / Him / H

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-18 Thread Patrick Donnelly
ndicates they are "previews". > If it is implemented, I would like to get it working - if this is possible at > all. Would you still take a look at the data? I'm willing to look. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GP

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-18 Thread Patrick Donnelly
"mds": { > "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) > octopus (stable)": 12 > }, > "overall": { > "ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) > octopus (stable)": 1070 > } &g

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-18 Thread Patrick Donnelly
> this? Please share the version you're using. "/hpc/home/user" should not show up in the subtree output. If possible, can you privately share with me the output of: - `ceph versions` - `ceph fs dump` - `get subtrees` on each active MDS - `dump tree /hpc/home 0` on each active MDS F

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-18 Thread Patrick Donnelly
dropcaches on client nodes after job completion, so there is > potential for reloading data)? The export only happens once the directory is loaded into cache. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 1

[ceph-users] Re: MDS internal op exportdir despite ephemeral pinning

2022-11-16 Thread Patrick Donnelly
was pruned from the cache, someone did /readdir on that directory thereby loading it into cache, then the MDS authoritative for /home (probably 0?) exported that directory to wherever it should go. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, In

[ceph-users] Re: Temporary shutdown of subcluster and cephfs

2022-10-25 Thread Patrick Donnelly
n. They are not given any time to flush remaining I/O. FYI as this may interest you: we have a ticket to set a flag on the file system to prevent new client mounts: https://tracker.ceph.com/issues/57090 -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F8

[ceph-users] Re: Temporary shutdown of subcluster and cephfs

2022-10-24 Thread Patrick Donnelly
~10 seconds after the last client unmounts to give the MDS time to write out to its journal any outstanding events. > , that is, will an FS come up again after > > - fs fail > ... > - fs set joinable true Yes. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software En

[ceph-users] Re: CephFS constant high write I/O to the metadata pool

2022-10-14 Thread Patrick Donnelly
2 > --- > Any ideas where to look at? Check the perf dump output of the mds: ceph tell mds.<fsname>:0 perf dump over a period of time to identify what's going on. You can also
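
A sketch of sampling the perf counters over a period of time as suggested (the interval and the mds_log section are only examples of what to watch):

  for i in $(seq 12); do
      date
      ceph tell mds.<fsname>:0 perf dump | jq '.mds_log'
      sleep 30
  done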

[ceph-users] Ceph Leadership Team Meeting Minutes - 2022 Oct 12

2022-10-12 Thread Patrick Donnelly
client + Experienced an outage because some k8s cluster (with cephfs pvcs) was using kernel 5.16.13, which has a known null deref bug, fixed in 5.18. (kernel was coming with Fedora Core OS) + What is the recommended k8s host os in the rhel world for ceph kclients? -- Patrick Donnelly, Ph.D

[ceph-users] Re: MDS Performance and PG/PGP value

2022-10-10 Thread Patrick Donnelly
ATA_IO)' > > floki.log.4.gz floki.log.3.gz floki.log.2.gz floki.log.1.gz floki.log > > floki.log.4.gz:6883 > > floki.log.3.gz:11794 > > floki.log.2.gz:3391 > > floki.log.1.gz:1180 > > floki.log:122 > > If I have the opportunity, I will try to run some be

[ceph-users] Re: CephFS MDS sizing

2022-09-12 Thread Patrick Donnelly
bjecter.omap_rd to see how often the MDS goes out to directory objects to read dentries. You can also look at the mds_mem.ino+ mds_mem.ino- to see how often inodes go in and out of the cache. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG:
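
A hedged way to pull just those counters out of the perf dump (counter names as given above; the jq expression is only illustrative):

  ceph tell mds.<fsname>:0 perf dump | \
      jq '{omap_rd: .objecter.omap_rd, ino_in: .mds_mem["ino+"], ino_out: .mds_mem["ino-"]}'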

[ceph-users] Re: Potential bug in cephfs-data-scan?

2022-08-19 Thread Patrick Donnelly
GB, I stopped it, as something obviously was very wrong. > > It turns out some users had symlinks that looped and even a user had a > symlink to "/". Symlinks are not stored in the data pool. This should be irrelevant. -- Patrick Donnelly, Ph.D. He / Him /

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-16 Thread Patrick Donnelly
2 at 3:13 AM Patrick Donnelly wrote: >> >> On Mon, Aug 15, 2022 at 11:39 AM Daniel Williams wrote: >> > >> > Using ubuntu with apt repository from ceph. >> > >> > Ok that helped me figure out that it's .mgr not mgr. >> > # ceph -v >&

[ceph-users] Re: The next quincy point release

2022-08-15 Thread Patrick Donnelly
s week. > > Dev leads please tag all PRs needed to be included ("needs-qa") ASAP > so they can be tested and merged on time. > > Thx > YuriW > > ___ > Dev mailing list -- d...@ceph.io > To unsubscribe send an email

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Patrick Donnelly
> amd64SQLite3 VFS for Ceph > > Attached ceph-sqlite.log No real good hint in the log unfortunately. I will need the core dump to see where things went wrong. Can you upload it with https://docs.ceph.com/en/quincy/man/8/ceph-post-file/ ?

[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-15 Thread Patrick Donnelly
base via: env CEPH_ARGS='--log_to_file true --log-file foo.log --debug_cephsqlite 20 --debug_ms 1' sqlite3 ... -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-u
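
For completeness, the same debug environment wrapped around an interactive session against the ceph VFS (pool, namespace, and database name are placeholders; the .load/URI form follows the libcephsqlite docs):

  env CEPH_ARGS='--log_to_file true --log-file foo.log --debug_cephsqlite 20 --debug_ms 1' \
      sqlite3 -cmd '.load libcephsqlite.so' -cmd '.open file:///<pool>:<namespace>/<db>.db?vfs=ceph'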

[ceph-users] Re: CephFS: permissions of the .snap directory do not inherit ACLs

2022-08-09 Thread Patrick Donnelly
ket with details about your environment and an example. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: CephFS standby-replay has more dns/inos/dirs than the active mds

2022-07-19 Thread Patrick Donnelly
emons have similar stats to the active MDS > they're protecting? What could be causing this to happen? > > Thanks, > Bryan > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@c

[ceph-users] Re: Single vs multiple cephfs file systems pros and cons

2022-07-19 Thread Patrick Donnelly
ilesystems? Major consideration points: cost of having multiple MDS running (more memory/cpu used), inability to move files between the two hierarchies without full copies, and straightforward scaling w/ different file systems. Active-active file systems can often function in a similar way with subtre

[ceph-users] Re: MDS upgrade to Quincy

2022-05-18 Thread Patrick Donnelly
Hi Jimmy, On Fri, Apr 22, 2022 at 11:02 AM Jimmy Spets wrote: > > Does cephadm automatically reduce ranks to 1 or does that have to be done > manually? Automatically. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat,

[ceph-users] Re: MDS upgrade to Quincy

2022-04-21 Thread Patrick Donnelly
ady the first thing you do when upgrading a cluster). -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: v17.2.0 Quincy released

2022-04-20 Thread Patrick Donnelly
rom Octopus > (and or pacific < 16.2.7) to Quincy? It's not in the release notes, but > just double checking here [1]. Yes it is necessary. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D _

[ceph-users] Re: Upgrade 16.2.6 -> 16.2.7 - MON assertion failure

2022-01-31 Thread Patrick Donnelly
1 ceph-mon[960]: 9: main() > 2021-12-09T14:56:40.103+00:00 tstmon01 ceph-mon[960]: 10: > __libc_start_main() > 2021-12-09T14:56:40.103+00:00 tstmon01 ceph-mon[960]: 11: _start() I just want to followup that this is indeed a new bug (not an existing bug as I originally though

[ceph-users] Re: cephfs: [ERR] loaded dup inode

2022-01-20 Thread Patrick Donnelly
es that have been snapshotted then they stick around in the stray directory until the snapshot is deleted. There's no way to force purging until the snapshot is also deleted. For this reason, the stray directory size can grow without bound. You need to either upgrade to Pacific where the s

[ceph-users] Re: cephfs: [ERR] loaded dup inode

2022-01-16 Thread Patrick Donnelly
https://github.com/ceph/ceph/pull/44514 which is significantly less disruptive than `ls -lR` or `find`. -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D ___ ceph-users mailing

[ceph-users] Re: fs rename: invalid command

2022-01-07 Thread Patrick Donnelly
ncy might not be available for a while and I was hoping to avoid a > bunch of dependencies on the current FS name in use. > > > On Jan 4, 2022, at 8:16 AM, Patrick Donnelly wrote: > > > > "ceph fs rename" will be available in Quincy. see also: > > https://tr

[ceph-users] Re: fs rename: invalid command

2022-01-04 Thread Patrick Donnelly
_ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io -- Patrick Donnelly, Ph.D. He / Him / His Principal Software Engineer Red Hat, Inc. GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

[ceph-users] Re: Issue Upgrading to 16.2.7 related to mon_mds_skip_sanity.

2021-12-23 Thread Patrick Donnelly
state up:standby seq 1 addr [v2: > 10.10.0.32:6800/881362738,v1:10.10.0.32:6801/881362738] compat > {c=[1],r=[1],i=[7ff]}] The "i=[77f]" indicates to me this may be an MDS older than 16.2.7. This should not otherwise be possible. In any case, I'm not exactly sure how this happened t

[ceph-users] Re: Upgrade 16.2.6 -> 16.2.7 - MON assertion failure

2021-12-09 Thread Patrick Donnelly
ped auto-restarting. Please disable mon_mds_skip_sanity in the mons ceph.conf: [mon] mon_mds_skip_sanity = false The cephadm upgrade sequence is already doing this but I forgot (sorry!) to mention this is required for manual upgrades in the release notes. Pl

[ceph-users] Re: Ganesha + cephfs - multiple exports

2021-12-07 Thread Patrick Donnelly
his, i.e. to > export multiple subdirectories from cephfs (to different NFS clients > perhaps) but share the libcephfs connection? Export blocks do not share libcephfs connections. I don't think there are any plans to change that. [1] https://docs.ceph.com/en/pacific/mgr/nfs/ -- Patrick Donn

[ceph-users] Re: Annoying MDS_CLIENT_RECALL Warning

2021-11-19 Thread Patrick Donnelly
0", > "last_scrub_version": 0, > "symlink": "", > "xattrs": [], > "dirfragtree": { > "splits": [] > }, > "old_inodes": [], > "oldest_snap": 1844674407370

[ceph-users] Re: Annoying MDS_CLIENT_RECALL Warning

2021-11-18 Thread Patrick Donnelly
g, or how can we get rid of it? This reminds me of https://tracker.ceph.com/issues/46830 Suggest monitoring the client session information from the MDS as Dan suggested. You can also try increasing mds_min_caps_working_set to see if that helps. -- Patrick
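
A sketch of adjusting the knob mentioned above (the value is only an example; check the current setting first):

  ceph config get mds mds_min_caps_working_set
  ceph config set mds mds_min_caps_working_set 20000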
