I agree it should be in the release notes or documentation; it took me three days to track down, and I was searching for all kinds of combinations of "cephfs nfs" and "ceph nfs permissions". Perhaps just having this thread archived will make it easier for the next person to find the answer, though.
________________________________
From: Eugen Block <ebl...@nde.ag>
Sent: Thursday, March 16, 2023 10:30 AM
To: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
Cc: ceph-users@ceph.io <ceph-users@ceph.io>
Subject: Re: [ceph-users] Re: Ceph NFS data - cannot read files, getattr returns NFS4ERR_PERM

You found the right keywords yourself (application metadata), but I'm
glad it worked for you. I only found this tracker issue [2], which fixes
the behavior when issuing an "fs new" command, and it contains the same
workaround (set the application metadata). Maybe this should be part of
the (upgrade) release notes and the documentation so that users with
older clusters are aware that their cephfs pools might be missing the
correct metadata. Maybe it is already, I didn't find anything yet.

[2] https://tracker.ceph.com/issues/43761

Zitat von Wyll Ingersoll <wyllys.ingers...@keepertech.com>:

> YES!! That fixed it.
>
> I issued the following commands to update the application_metadata
> on the cephfs pools and now it's working. THANK YOU!
>
>     ceph osd pool application set cephfs_data cephfs data cephfs
>     ceph osd pool application set cephfs_metadata cephfs data cephfs
>
> Now the application_metadata looks correct on both pools and I can
> read/write the data as expected.
>
> Is this an official Ceph bug or only recorded in the SUSE bug DB?
> It should be in Ceph, IMO, since it is not SUSE specific.
>
> -Wyllys Ingersoll
>
> ________________________________
> From: Eugen Block <ebl...@nde.ag>
> Sent: Thursday, March 16, 2023 10:04 AM
> To: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
> Cc: ceph-users@ceph.io <ceph-users@ceph.io>
> Subject: Re: [ceph-users] Re: Ceph NFS data - cannot read files,
> getattr returns NFS4ERR_PERM
>
> It sounds a bit like this [1], doesn't it? Setting the application
> metadata is just:
>
>     ceph osd pool application set <cephfs_metadata_pool> cephfs <cephfs_data_pool> cephfs
>
> [1] https://www.suse.com/support/kb/doc/?id=000020812
>
> Zitat von Wyll Ingersoll <wyllys.ingers...@keepertech.com>:
>
>> Yes, with this last upgrade (Pacific) we migrated to the
>> orchestrated model where everything is in containers. Previously, we
>> managed nfs-ganesha ourselves and exported shares using FSAL VFS
>> over /cephfs mounted on the NFS server.
>>
>> With orchestrator-managed Ceph NFS, ganesha runs in a container and
>> uses FSAL CEPH instead, which accesses cephfs data via libcephfs
>> instead of reading from the mounted local FS. I suspect that is part
>> of the problem and is related to getting the right permissions.
>>
>> Here is something I noticed. On a separate cluster running the same
>> release, the cephfs_data pool has the following application metadata:
>>
>>     "application_metadata": {
>>         "cephfs": {
>>             "metadata": "cephfs"
>>         }
>>     }
>>
>> But on the upgraded cluster the application_metadata on the
>> cephfs_data/metadata pools looks like:
>>
>>     "application_metadata": {
>>         "cephfs": {}
>>     }
>>
>> I'm wondering if that has something to do with the permission issues,
>> because the caps use "tag cephfs data=cephfs" to grant the RW
>> permission for the OSDs. How do I update the application_metadata on
>> the pools? I don't see a subcommand in the rados utility, and I'm
>> hoping I don't have to write something to do it myself. Though I do
>> see there are APIs for updating it in librados, so I could write a
>> short C utility to make the change if necessary.
>>
>> thanks!
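For reference, a minimal check-and-fix sequence for the pool application metadata discussed above, using the pool names (cephfs_data/cephfs_metadata) and filesystem name ("cephfs") from this thread; verify the names on your own cluster with "ceph fs ls" before running the set commands:

    # list the filesystem and its data/metadata pools
    ceph fs ls

    # show the application tags currently set on each pool; on an affected
    # cluster the "cephfs" application is enabled but carries no key/value tags
    ceph osd pool application get cephfs_data
    ceph osd pool application get cephfs_metadata

    # add the missing tag (the commands that resolved the issue in this thread)
    ceph osd pool application set cephfs_data cephfs data cephfs
    ceph osd pool application set cephfs_metadata cephfs data cephfs

    # verify; the OSD cap "allow rw tag cephfs data=cephfs" matches this tag
    ceph osd pool application get cephfs_data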
>>
>> ________________________________
>> From: Eugen Block <ebl...@nde.ag>
>> Sent: Thursday, March 16, 2023 9:48 AM
>> To: Wyll Ingersoll <wyllys.ingers...@keepertech.com>
>> Cc: ceph-users@ceph.io <ceph-users@ceph.io>
>> Subject: Re: [ceph-users] Re: Ceph NFS data - cannot read files,
>> getattr returns NFS4ERR_PERM
>>
>> That would have been my next question, if it had worked before. So the
>> only difference is the nfs-ganesha deployment and different (newer?)
>> clients than before? Unfortunately, I don't have any ganesha instance
>> running in any of my (test) clusters. Maybe someone else can chime in.
>>
>> Zitat von Wyll Ingersoll <wyllys.ingers...@keepertech.com>:
>>
>>> Nope, that didn't work. I updated the caps to add "allow r path=/"
>>> to the mds, but it made no difference. I restarted the nfs
>>> container and unmounted/remounted the share on the client.
>>>
>>> The caps now look like:
>>>
>>>     key = xxx
>>>     caps mds = "allow rw path=/exports/nfs/foobar, allow r path=/"
>>>     caps mon = "allow r"
>>>     caps osd = "allow rw pool=.nfs namespace=cephfs, allow rw tag cephfs data=cephfs"
>>>
>>> This is really frustrating. We can mount the shares and get
>>> directory listings, and even create directories and files (empty),
>>> but cannot read or write any actual data. This would seem to
>>> indicate a permission problem writing to the cephfs data pool, but
>>> we haven't tinkered with any of the caps or permissions.
>>>
>>> tcpdump shows lots of errors when trying to read a file from the share:
>>>
>>>     NFS reply xid 771352420 reply ok 96 getattr ERROR: Operation not permitted
>>>
>>> One thing to note - this is a system that has been around for years
>>> and has been upgraded through many iterations of Ceph. The cephfs
>>> data/metadata pools were probably created on Hammer or Luminous,
>>> though I'm not sure whether that matters. Everything else operates
>>> correctly AFAIK.
>>>
>>> I may take this over to the Ganesha mailing list to see if they have
>>> any ideas.
>>>
>>> thanks!
>>>
>>> ________________________________
>>> From: Eugen Block <ebl...@nde.ag>
>>> Sent: Thursday, March 16, 2023 3:33 AM
>>> To: ceph-users@ceph.io <ceph-users@ceph.io>
>>> Subject: [ceph-users] Re: Ceph NFS data - cannot read files, getattr
>>> returns NFS4ERR_PERM
>>>
>>> Hi,
>>>
>>> we saw this on a Nautilus cluster when clients were updated, so we had
>>> to modify the client caps to allow read access to the "/" directory.
>>> There's an excerpt in the SUSE docs [1] for that:
>>>
>>>> If clients with path restriction are used, the MDS capabilities need
>>>> to include read access to the root directory.
>>>> The allow r path=/ part means that path-restricted clients are able
>>>> to see the root volume, but cannot write to it. This may be an issue
>>>> for use cases where complete isolation is a requirement.
>>>
>>> Can you update the caps and test again?
>>>
>>> Regards,
>>> Eugen
>>>
>>> [1] https://documentation.suse.com/ses/7.1/html/ses-all/cha-ceph-cephfs.html
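For reference, applying the caps change suggested above to the NFS client from this thread would look roughly like the sketch below. Note that "ceph auth caps" replaces the entire cap set for an entity, so the existing mon/osd caps have to be restated alongside the updated mds caps; the entity name and paths are the ones shown elsewhere in this thread.

    # show the current caps for the ganesha client
    ceph auth get client.nfs.cephfs.7

    # add "allow r path=/" to the mds caps while keeping the existing
    # mon and osd caps unchanged
    ceph auth caps client.nfs.cephfs.7 \
        mds 'allow rw path=/exports/nfs/foobar, allow r path=/' \
        mon 'allow r' \
        osd 'allow rw pool=.nfs namespace=cephfs, allow rw tag cephfs data=cephfs'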
>>>
>>> Zitat von Wyll Ingersoll <wyllys.ingers...@keepertech.com>:
>>>
>>>> ceph pacific 16.2.11 (cephadm managed)
>>>>
>>>> I have configured some NFS mounts from the ceph GUI from cephfs. We
>>>> can mount the filesystems and view file/directory listings, but
>>>> cannot read any file data. The permissions on the shares are RW.
>>>> We mount from the client using "vers=4.1".
>>>>
>>>> Looking at debug logs from the container running nfs-ganesha, I see
>>>> the following errors when trying to read a file's content:
>>>>
>>>>     15/03/2023 15:27:13 : epoch 6411e209 : gw01 : ganesha.nfsd-7[svc_8]
>>>>     complete_op :NFS4 :DEBUG :Status of OP_READ in position 2 =
>>>>     NFS4ERR_PERM, op response size is 7480 total response size is 7568
>>>>     15/03/2023 15:27:13 : epoch 6411e209 : gw01 : ganesha.nfsd-7[svc_8]
>>>>     complete_nfs4_compound :NFS4 :DEBUG :End status = NFS4ERR_PERM
>>>>     lastindex = 3
>>>>
>>>> Also, watching the TCP traffic, I see errors in the NFS protocol
>>>> corresponding to these messages:
>>>>
>>>>     11:44:43.745570 IP xxx.747 > gw01.nfs: Flags [P.], seq
>>>>     24184536:24184748, ack 11409577, win 602, options [nop,nop,TS val
>>>>     342245425 ecr 2683489461], length 212: NFS request xid 156024373 208
>>>>     getattr fh 0,1/53
>>>>     11:44:43.745683 IP gw01.nfs > xxx.747: Flags [P.], seq
>>>>     11409577:11409677, ack 24184748, win 3081, options [nop,nop,TS val
>>>>     2683489461 ecr 342245425], length 100: NFS reply xid 156024373 reply
>>>>     ok 96 getattr ERROR: Operation not permitted
>>>>
>>>> So there appears to be a permissions problem where nfs-ganesha is
>>>> not able to "getattr" on cephfs data.
>>>>
>>>> The export looks like this (read from rados):
>>>>
>>>>     EXPORT {
>>>>         FSAL {
>>>>             name = "CEPH";
>>>>             user_id = "nfs.cephfs.7";
>>>>             filesystem = "cephfs";
>>>>             secret_access_key = "xxx";
>>>>         }
>>>>         export_id = 7;
>>>>         path = "/exports/nfs/foobar";
>>>>         pseudo = "/foobar";
>>>>         access_type = "RW";
>>>>         squash = "no_root_squash";
>>>>         attr_expiration_time = 0;
>>>>         security_label = false;
>>>>         protocols = 4;
>>>>         transports = "TCP";
>>>>     }
>>>>
>>>> ceph auth permissions for the nfs.cephfs.7 client:
>>>>
>>>>     [client.nfs.cephfs.7]
>>>>         key = xxx
>>>>         caps mds = "allow rw path=/exports/nfs/foobar"
>>>>         caps mon = "allow r"
>>>>         caps osd = "allow rw pool=.nfs namespace=cephfs, allow rw tag cephfs data=cephfs"
>>>>
>>>> Any suggestions?
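For reference, a sketch of how one might inspect the export definition mentioned above ("read from rados"). The pool (.nfs) and namespace (cephfs) are taken from the caps shown in this thread; the exact RADOS object names are an assumption, so list the namespace first and use whatever names it actually reports:

    # list the exports known to the mgr nfs module for cluster "cephfs"
    ceph nfs export ls cephfs

    # list the raw RADOS objects ganesha reads its export blocks from
    rados -p .nfs --namespace cephfs ls

    # dump one export object and inspect it; "export-7" is assumed here to
    # correspond to export_id 7 above, adjust to match the listing
    rados -p .nfs --namespace cephfs get export-7 /tmp/export-7.conf
    cat /tmp/export-7.conf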