[Nfs-ganesha-devel] reg. FSAL_ACE_PERM_WRITE_DATA check in fsal_check_setattr_perms
Hi,

Ganesha seems to be checking for the FSAL_ACE_PERM_WRITE_DATA permission when changing owner/group (in the function fsal_check_setattr_perms). In our filesystem there is another user who is equivalent to the root user; this user should be able to change the owner/group of any file, just like root. Can somebody please explain the rationale behind this check, and how our requirement of having another super-user can be achieved?

Thanks,
Satya.

--
Check out the vibrant tech community on one of the world's most engaging
tech sites, SlashDot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] reg. FSAL_ACE_PERM_WRITE_DATA check in fsal_check_setattr_perms
I was referring to this check:

    if (access_check != FSAL_ACE4_MASK_SET(FSAL_ACE_PERM_WRITE_DATA)) {
            status = CACHE_INODE_FSAL_EPERM;
            note = "(no ACL to check)";
            goto out;
    }

which is done if the user is not the owner of the file. As per the code, a user can do chown only if he is the owner or if there is an ACL on the file. Can Ganesha just pass the credentials (uid, gid) on to the server and let it decide whether chown is allowed on that file by a particular user (irrespective of any ACLs set on that file)? That way, certain users can be treated specially by the server and granted access.

> > Looking at the code, we don't check WRITE_DATA for owner checks, only for
> > size or time changes. For owner/group changes, we check
> > FSAL_ACE_PERM_WRITE_OWNER, which is the correct ACL to check.
> >
> > Presumably, you could just add an ACL to all files allowing all access
> > to your "root" user. This should allow access, correct?

> This would be a solution.

I am trying to see if we can avoid any on-disk changes. Since NFS is only one of the ways to access the filesystem, it would be better if we can avoid handling it differently.

> On 02/13/2017 09:31 AM, Satya Prakash GS wrote:
> > Hi,
> >
> > Ganesha seems to be checking for FSAL_ACE_PERM_WRITE_DATA permission
> > to change owner/group perms (in the function fsal_check_setattr_perms).
> > In our filesystem, there is another user who is equivalent to the root
> > user. This user should be able to change owner/group of any file like
> > the root user. Can somebody please explain the rationale behind this
> > check and how our requirement of having another super user can be
> > achieved.

> If you need a true additional super-user, Ganesha would really need to
> have code added to be able to configure such, and work to allow
> super-user privileges everywhere appropriate.
>
> What FSAL and what filesystem are you using?

We have our own filesystem and FSAL.
> Frank
Re: [Nfs-ganesha-devel] reg. FSAL_ACE_PERM_WRITE_DATA check in fsal_check_setattr_perms
> On 02/14/2017 06:48 AM, Satya Prakash GS wrote:
>> I was referring to this check:
>>
>>     if (access_check != FSAL_ACE4_MASK_SET(FSAL_ACE_PERM_WRITE_DATA)) {
>>             status = CACHE_INODE_FSAL_EPERM;
>>             note = "(no ACL to check)";
>>             goto out;
>>     }

> Sorry, I assumed an ACL existed on the file. What this check is saying
> is that, if there's no ACL, the finest granularity check we can do is
> unix permission bits, which is just Read Write Execute (and Write is the
> only relevant one here), so only continue if we're looking for Write
> access.

Can Ganesha avoid doing this check and always call test_access with the constructed access_mask? I don't see what would break because of this.

>> I am trying to see if we can avoid any on-disk changes. Since NFS is
>> one of the ways to access the filesystem it would be better if we can
>> avoid handling it differently.

> You don't have to do this in the filesystem; you can have the getattrs()
> in your FSAL just always add an ACL to the beginning that allows all
> access to your superuser.

This could mean interpreting an existing ACL and building a new ACL if one already exists on that file.
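Frank's getattrs() suggestion could be sketched roughly like this. The struct and constants below are made-up simplifications for illustration, not Ganesha's real fsal_acl_t/fsal_ace_t types; a real FSAL would translate to and from its native ACL representation:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Simplified stand-ins for an ACE and an ACL; the real Ganesha
 * structures and FSAL_ACE_* constants differ. */
#define ACE_TYPE_ALLOW 0
#define ACE_PERM_ALL   0xffffffffu

struct simple_ace {
    int      type;
    uint32_t perm;
    uid_t    who;
};

struct simple_acl {
    unsigned naces;
    struct simple_ace *aces;
};

/* Build a new ACL with an allow-everything ACE for super_uid prepended
 * to whatever ACL (possibly none) the file already has, as getattrs()
 * might do before handing attributes back to Ganesha. */
static struct simple_acl *prepend_super_ace(const struct simple_acl *orig,
                                            uid_t super_uid)
{
    unsigned orig_n = orig ? orig->naces : 0;
    struct simple_acl *out = malloc(sizeof(*out));

    out->naces = orig_n + 1;
    out->aces = calloc(out->naces, sizeof(*out->aces));
    out->aces[0] = (struct simple_ace){ ACE_TYPE_ALLOW, ACE_PERM_ALL,
                                        super_uid };
    if (orig_n)
        memcpy(&out->aces[1], orig->aces, orig_n * sizeof(*orig->aces));
    return out;
}
```

This keeps the on-disk ACL untouched; the extra ACE exists only in the attributes Ganesha evaluates.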
Re: [Nfs-ganesha-devel] reg. FSAL_ACE_PERM_WRITE_DATA check in fsal_check_setattr_perms
Frank,

I have subscribed to the list. Apologies for any inconvenience caused.

> This wouldn't actually help since the call to test_access just winds up in
> fsal_test_access which isn't going to know about your special super user.
> All Ganesha is doing here is not making a test_access call that turns into
> an fsal_test_access call that would always fail the permission check - or
> actually, I think it might always pass the permission check for files that
> don't have an NFS v4 ACL... We would have to change the test_access API to
> add permissions to check for in mode tests that are outside the mode
> permission checking...

> The alternative as a general mechanism is to increase the number of calls
> to the underlying filesystem Ganesha makes, which is likely to have a
> negative impact on other FSALs' performance.

Our filesystem doesn't support NFSv4 ACLs. Sorry to prolong this further, but just to be clear: we have implemented our own test_access call, and in our test_access implementation we have a way to figure out whether the user is a super-user. I agree that removing the check would result in a lot of fsal_test_access calls.

On Wed, Feb 15, 2017 at 3:57 AM, Frank Filz wrote:
> One thing,
>
> I suggest you subscribe to the nfs-ganesha-devel mailing list. I have made
> it so your responses should go through without being a member, but you
> risk missing a response if someone doesn't reply-all (or worse, we risk
> missing a response that is just sent to you).
Re: [Nfs-ganesha-devel] reg. FSAL_ACE_PERM_WRITE_DATA check in fsal_check_setattr_perms
Daniel/Frank,

We are using v2.3-stable at the moment. I am yet to go through the stackable FSALs to understand your previous comments. Also, we want to make sure that we aren't hit when we upgrade to v2.4 or later. I like the is_super_user logic that Frank has proposed; can we have that in Ganesha? I can own the task and publish the change.

Thanks,
Satya.

On Thu, Feb 16, 2017 at 12:50 AM, Frank Filz wrote:
>> Frank,
>>
>> I have subscribed to the list. Apologies for any inconvenience caused.
>>
>> Our filesystem doesn't support NFSv4 ACLs. Sorry to prolong this further
>> but just to be clear, we have implemented our own test_access call. In
>> our test_access implementation we have a way to figure out if the user
>> is super user or not. Agree that removing the check would result in a
>> lot of fsal_test_access calls.
>
> What version of Ganesha do you use? 2.4 and later will not ever call your
> FSAL's test_access because FSAL_MDCACHE always calls fsal_test_access and
> never calls the underlying FSAL's own test_access.
>
> Maybe what we need is a way for places that are checking for super user
> to call an FSAL is_super_user(creds) method, which of course would
> default to returning true only for uid == 0.
>
> Your implementation of course could do whatever you need to do.
>
> Then we just have to get out of the habit of checking for uid == 0 and
> instead invoke is_super_user(creds)...
>
> Frank
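Frank's proposed hook might look roughly like the following sketch. The struct user_cred field names and function names here are illustrative assumptions, not Ganesha's actual FSAL API:

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical credential struct; field names are illustrative,
 * not necessarily Ganesha's real struct user_cred. */
struct user_cred {
    uid_t caller_uid;
    gid_t caller_gid;
};

/* Default FSAL method: only uid 0 is the super-user. */
static bool is_super_user_default(const struct user_cred *creds)
{
    return creds->caller_uid == 0;
}

/* A filesystem-specific override could additionally honor one
 * configured "equivalent to root" account (extra_super_uid is a
 * made-up configuration knob for this sketch). */
static bool is_super_user_custom(const struct user_cred *creds,
                                 uid_t extra_super_uid)
{
    return creds->caller_uid == 0 ||
           creds->caller_uid == extra_super_uid;
}
```

Call sites that currently test `uid == 0` directly would instead invoke the method, so an FSAL with its own notion of a super-user gets consulted everywhere the check matters.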
[Nfs-ganesha-devel] Permission denied error with Kerberos enabled
Hi,

I am seeing "Permission denied" errors while running iozone on an NFS client with Kerberos enabled. Digging further, I found there are a lot of AUTH_REJECTEDCRED messages in the NFS server log. The NFS client tolerates 2 such errors from the server and tries to refresh the credentials; on the third call it throws an error to the application.

http://lxr.free-electrons.com/source/net/sunrpc/clnt.c#L2343

    2395  switch ((n = ntohl(*p++))) {
    2396  case RPC_AUTH_REJECTEDCRED:
    2397  case RPC_AUTH_REJECTEDVERF:
    2398  case RPCSEC_GSS_CREDPROBLEM:
    2399  case RPCSEC_GSS_CTXPROBLEM:
    2400          if (!task->tk_cred_retry)
    2401                  break;
    2402          task->tk_cred_retry--;
    2403          dprintk("RPC: %5u %s: retry stale creds\n",
    2404                  task->tk_pid, __func__);
    2405          rpcauth_invalcred(task);

On the client I have seen this message twice:

Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_status (status 20)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_decode (status 20)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: retry stale creds
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 invalidating RPCSEC_GSS cred 880544ce4600
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 release request 8804062e7000
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserve (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 failed to lock transport 8808723c5800
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 sleep_on(queue "xprt_sending" time 25264836677)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 added to queue 8808723c5990 "xprt_sending"
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 __rpc_wake_up_task (now 25264836722)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 removed from queue 8808723c5990 "xprt_sending"
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 __rpc_execute flags=0x801
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserveresult (status -11)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_retry_reserve (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 reserved req 8806c2e01a00 xid 929383d1
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserveresult (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_refresh (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 refreshing RPCSEC_GSS cred 88086f634240

On the third occurrence the filesystem op failed:

Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 __rpc_execute flags=0x801
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 call_status (status 20)
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 call_decode (status 20)
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: call rejected 2
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: call failed with error -13
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 return 0, status -13

Say the ticket has expired (within the renewable lifetime) and the server did not find it in the cache the first time; even so, the second and third calls shouldn't fail when the credentials were just refreshed through an upcall. Either unavailability of the creds in the cache or a failing svcauth_gss_accept_sec_context call could produce the REJECTEDCRED error. Could you share some pointers on which is more likely, or whether something else could cause this issue?

Thanks,
Satya.
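The retry behavior described above (two tolerated rejections, failure on the third) can be modeled with a toy function. This is a simplified model of the quoted clnt.c logic, not the kernel code itself:

```c
#include <assert.h>

/* Model of the client's credential-retry policy: tk_cred_retry starts
 * at 2, each REJECTEDCRED-class reply decrements it, and once it is
 * exhausted the error surfaces to the application as -13 (-EACCES,
 * i.e. "Permission denied"). */
#define RPC_EACCES (-13)

static int on_rejected_cred(int *tk_cred_retry)
{
    if (*tk_cred_retry == 0)
        return RPC_EACCES;  /* third rejection: op fails with -13 */
    (*tk_cred_retry)--;
    return 0;               /* invalidate cred, refresh, and retry */
}
```

This matches the logs: two "retry stale creds" refresh cycles, then "call failed with error -13" on the third rejection.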
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
With libntirpc debug enabled, I could see that all three retries are failing because of the unavailability of the creds in the cache. The credentials are being removed by the reaper in authgss_ctx_gc_idle because of this condition:

    abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen

From the code, I can see that only a further RPCSEC_GSS_INIT call from the client can repopulate the credentials in the cache. I am not sure how the server can tell the client to establish the context again.

Any help is appreciated.

Thanks,
Satya.

On Wed, Mar 1, 2017 at 7:35 PM, Satya Prakash GS wrote:
> Hi,
>
> I am seeing "Permission denied" errors while running iozone on nfs
> client with kerberos enabled. Digging further, I found there are a lot
> of AUTH_REJECTEDCRED messages in nfs server log. NFS client tolerates
> 2 errors from server and tries to refresh the credentials. On the
> third call it would throw an error to the application.
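The reaper's idle condition can be modeled in isolation. The type and constant below are simplified stand-ins for ntirpc's real structures, not the actual authgss_ctx_gc_idle code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Default for __svc_params->gss.max_idle_gen per the discussion. */
#define MAX_IDLE_GEN 1024

/* Stand-in for svc_rpc_gss_data; gen is bumped each time this
 * context is looked up. */
struct gd_model {
    uint32_t gen;
};

/* axp_gen models the tree-wide generation counter, bumped on every
 * lookup. A context whose gen lags the tree by more than
 * MAX_IDLE_GEN is considered idle and reaped. */
static int is_idle(uint32_t axp_gen, const struct gd_model *gd)
{
    return abs((int)(axp_gen - gd->gen)) > MAX_IDLE_GEN;
}
```

Note what happens to a context inserted with gen still at 0 into a busy tree: it looks maximally idle immediately, which is exactly the early-eviction symptom discussed later in this thread.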
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
> On 3/7/17 4:56 AM, William Allen Simpson wrote:
>> On 3/6/17 6:58 PM, Matt Benjamin wrote:
>>> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would
>>> be correct. If the client has just refreshed its credentials, why is it
>>> continuing to send with the expired context?

Thank you for the reply.

The client may not be sending expired credentials, but it is supposed to re-establish the credentials using RPC_GSS_PROC_DESTROY/RPC_GSS_PROC_INIT, which I guess is not happening. I am continuing to debug this further.

As per the RFC, Ganesha is supposed to throw AUTH_REJECTEDCRED instead of RPCSEC_GSS_CREDPROBLEM when it doesn't find credentials in the cache. However, the NFS client handles AUTH_REJECTEDCRED and RPCSEC_GSS_CREDPROBLEM similarly, so I am not hopeful about this change, but I can give it a try.

> I don't know, but I'll take a look. Now that we always have a server
> for a client, perhaps the cache can be moved into a shared structure?

> Sorry, thought that was our ntirpc client. Looking back, that's the
> kernel client. Not much we can do about the kernel client other than
> report a bug.

William,
I want to be sure it's a client bug and not a Ganesha bug before putting it on the kernel mailing list. Given that the issue is reproducible two or three times a week, I am wondering how it would have gone unreported so far.

Regards,
Satya.

On Tue, Mar 7, 2017 at 5:28 AM, Matt Benjamin wrote:
> Hi Satya,
>
> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would
> be correct. If the client has just refreshed its credentials, why is it
> continuing to send with the expired context?
>
> Matt
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
Looks like the gen field in svc_rpc_gss_data is used to check the freshness of a context. However, it is not initialized to axp->gen in authgss_ctx_hash_set. Will this not result in entries being evicted early, or am I missing something?

Thanks,
Satya.

On Tue, Mar 7, 2017 at 4:36 PM, Satya Prakash GS wrote:
> The client may not be sending expired credentials but it is supposed
> to reestablish the credentials using
> RPC_GSS_PROC_DESTROY/RPC_GSS_PROC_INIT which I guess is not happening.
> I am continuing to debug this further.
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
Is this a possibility: the server first rejects a client op with CREDPROBLEM/REJECTEDCRED; the client does an upcall and gssd initializes the context with the server; however, the server recycles the context immediately, before the operation is retried (it looks like there is a bug in the LRU implementation in Ganesha; to make things worse, I enabled the server debug logs, which slowed down the client operations and made eviction of the entry easier). This happens three times, failing the client op.

Thanks,
Satya.

On Thu, Mar 9, 2017 at 8:07 PM, Satya Prakash GS wrote:
> Looks like the gen field in svc_rpc_gss_data is used to check the
> freshness of a context. However it is not initialized to axp->gen in
> authgss_ctx_hash_set.
> Will this not result in evicting the entries out early or am I missing
> something ?
>
> Thanks,
> Satya.
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
On Sat, Mar 11, 2017 at 12:37 AM, William Allen Simpson wrote:
> I'm not familiar with this code, so not likely to be much help.
> Looks mostly written by Matt, but Malahal made the most recent
> changes in July 2016.
>
> On 3/10/17 9:35 AM, Satya Prakash GS wrote:
>> Server first rejects a client op with CREDPROBLEM/REJECTEDCRED,
>> Client does an upcall and gssd initializes the context with the server.
>> However the server recycles it immediately before the operation was
>> retried (looks like there is a bug in the LRU implementation on
>> Ganesha). This happens thrice, failing the client op.
>
> Problem is not obvious.
>
> axp->gen is initialized to zero with the rest of *axp -- mem_zalloc().
>
> gd->gen is initialized to zero by alloc_svc_rpc_gss_data().
>
> axp->gen is bumped by one (++) each time it is handled by LRU code in
> authgss_ctx_hash_get().

If a node's gen isn't getting incremented, it means that node is not being looked up often.

> atomic_inc_uint32_t(&gd->gen) is immediately after that.
>
> You think gd->gen also needs to be set to axp->gen in _set()?
>
> I'm not sure they are related. There are many gd per axp, so
> axp->gen could be much higher than gd->gen.

From authgss_ctx_gc_idle:

    if (abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen) {
            /* Remove the entry from the tree;
             * gd is no longer in the cache after this. */
    }

This translates to: "gd wasn't looked up in quite some time, so let's clean it up." (gss.max_idle_gen is set to 1024 by default.)

If the tree's gen is 5000 and a new node gets inserted into the tree, the node's gen shouldn't start at 0, or it might pass the above condition in the next authgss_ctx_gc_idle call.

> Both _get and _set are only called in svc_auth_gss.c _svcauth_gss().
>
> Admittedly, it is hard to track that there are 2 fields both called gen.

Thanks,
Satya.
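The fix implied by this discussion, stamping a newly inserted context with the tree's current generation so the very next gc pass does not treat it as idle, can be sketched as follows. Names and types are simplified stand-ins for ntirpc's authgss_ctx_hash_set()/authgss_ctx_gc_idle(), not the real code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_IDLE_GEN 1024  /* default gss.max_idle_gen per the thread */

/* Stand-in for svc_rpc_gss_data. */
struct gd_model {
    uint32_t gen;
};

/* Proposed behavior for the insert path: initialize the node's gen
 * from the tree's current generation instead of leaving it at 0. */
static void ctx_hash_set(uint32_t axp_gen, struct gd_model *gd)
{
    gd->gen = axp_gen;
}

/* The reaper's idle test, as quoted earlier in the thread. */
static int gc_would_evict(uint32_t axp_gen, const struct gd_model *gd)
{
    return abs((int)(axp_gen - gd->gen)) > MAX_IDLE_GEN;
}
```

With the stamp in place, a context inserted into a tree whose generation counter is already large survives until it has genuinely been idle for MAX_IDLE_GEN lookups.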
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
We are using 2.3-stable. Given that most of our testing has been done, it's a bit difficult for us to move to 2.5 now, but we can take fixes from 2.5. I put in a similar fix (to the existing one in 2.5) but I am running into another issue around the ticket renew time. Operations are failing on the client with auth check failed, error -13 (Permission denied). I see this after every ticket renewal:

    Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 gss_validate
    Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 gss_validate: gss_verify_mic returned error 0x000c
    Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 gss_validate failed ret -13.
    Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 rpc_verify_header: auth check failed with -13
    Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 rpc_verify_header: retrying

gss_verify_mic failed on the client with the error code set to GSS_S_CONTEXT_EXPIRED. Either the server or the client is using the wrong context while wrapping/unwrapping. Do you remember fixing a bug like that? Thanks, Satya. On Sat, Mar 11, 2017 at 5:55 PM, Malahal Naineni wrote: > gd->gen is not used in the latest code. If I remember, there was a bug > removing recent cached entries resulting in permission errors. What > version are you using? Try using V2.5. > > Regards, Malahal. > > On Sat, Mar 11, 2017 at 12:54 AM, Satya Prakash GS > wrote: >> On Sat, Mar 11, 2017 at 12:37 AM, William Allen Simpson >> wrote: >>> I'm not familiar with this code, so not likely to be much help. >>> Looks mostly written by Matt, but Malahal made the most recent >>> changes in July 2016. >>> >>> On 3/10/17 9:35 AM, Satya Prakash GS wrote: >>>> >>>> Is this a possibility : >>>> >>>> Server first rejects a client op with CREDPROBLEM/REJECTEDCRED, >>>> Client does an upcall and gssd initializes the context with the server.
>>>> However the server recycles it immediately before the operation was >>>> retried (looks like there is a bug in the LRU implementation on >>>> Ganesha. To make things worse, I enabled the server debugs and it >>>> slowed down the client operations making the eviction of the entry >>>> easier). This happens thrice failing the client op. >>>> >>> Problem is not obvious. >>> >>> axp->gen is initialized to zero with the rest of *axp -- mem_zalloc(). >>> >>> gd->gen is initialized to zero by alloc_svc_rpc_gss_data(). >>> >>> axp->gen is bumped by one (++) each time it is handled by LRU code in >>> authgss_ctx_hash_get(). >>> >> >> If a node gen isn't getting incremented it means that node is not >> being looked up often. >> >>> atomic_inc_uint32_t(&gd->gen) is immediately after that. >>> >>> You think gd->gen also needs to be set to axp->gen in _set()? >>> >> >>> I'm not sure they are related. There are many gd per axp, so >>> axp->gen could be much higher than gd->gen. >>> >> >> >From authgss_ctx_gc_idle -> >> >> if (abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen) { >> Remove the entry from the tree; //gd is no more in the cache after this >> } >> >> Translates to - gd wasn't looked up in quite sometime let's clean it up. >> >> //gss.max_idle_gen -> by default set to 1024 >> >> If tree's gen is 5000 and a new node gets inserted into the tree, node >> gen shouldn't start at 0 or it might pass the above condition in the >> next authgss_ctx_gc_idle call. >> >>> Both _get and _set are only called in svc_auth_gss.c _svcauth_gss(). >>> >>> Admittedly, it is hard to track that there are 2 fields both called gen. >>> >>>> Thanks, >>>> Satya. >>>> >>>> On Thu, Mar 9, 2017 at 8:07 PM, Satya Prakash GS >>>> wrote: >>>>> >>>>> Looks like the gen field in svc_rpc_gss_data is used to check the >>>>> freshness of a context. However it is not initialized to axp->gen in >>>>> authgss_ctx_hash_set. 
>>>>> Will this not result in evicting the entries out early or am I missing >>>>> something ? >>>>> >>>>> Thanks, >>>>> Satya. >>>>> >>>> >>>> >>>> -- >>>> Announcing the Oxford Dictionaries API! The API offers worl
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
My bad, I should have mentioned the version in the original post. Malahal was kind enough to share a list of relevant commits. With the patches applied I continued to see the issue. I suspect the client code is not handling GSS_S_CONTEXT_EXPIRED correctly on a call to gss_verify_mic. Instead, I fixed the server code to time the ticket out 5 minutes before the actual timeout (Ganesha already times the ticket out 5 seconds early). So far the issue hasn't reproduced, but I will continue running the test for a day or two before confirming that the fix works. Do you see any issue with this fix? Thanks, Satya. On Sun, Mar 12, 2017 at 8:26 PM, Malahal Naineni wrote: >>> Indeed, 2.4 was mostly a bug fix release > > Actually, 2.4 has a couple of big features as far as the ganesha project is > concerned, but Bill is probably indicating that the libntirpc > corresponding to ganesha 2.4 is mostly a bug fix release. > > Regards, Malahal. > > On Sun, Mar 12, 2017 at 8:15 PM, William Allen Simpson > wrote: >> On 3/11/17 8:15 AM, Satya Prakash GS wrote: >>> >>> We are using 2.3-stable. Given that most of our testing has been done >>> it's a bit difficult for us to move to 2.5 now but we can take fixes >>> from 2.5. >>> >> Sorry, I should have asked long ago what version you were using. >> >> On this list, I always assume that you are using the most recent -dev >> release. There are an awful lot of bug fixes since 2.3. Indeed, 2.4 >> was mostly a bug fix release, and 2.5 is supposed to be a performance >> release (but has a fair number of bug fixes, too).
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
Has anyone seen client ops failing with error -13 because of context expiry on the client (gss_verify_mic fails)? Surprisingly, even with little load it's consistently reproducible on my setup. Can someone point me to the relevant commits if this has already been fixed? Thanks, Satya. On Mon, Mar 13, 2017 at 4:01 PM, Satya Prakash GS wrote: > My bad, I should have mentioned the version in the original post. > > Malahal was kind enough to share a list of relevant commits. With the > patches applied I continued to see the issue. I suspect the client code is not > handling GSS_S_CONTEXT_EXPIRED correctly on a call to gss_verify_mic. > Instead, I fixed the server code to time the ticket out 5 minutes before > the actual timeout (Ganesha already times the ticket out 5 seconds > early). > So far the issue hasn't reproduced, but I will continue running > the test for a day or two before confirming that the fix works. Do you > see any issue with this fix? > > Thanks, > Satya. >
Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled
Here are the reproduction steps: I have 3 different servers hosting the NFS client, the NFS server, and the KDC. I set the ticket lifetime to 10 minutes on the client and server (in krb5.conf). When adding a principal I specified -maxlife "10 minutes" and -maxrenew 2017-04-30. I also set max_life (to 10 minutes) in the kdc.conf file. I am using machine credentials on the client (running operations as the root user). Run iozone or bonnie from 2 different clients and you should see the issue within an hour. The issue seems to be with the clock skew, which is set to 5 minutes by default. The server is seeing a context lifetime of 15 minutes when it should have been 10 minutes (taking the clock skew into account). The client rejects the server's messages if the context has been in use for more than 10 minutes (on the server). This happens three times and the user operation fails. Please let me know if you need any other details. Thanks, Satya. On Sun, Mar 19, 2017 at 5:08 PM, Malahal Naineni wrote: > If I understand, you have a renewable ticket and commands fail when the > ticket expires? I will let our folks test it. Any more details on > reproducing this issue? > > On Fri, Mar 17, 2017 at 9:59 AM, Satya Prakash GS > wrote: >> Has anyone seen client ops failing with error -13 because of context >> expiry on the client (gss_verify_mic fails)? >> Surprisingly, even with little load it's consistently reproducible on my setup. >> Can someone point me to the relevant commits if this has already been fixed. >> >> Thanks, >> Satya. >> >> On Mon, Mar 13, 2017 at 4:01 PM, Satya Prakash GS >> wrote: >>> My bad, I should have mentioned the version in the original post. >>> >>> Malahal was kind enough to share a list of relevant commits. With the >>> patches I continued to see the issue. I suspect the client code is not >>> handling GSS_S_CONTEXT_EXPIRED correctly on a call to gss_verify_mic.
>>> Instead I fixed the server code to timeout the ticket 5 mins before >>> the actual timeout (Ganesha is already timing the ticket 5 seconds >>> earlier). >>> So far, the issue hasn't got reproduced but I will continue running >>> the test for a day or two before confirming if the fix works. Do you >>> see any issue with this fix ? >>> >>> Thanks, >>> Satya. >>> -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
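For anyone trying to reproduce this, the lifetime settings described above correspond roughly to the following fragments. These are illustrative values only; the realm name and exact stanza placement are assumptions, so check your Kerberos distribution's documentation:

```ini
; krb5.conf on client and server (illustrative)
[libdefaults]
    ticket_lifetime = 10m
    clockskew = 300        ; 5 minutes, in seconds -- the default skew implicated above

; kdc.conf on the KDC (illustrative)
[realms]
    EXAMPLE.COM = {
        max_life = 10m
    }
```

and the principal would be created with something like `kadmin: addprinc -maxlife "10 minutes" -maxrenew "2017-04-30" nfs/client.example.com` (principal name hypothetical).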
[Nfs-ganesha-devel] drc and non-cacheable ops
Hi, I have been looking at the DRC code, and I see operations like READ, READDIR, etc. are not cached in the DRC. Can a compound operation have a mix of both cacheable and non-cacheable operations? For example, can a client send both SETATTR and READ as part of one compound operation (if concurrent operations are going on)? If there is a mix of operations, it looks like the DRC doesn't cache the request. Is this ok? Thanks, Satya.
Re: [Nfs-ganesha-devel] drc and non-cacheable ops
Can somebody please reply to this. Thanks, Satya. On Wed, Apr 26, 2017 at 3:02 PM, Satya Prakash GS wrote: > Hi, > > I have been looking at the DRC code, and I see operations like READ, > READDIR, etc. are not cached in the DRC. Can a compound operation have a mix > of both cacheable and non-cacheable operations? For example, can a > client send both SETATTR and READ as part of one compound operation > (if concurrent operations are going on)? If there is a mix of > operations, it looks like the DRC doesn't cache the request. Is this ok? > > Thanks, > Satya.
[Nfs-ganesha-devel] drc refcnt
Hi, The DRC refcnt is incremented on every get_drc. However, not every nfs_dupreq_finish calls put_drc. How is it ensured that the DRC refcnt drops to zero? On doing an umount, is the DRC eventually cleaned up? Thanks, Satya.
Re: [Nfs-ganesha-devel] drc refcnt
Daniel, I meant to say - nfs_dupreq_finish doesn't call put_drc always. It does only if it meets certain criteria (drc_should_retire). Say the maxsize is 1000, hiwat is 800 and retire window size = 0. At the time of unmount if the drc size is just 100 wouldn't the refcount stay > 0. Thanks, Satya. >nfs_dupreq_finish() calls dupreq_entry_put() at about line 1238, and >nfs_dupreq_put_drc() at about line 1222, so I think this is okay. >Daniel >On 05/01/2017 11:08 AM, Satya Prakash GS wrote: >> Hi, >> >> DRC refcnt is incremented on every get_drc. However, every >> nfs_dupreq_finish doesn't call a put_drc. How is it ensured that the >> drc refcnt drops to zero. On doing an umount, is drc eventually >> cleaned up. >> >> Thanks, >> Satya. >> >> -- >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> ___ >> Nfs-ganesha-devel mailing list >> Nfs-ganesha-devel@... >> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel >> On Mon, May 1, 2017 at 9:09 PM, Matt Benjamin wrote: > Hi Satya, > > I don't -think- that's the case (that DRCs are leaked). If so, we would > certainly wish to correct it. Malahal has most recently updated these code > paths. > > Regards, > > Matt > > - Original Message - >> From: "Satya Prakash GS" >> To: nfs-ganesha-devel@lists.sourceforge.net >> Sent: Monday, May 1, 2017 11:08:48 AM >> Subject: [Nfs-ganesha-devel] drc refcnt >> >> Hi, >> >> DRC refcnt is incremented on every get_drc. However, every >> nfs_dupreq_finish doesn't call a put_drc. How is it ensured that the >> drc refcnt drops to zero. On doing an umount, is drc eventually >> cleaned up. >> >> Thanks, >> Satya. >> >> -- >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot >> ___ >> Nfs-ganesha-devel mailing list >> Nfs-ganesha-devel@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel >> > > -- > Matt Benjamin > Red Hat, Inc. > 315 West Huron Street, Suite 140A > Ann Arbor, Michigan 48103 > > http://www.redhat.com/en/technologies/storage > > tel. 734-821-5101 > fax. 734-769-8938 > cel. 734-216-5309 -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] drc refcnt
> On Tue, May 2, 2017 at 7:58 AM, Malahal Naineni wrote: > A dupreq will place a refcount on its DRC when it calls xxx_get_drc, so we > will release that DRC refcount when we free the dupreq. Ok, so every dupreq holds a ref on the DRC. In the case of a DRC cache hit, a dupreq entry can ref the DRC more than once. This is still fine because the DRC isn't freed unless the dupreq entry's ref goes to zero. > nfs_dupreq_finish() shouldn't free its own dupreq. When it does free some > other dupreq, we will release the DRC refcount corresponding to that dupreq. > When we free all dupreqs that belong to a DRC In the case of a disconnected client, when are all the dupreqs freed? When all the filesystem operations from a client subside (the mount point is no longer in use), nfs_dupreq_finish doesn't get called anymore. This is the only place where dupreq entries are removed from the DRC. If the entries aren't removed from the DRC, the DRC refcnt doesn't go to 0. >, its refcount should go to > zero (maybe another ref is held by the socket itself, so the socket has to > be closed as well). > > In fact, if we release the DRC refcount without freeing the dupreq, that would > be a bug! > > Regards, Malahal. Thanks, Satya.
Re: [Nfs-ganesha-devel] drc and non-cacheable ops
Ok. DRC doesn't work on requests with cacheable and non-cacheable ops in it. Thank you Matt. Regards, Satya. On Mon, May 1, 2017 at 9:07 PM, Matt Benjamin wrote: > Hi Satya, > > That is expected, yes. I'm not aware of all possible implications. The > issue of compound ops, specifically, is evidently only present in NFSv4.0 (or > 4.1, the DRC is not used). > > Matt > > ----- Original Message - >> From: "Satya Prakash GS" >> To: nfs-ganesha-devel@lists.sourceforge.net >> Sent: Monday, May 1, 2017 10:58:11 AM >> Subject: Re: [Nfs-ganesha-devel] drc and non-cacheable ops >> >> Can somebody please reply to this. >> >> Thanks, >> Satya. >> >> On Wed, Apr 26, 2017 at 3:02 PM, Satya Prakash GS >> wrote: >> > Hi, >> > >> > I have been looking at the drc code, I see operations like READ, >> > READDIR, etc are not cached in drc. Can a compound operations have mix >> > of both cacheable and non-cacheable operations. For example, can >> > client send both SETATTR and READ as part of one compound operation >> > (if concurrent operations are going on). If there is a mix of >> > operations looks like DRC doesn't cache the operation. Is this ok ? >> > >> > Thanks, >> > Satya. >> >> -- >> Check out the vibrant tech community on one of the world's most >> engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> ___ >> Nfs-ganesha-devel mailing list >> Nfs-ganesha-devel@lists.sourceforge.net >> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel >> > > -- > Matt Benjamin > Red Hat, Inc. > 315 West Huron Street, Suite 140A > Ann Arbor, Michigan 48103 > > http://www.redhat.com/en/technologies/storage > > tel. 734-821-5101 > fax. 734-769-8938 > cel. 734-216-5309 -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] drc refcnt
>On Tue, May 2, 2017 at 11:51 AM, Malahal Naineni wrote: > Sorry, every cacheable request holds a ref on its DRC as well as its DUPREQ. > The ref on the DUPREQ should be released when the request goes away (via > nfs_dupreq_rele). The ref on the DRC will be released when the corresponding > DUPREQ gets released. Since we release DUPREQs while processing > other requests, you are right that the DRC won't be freed if there are no > more requests that would use the same DRC. Ok. > I think we should be freeing dupreqs periodically using a timed function, > something like drc_free_expired. The refcnt of the dupreq entries within the stale drc (of the disconnected client) stays positive since it starts at 2 (it is decremented once in nfs_dupreq_rele). First, the refcnt of the dupreq entries should be decremented, and each entry freed when its refcnt reaches 0. On freeing a dupreq entry, the refcnt of the drc should be decremented. All of the above should happen in free_expired or a similar function. > Regards, Malahal. Thanks, Satya.
[Nfs-ganesha-devel] reg. drc nested locks
Hi, In nfs_dupreq_start and nfs_dupreq_finish, when allocating/freeing a dupreq_entry, we are trying hard to keep both dupreq_q and the rbtree in sync by acquiring both the partition lock and the drc lock (t->mtx, drc->mtx). This requires dropping and reacquiring locks at certain places. Can these nested locks be changed to take the locks one after the other? For example, at allocation time we could choose to do this:

    PTHREAD_MUTEX_lock(&t->mtx);    /* partition lock */
    nv = rbtree_x_cached_lookup(&drc->xt, t, &dk->rbt_k, dk->hk);
    if (!nv) {
        dk->refcnt = 2;
        (void)rbtree_x_cached_insert(&drc->xt, t, &dk->rbt_k, dk->hk);
        PTHREAD_MUTEX_unlock(&t->mtx);  /* partition lock */

        PTHREAD_MUTEX_lock(&drc->mtx);
        TAILQ_INSERT_TAIL(&drc->dupreq_q, dk, fifo_q);
        ++(drc->size);
        PTHREAD_MUTEX_unlock(&drc->mtx);
    }

I am assuming this would simplify the locking code a lot. If there is a case where this would introduce a race, please let me know. Thanks, Satya.
Re: [Nfs-ganesha-devel] reg. drc nested locks
Thank you for the quick reply. In dupreq_finish, as part of retiring the drc quite a few locks are acquired and dropped (per entry). I want to fix a bug where drc retire will happen as part of a different function (this will be called from free_expired). The existing logic gets carried over to the new function and I was thinking that we may not have to acquire and release lock so many times. Thanks, Satya. On Thu, May 4, 2017 at 1:21 AM, Matt Benjamin wrote: > Hi Satya, > > Sorry, my recommendation would be, we do not change locking to be more coarse > grained, and in general, should update it in response to an indication that > it is incorrect, not to improve readability in the first instance. > > Regards, > > Matt > > - Original Message - >> From: "Matt Benjamin" >> To: "Satya Prakash GS" >> Cc: nfs-ganesha-devel@lists.sourceforge.net, "Malahal Naineni" >> >> Sent: Wednesday, May 3, 2017 3:43:06 PM >> Subject: Re: [Nfs-ganesha-devel] reg. drc nested locks >> >> No? >> >> Matt >> >> - Original Message - >> > From: "Satya Prakash GS" >> > To: nfs-ganesha-devel@lists.sourceforge.net, "Malahal Naineni" >> > >> > Sent: Wednesday, May 3, 2017 3:34:31 PM >> > Subject: [Nfs-ganesha-devel] reg. drc nested locks >> > >> > Hi, >> > >> > In nfs_dupreq_start and nfs_dupreq_finish when allocating/freeing a >> > dupreq_entry we are trying hard to keep both dupreq_q and the rbtree >> > in sync acquiring both the partition lock and the drc (t->mtx, >> > drc->mtx). This requires dropping and reacquiring locks at certain >> > places. Can these nested locks be changed to take locks one after the >> > other. 
>> > >> > For example at the time of allocation, we could choose to do this - >> > >> > PTHREAD_MUTEX_lock(&t->mtx); /* partition lock */ >> > nv = rbtree_x_cached_lookup(&drc->xt, t, &dk->rbt_k, dk->hk); >> > if (!nv) { >> > dk->refcnt = 2; >> > (void)rbtree_x_cached_insert(&drc->xt, t, >> > &dk->rbt_k, dk->hk); >> > PTHREAD_MUTEX_unlock(&t->mtx); /* partition lock */ >> > >> > PTHREAD_MUTEX_lock(&drc->mtx); >> > TAILQ_INSERT_TAIL(&drc->dupreq_q, dk, fifo_q); >> > ++(drc->size); >> > PTHREAD_MUTEX_unlock(&drc->mtx); >> > } >> > >> > I am assuming this would simplify the lock code a lot. >> > If there is a case where this would introduce a race please let me know. >> > >> > Thanks, >> > Satya. >> > >> > -- >> > Check out the vibrant tech community on one of the world's most >> > engaging tech sites, Slashdot.org! http://sdm.link/slashdot >> > ___ >> > Nfs-ganesha-devel mailing list >> > Nfs-ganesha-devel@lists.sourceforge.net >> > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel >> > >> >> -- >> Matt Benjamin >> Red Hat, Inc. >> 315 West Huron Street, Suite 140A >> Ann Arbor, Michigan 48103 >> >> http://www.redhat.com/en/technologies/storage >> >> tel. 734-821-5101 >> fax. 734-769-8938 >> cel. 734-216-5309 >> > > -- > Matt Benjamin > Red Hat, Inc. > 315 West Huron Street, Suite 140A > Ann Arbor, Michigan 48103 > > http://www.redhat.com/en/technologies/storage > > tel. 734-821-5101 > fax. 734-769-8938 > cel. 734-216-5309 -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot ___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] drc refcnt
Agreed, the existing retire logic in dupreq_finish will not be changed. In addition to this, stale drc objects will be handled in the timeout function. Stale drc objects are drcs which haven't been referenced in a while (maintain a last_used timestamp in the drc and update it on every ref). Such a stale drc can have up to 1000 dupreq objects.

    static void handle_stale_drcs(drc_t *drc)
    {
        lock_drc;
        while (entry_exists_in_list(dupreq_q)) {
            dv = TAILQ_REMOVE(dupreq_q);
            dv->next = prev;
            prev = dv;
        }
        unlock_drc;

        /* Now the dupreq entries are in the list pointed to by dv.
         * At this point the only other references to dv are threads which
         * are actively using the dv.  These threads will just do entry_put
         * and not add the dv back to the dupreq_q list.  So we have arrested
         * all the double frees. */
        while (entry_exists_in_list(dv)) {
            hk = dv->hk;
            lock_partition(hk);
            remove_from_rb_tree(dv);
            entry_put(dv);
            unlock_partition(hk);
        }
        put_drc(drc);
    }

This is what I had in mind. I could possibly have missed some race. Thanks, Satya. On Thu, May 4, 2017 at 8:07 AM, Malahal Naineni wrote: > Matt, you are correct. We lose some memory (drc and dupreqs) for a client > that never reconnects. Doing a solely time-based strategy is not scalable > either, unless we fork multiple threads for doing this. My understanding is > that there will be one time-based strategy (hopefully, the time is long > enough that it does not interfere with the current strategy) in __addition__ to > the current retiring strategy. > > Regards, Malahal. > > On Thu, May 4, 2017 at 3:56 AM, Matt Benjamin wrote: >> >> Hi Guys, >> >> To get on the record here, the current retire strategy of using new requests >> to retire old ones is an intrinsic good, particularly with TCP and related >> cots-ord transports where requests are totally ordered. I don't think >> moving to a strictly time-based strategy is preferable. Apparently the >> actually observed or theorized issue has to do with not disposing of >> requests in invalidated DRCs?
That seems to be a special case, no? >> >> Matt >> >> - Original Message - >> > From: "Malahal Naineni" >> > To: "Satya Prakash GS" >> > Cc: "Matt Benjamin" , >> > nfs-ganesha-devel@lists.sourceforge.net >> > Sent: Tuesday, May 2, 2017 2:21:48 AM >> > Subject: Re: [Nfs-ganesha-devel] drc refcnt >> > >> > Sorry, every cacheable request holds a ref on its DRC as well as its >> > DUPREQ. The ref on DUPREQ should be released when the request goes away >> > (via nfs_dupreq_rele). The ref on DRC will be released when the >> > corresponding DUPREQ request gets released. Since we release DUPREQs >> > while >> > processing other requests, you are right that the DRC won't be freed if >> > there are no more requests that would use the same DRC. >> > >> > I think we should be freeing dupreq periodically using a timed function, >> > something like that drc_free_expired. >> > >> > Regards, Malahal. >> > >> > >> > >> > On Tue, May 2, 2017 at 10:38 AM, Satya Prakash GS >> > >> > wrote: >> > >> > > > On Tue, May 2, 2017 at 7:58 AM, Malahal Naineni >> > > wrote: >> > > > A dupreq will place a refcount on its DRC when it calls xxx_get_drc, >> > > > so >> > > we >> > > > will release that DRC refcount when we free the dupreq. >> > > >> > > Ok, so every dupreq holds a ref on the drc. In case of drc cache hit, >> > > a dupreq entry can ref the >> > > drc more than once. This is still fine because unless the dupreq entry >> > > ref goes to zero the drc isn't freed. >> > > >> > > > nfs_dupreq_finish() shouldn't free its own dupreq. When it does free >> > > > some >> > > > other dupreq, we will release DRC refcount corresponding to that >> > > > dupreq. >> > > >> > > > When we free all dupreqs that belong to a DRC >> > > >> > > In the case of a disconnected client when are all the dupreqs freed ? >> > > >> > > When all the filesystem operations subside from a client (mount point >> > > is no longer in use), >> > > nfs_dupreq_finish doesn't get called anymore. 
This is the only place >> > > where dupreq entries are removed from >> > > the drc. If the entries aren't removed from drc, drc refcnt doesn't go >> > > to >> > > 0. >> > > >> > > >
[Nfs-ganesha-devel] review request https://review.gerrithub.io/#/c/390652/
Can somebody please review this change: https://review.gerrithub.io/#/c/390652/ It addresses this issue: Leak in the DRC when a client disconnects. nfs_dupreq_finish doesn't always call put_drc; it does so only if certain criteria are met (drc_should_retire). This can leak the drc and the dupreq entries within it when the client disconnects. More information can be found here: https://sourceforge.net/p/nfs-ganesha/mailman/message/35815930/ Main idea behind the change: Introduced a new drc queue which holds all the active drc objects (tcp_drc_q in drc_st). Every new drc is added to tcp_drc_q initially. Eventually it is moved to tcp_drc_recycle_q. Drcs are freed from tcp_drc_recycle_q. Every drc is either in the active drc queue or in the recycle queue. DRC refcount and transition from the active queue to the recycle queue: The drc refcnt is initialized to 2. In dupreq_start, increment the drc refcount. In dupreq_rele, decrement the drc refcnt. The drc refcnt is also decremented in nfs_rpc_free_user_data. When the drc refcnt goes to 0 and the drc is found not in use for 10 minutes, pick it up and free the entries in iterations of 32 items at a time. Once the dupreq entry count goes to 0, remove the drc from tcp_drc_q and add it to tcp_drc_recycle_q. Today, entries added to tcp_drc_recycle_q are cleaned up periodically. The same logic should clean up these entries too. Thanks, Satya.
Re: [Nfs-ganesha-devel] review request https://review.gerrithub.io/#/c/390652/
I had replied to the comments on the same day Matt posted. My replies show as drafts, looks like I have to publish them. I don't see a publish button either. Can you guys help me out. Thanks, Satya. On 9 Mar 2018 20:48, "Frank Filz" wrote: > Matt had called for additional discussion on this, so let's get that > discussion going. > > Could you address Matt's questions? > > Frank > > > -Original Message- > > From: Satya Prakash GS [mailto:g.satyaprak...@gmail.com] > > Sent: Friday, March 9, 2018 4:17 AM > > To: nfs-ganesha-devel@lists.sourceforge.net > > Cc: Malahal Naineni ; Frank Filz > > > > Subject: review request https://review.gerrithub.io/#/c/390652/ > > > > Can somebody please review this change : > > https://review.gerrithub.io/#/c/390652/ > > > > It addresses this issue : > > > > Leak in DRC when client disconnects nfs_dupreq_finish doesn't call > put_drc > > always. It does only if it meets certain criteria (drc_should_retire). > This can leak > > the drc and the dupreq entries within it when the client disconnects. > More > > information can be found here : https://sourceforge.net/p/nfs- > > ganesha/mailman/message/35815930/ > > > > > > > > Main idea behind the change. > > > > Introduced a new drc queue which holds all the active drc objects > (tcp_drc_q in > > drc_st). > > Every new drc is added to tcp_drc_q initially. Eventually it is moved to > > tcp_drc_recycle_q. Drcs are freed from tcp_drc_recycle_q. Every drc is > either in > > the active drc queue or in the recycle queue. > > > > DRC Refcount and transition from active drc to recycle queue : > > > > Drc refcnt is initialized to 2. In dupreq_start, increment the drc > refcount. In > > dupreq_rele, decrement the drc refcnt. Drc refcnt is also decremented in > > nfs_rpc_free_user_data. When drc refcnt goes to 0 and drc is found not > in use > > for 10 minutes, pick it up and free the entries in iterations of 32 > items at at time. 
> > Once the dupreq entries goes to 0, remove the drc from tcp_drc_q and add > it to > > tcp_drc_recycle_q. Today, entries added to tcp_drc_recycle_q are cleaned > up > > periodically. Same logic should clean up these entries too. > > > > Thanks, > > Satya. > > -- Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot___ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
Re: [Nfs-ganesha-devel] review request https://review.gerrithub.io/#/c/390652/
Aah. Now I could publish the comments. Thank you Matt.

Regards,
Satya.

On Fri, Mar 9, 2018 at 9:53 PM, Matt Benjamin wrote:
> Hi Satya,
>
> To publish, reply at the top level (the reply can even be blank); all
> your inline comments will publish then.
>
> Matt
>
> On Fri, Mar 9, 2018 at 11:21 AM, Satya Prakash GS wrote:
>> I had replied to the comments on the same day Matt posted. My replies
>> show as drafts; it looks like I have to publish them, but I don't see
>> a publish button either. Can you guys help me out?
>>
>> Thanks,
>> Satya.
>>
>> [earlier quoted messages trimmed]

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309
Re: [Nfs-ganesha-devel] nfs4 idmapping issue with sssd fully qualified domain names
This list has been deprecated. Please subscribe to the new devel list at lists.nfs-ganesha.org.

We hit the exact bug that was mentioned here:
https://bugzilla.redhat.com/show_bug.cgi?id=1378557

The issue was happening only with multiple AD domains configured and trust established between them. libnfsidmap was stripping the domain name when a username was passed in fully-qualified domain name format. The No-Strip option in idmapd.conf has to be set to "both" to stop libnfsidmap from stripping the domain name. Even with this option set, I could not get it working with libnfsidmap-0.25/0.26; I could only get it working with libnfsidmap-0.27. I compiled it and replaced both libraries, libnfsidmap.so and nsswitch.so, after which the username was properly passed to the layer below (sssd/winbind). With this fix, the id is properly resolved with both sssd and winbind.

Thanks,
Satya.
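[Editorial note: for reference, the idmapd.conf fragment the message above describes would look roughly like the following. The domain name is a placeholder; only No-Strip = both is taken from the message itself, and the option requires libnfsidmap 0.27 as noted.]

```
[General]
# Placeholder domain; use your own NFSv4 ID-mapping domain.
Domain = example.com
# Pass fully-qualified names through unmodified for both users and
# groups instead of stripping the @domain suffix (libnfsidmap >= 0.27).
No-Strip = both
```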