It looks like the gen field in svc_rpc_gss_data is used to check the freshness of a context. However, it is not initialized to axp->gen in authgss_ctx_hash_set. Won't this result in entries being evicted early, or am I missing something?
Thanks,
Satya.

On Tue, Mar 7, 2017 at 4:36 PM, Satya Prakash GS <g.satyaprak...@gmail.com> wrote:
>>On 3/7/17 4:56 AM, William Allen Simpson wrote:
>> On 3/6/17 6:58 PM, Matt Benjamin wrote:
>>> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would be
>>> correct. If the client has just refreshed its credentials, why is it
>>> continuing to send with the expired context?
>>>
>
> Thank you for the reply.
>
> The client may not be sending expired credentials but it is supposed
> to reestablish the credentials using
> RPC_GSS_PROC_DESTROY/RPC_GSS_PROC_INIT which I guess is not happening.
> I am continuing to debug this further.
>
> As per the RFC, Ganesha is supposed to be throwing AUTH_REJECTEDCRED
> instead of RPCSEC_GSS_CREDPROBLEM when it doesn't find credentials in
> the cache.
>
> However the nfs client handles AUTH_REJECTEDCRED,
> RPCSEC_GSS_CREDPROBLEM similarly. I am not hopeful of this change but
> I can give it a try.
>
>> I don't know, but I'll take a look. Now that we always have a server
>> for a client, perhaps the cache can be moved into a shared structure?
>
>>Sorry, thought that was our ntirpc client. Looking back, that's the
>>kernel client. Not much we can do about the kernel client other than
>>report a bug.
>
> William,
> I want to be sure it's a client bug and not Ganesha bug before putting
> it on the kernel mailing list. Given that the issue is reproducible
> twice/thrice a week I am wondering how it would have gone unreported
> so far.
>
> Regards,
> Satya.
>
>
>
> On Tue, Mar 7, 2017 at 5:28 AM, Matt Benjamin <mbenja...@redhat.com> wrote:
>> Hi Satya,
>>
>> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would be
>> correct. If the client has just refreshed its credentials, why is it
>> continuing to send with the expired context?
>>
>> Matt
>>
>> ----- Original Message -----
>>> From: "Satya Prakash GS" <g.satyaprak...@gmail.com>
>>> To: nfs-ganesha-devel@lists.sourceforge.net
>>> Sent: Monday, March 6, 2017 1:10:36 PM
>>> Subject: Re: [Nfs-ganesha-devel] Permission denied error with Kerberos
>>> enabled
>>>
>>> With libntirpc debugs enabled I could see all the three retries are
>>> failing because of the unavailability of the creds in the cache. The
>>> credentials are being removed by the reaper in the authgss_ctx_gc_idle
>>> because of this condition -
>>> abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen
>>>
>>> From the code, I can see that only a further RPCSEC_GSS_INIT call from
>>> the client can repopulate the credentials in the cache. I am not sure
>>> how server can dictate client to establish the context again.
>>>
>>> Any help is appreciated.
>>>
>>> Thanks,
>>> Satya.
>>>
>>> On Wed, Mar 1, 2017 at 7:35 PM, Satya Prakash GS
>>> <g.satyaprak...@gmail.com> wrote:
>>> > Hi,
>>> >
>>> > I am seeing "Permission denied" errors while running iozone on nfs
>>> > client with kerberos enabled. Digging further, I found there are a lot
>>> > of AUTH_REJECTEDCRED messages in nfs server log. NFS client tolerates
>>> > 2 errors from server and tries to refresh the credentials. On the
>>> > third call it would throw an error to the application.
>>> >
>>> > http://lxr.free-electrons.com/source/net/sunrpc/clnt.c#L2343
>>> >
>>> > 2395         switch ((n = ntohl(*p++))) {
>>> > 2396         case RPC_AUTH_REJECTEDCRED:
>>> > 2397         case RPC_AUTH_REJECTEDVERF:
>>> > 2398         case RPCSEC_GSS_CREDPROBLEM:
>>> > 2399         case RPCSEC_GSS_CTXPROBLEM:
>>> > 2400                 if (!task->tk_cred_retry)
>>> > 2401                         break;
>>> > 2402                 task->tk_cred_retry--;
>>> > 2403                 dprintk("RPC: %5u %s: retry stale creds\n",
>>> > 2404                                 task->tk_pid, __func__);
>>> > 2405                 rpcauth_invalcred(task);
>>> >
>>> >
>>> > On the client I have seen this message twice :
>>> >
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_status (status 20)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_decode (status 20)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: retry
>>> > stale creds
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 invalidating RPCSEC_GSS
>>> > cred ffff880544ce4600
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 release request
>>> > ffff8804062e7000
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserve (status 0)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 failed to lock transport
>>> > ffff8808723c5800
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 sleep_on(queue
>>> > "xprt_sending" time 25264836677)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 added to queue
>>> > ffff8808723c5990 "xprt_sending"
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 __rpc_wake_up_task (now
>>> > 25264836722)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 removed from queue
>>> > ffff8808723c5990 "xprt_sending"
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 __rpc_execute flags=0x801
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserveresult (status
>>> > -11)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_retry_reserve (status 0)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 reserved req
>>> > ffff8806c2e01a00 xid 929383d1
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserveresult (status 0)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_refresh (status 0)
>>> > Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 refreshing RPCSEC_GSS
>>> > cred ffff88086f634240
>>> >
>>> > On the third occurrence the filesystem OP failed :
>>> >
>>> > Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 __rpc_execute flags=0x801
>>> > Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 call_status (status 20)
>>> > Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 call_decode (status 20)
>>> > Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: call
>>> > rejected 2
>>> > Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: call
>>> > failed with error -13
>>> > Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 return 0, status -13
>>> >
>>> > Say, the ticket has expired (within the renewable lifetime) and the
>>> > server did not find the ticket in the cache for the first time but the
>>> > second/third call shouldn't ideally fail when the credentials were
>>> > just refreshed through an upcall. Unavailability of the creds in the
>>> > cache/a failing svcauth_gss_accept_sec_context call could throw the
>>> > REJECTEDCRED error. Could you share some pointers on which is more
>>> > likely or if there is something else that could cause this issue.
>>> >
>>> > Thanks,
>>> > Satya.
>>>
>>> ------------------------------------------------------------------------------
>>> Announcing the Oxford Dictionaries API! The API offers world-renowned
>>> dictionary content that is easy and intuitive to access. Sign up for an
>>> account today to start using our lexical data to power your apps and
>>> projects. Get started today and enter our developer competition.
>>> http://sdm.link/oxford
>>> _______________________________________________
>>> Nfs-ganesha-devel mailing list
>>> Nfs-ganesha-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>>>
>>
>> --
>> Matt Benjamin
>> Red Hat, Inc.
>> 315 West Huron Street, Suite 140A
>> Ann Arbor, Michigan 48103
>>
>> http://www.redhat.com/en/technologies/storage
>>
>> tel. 734-821-5101
>> fax. 734-769-8938
>> cel. 734-216-5309