[Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-01 Thread Satya Prakash GS
Hi,

I am seeing "Permission denied" errors while running iozone on an NFS
client with Kerberos enabled. Digging further, I found a lot of
AUTH_REJECTEDCRED messages in the NFS server log. The NFS client
tolerates 2 such errors from the server, refreshing its credentials
after each one. On the third it returns an error to the application.

http://lxr.free-electrons.com/source/net/sunrpc/clnt.c#L2343

switch ((n = ntohl(*p++))) {
case RPC_AUTH_REJECTEDCRED:
case RPC_AUTH_REJECTEDVERF:
case RPCSEC_GSS_CREDPROBLEM:
case RPCSEC_GSS_CTXPROBLEM:
        if (!task->tk_cred_retry)
                break;
        task->tk_cred_retry--;
        dprintk("RPC: %5u %s: retry stale creds\n",
                        task->tk_pid, __func__);
        rpcauth_invalcred(task);
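
The retry behaviour described above can be modelled in a few lines of C.
This is only a sketch, not the kernel's implementation: toy_task,
toy_handle_auth_reject and toy_run are hypothetical names, and the
starting budget of two retries is taken from the description in this
message.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the kernel's struct rpc_task; only the
 * credential retry counter matters for this illustration. */
struct toy_task {
    int tk_cred_retry;   /* remaining credential refresh attempts */
};

/* Model one rejected reply. Returns 0 if the call should be retried with
 * refreshed credentials, or -EACCES once the budget is used up. */
static int toy_handle_auth_reject(struct toy_task *task)
{
    if (task->tk_cred_retry == 0)
        return -EACCES;          /* surfaces as "Permission denied" */
    task->tk_cred_retry--;
    return 0;                    /* invalidate creds and retry */
}

/* Feed `rejections` consecutive rejections to a task that starts with
 * `budget` retries; return the first non-zero status, or 0 if the
 * budget was never exhausted. */
static int toy_run(int budget, int rejections)
{
    struct toy_task task;

    assert(budget >= 0);
    task.tk_cred_retry = budget;
    for (int i = 0; i < rejections; i++) {
        int status = toy_handle_auth_reject(&task);
        if (status != 0)
            return status;
    }
    return 0;
}
```

With a budget of two, toy_run(2, 2) returns 0 and toy_run(2, 3) returns
-EACCES, which on Linux is the error -13 reported in the client log.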


On the client I have seen this sequence twice:

Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_status (status 20)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_decode (status 20)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: retry stale creds
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 invalidating RPCSEC_GSS cred 880544ce4600
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 release request 8804062e7000
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserve (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 failed to lock transport 8808723c5800
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 sleep_on(queue "xprt_sending" time 25264836677)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 added to queue 8808723c5990 "xprt_sending"
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 __rpc_wake_up_task (now 25264836722)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 removed from queue 8808723c5990 "xprt_sending"
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 __rpc_execute flags=0x801
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserveresult (status -11)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_retry_reserve (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 reserved req 8806c2e01a00 xid 929383d1
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_reserveresult (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 call_refresh (status 0)
Feb 26 10:27:01 atsqa6c71 kernel: RPC: 39431 refreshing RPCSEC_GSS cred 88086f634240

On the third occurrence the filesystem operation failed:

Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 __rpc_execute flags=0x801
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 call_status (status 20)
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 call_decode (status 20)
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: call rejected 2
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 rpc_verify_header: call failed with error -13
Feb 26 10:28:25 atsqa6c71 kernel: RPC: 39431 return 0, status -13

Suppose the ticket has expired (but is still within its renewable
lifetime) and the server does not find the ticket in the cache on the
first call. Even so, the second and third calls ideally should not
fail, because the credentials were just refreshed through an upcall.
Either the creds being missing from the cache or a failing
svcauth_gss_accept_sec_context call could produce the REJECTEDCRED
error. Could you share some pointers on which is more likely, or on
whether something else could cause this issue?

Thanks,
Satya.

--
Check out the vibrant tech community on one of the world's most
engaging tech sites, SlashDot.org! http://sdm.link/slashdot
___
Nfs-ganesha-devel mailing list
Nfs-ganesha-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel


Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-06 Thread Satya Prakash GS
With libntirpc debugging enabled, I can see that all three retries
fail because the creds are not available in the cache. The credentials
are being removed by the reaper in authgss_ctx_gc_idle because of this
condition:

    abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen

From the code, I can see that only a further RPCSEC_GSS_INIT call from
the client can repopulate the credentials in the cache. I am not sure
how the server can tell the client to establish the context again.
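
To make the point about RPCSEC_GSS_INIT concrete, here is a toy model of
a server-side context cache. It is a sketch under assumptions, not
libntirpc code: every toy_ name is hypothetical, and the handle is passed
in by the caller for simplicity, whereas a real server assigns it during
context establishment. The model only shows that once an entry has been
reaped, retrying a data request cannot succeed until an init request
re-inserts the context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_SLOTS 8

/* Hypothetical procedure tags; the names echo RPCSEC_GSS data and init. */
enum toy_proc { TOY_PROC_DATA, TOY_PROC_INIT };

struct toy_ctx_cache {
    uint32_t handle[TOY_CACHE_SLOTS];
    bool     used[TOY_CACHE_SLOTS];
};

/* Return true if a context with this handle is currently cached. */
static bool toy_lookup(const struct toy_ctx_cache *c, uint32_t handle)
{
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++)
        if (c->used[i] && c->handle[i] == handle)
            return true;
    return false;
}

/* Drop a context, as an idle reaper would. */
static void toy_evict(struct toy_ctx_cache *c, uint32_t handle)
{
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++)
        if (c->used[i] && c->handle[i] == handle)
            c->used[i] = false;
}

/* Return true if the request is accepted. A data request only looks the
 * context up and never inserts it; only an init request inserts. */
static bool toy_request(struct toy_ctx_cache *c, enum toy_proc proc,
                        uint32_t handle)
{
    assert(c != NULL);
    if (proc == TOY_PROC_DATA)
        return toy_lookup(c, handle);

    if (toy_lookup(c, handle))
        return true;
    for (size_t i = 0; i < TOY_CACHE_SLOTS; i++) {
        if (!c->used[i]) {
            c->used[i] = true;
            c->handle[i] = handle;
            return true;
        }
    }
    return false; /* cache full */
}
```

In this model, three data requests in a row after toy_evict() all fail,
which mirrors the three failed retries seen in the debug output; a single
init request makes the next data request succeed.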

Any help is appreciated.

Thanks,
Satya.



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-06 Thread Matt Benjamin
Hi Satya,

Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would be 
correct.  If the client has just refreshed its credentials, why is it 
continuing to send with the expired context?

Matt


Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-07 Thread William Allen Simpson
On 3/6/17 6:58 PM, Matt Benjamin wrote:
> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would be 
> correct.  If the client has just refreshed its credentials, why is it 
> continuing to send with the expired context?
>
I don't know, but I'll take a look.  Now that we always have a server
for a client, perhaps the cache can be moved into a shared structure?



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-07 Thread William Allen Simpson
On 3/7/17 4:56 AM, William Allen Simpson wrote:
> On 3/6/17 6:58 PM, Matt Benjamin wrote:
>> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would be 
>> correct.  If the client has just refreshed its credentials, why is it 
>> continuing to send with the expired context?
>>
> I don't know, but I'll take a look.  Now that we always have a server
> for a client, perhaps the cache can be moved into a shared structure?

Sorry, thought that was our ntirpc client.  Looking back, that's the
kernel client.  Not much we can do about the kernel client other than
report a bug.



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-07 Thread Satya Prakash GS
>On 3/7/17 4:56 AM, William Allen Simpson wrote:
> On 3/6/17 6:58 PM, Matt Benjamin wrote:
>> Looking briefly at section 5.3.3.3 of rfc2203, it seems like that would be 
>> correct.  If the client has just refreshed its credentials, why is it 
>> continuing to send with the expired context?
>>

Thank you for the reply.

The client may not be sending expired credentials but it is supposed
to reestablish the credentials using
RPC_GSS_PROC_DESTROY/RPC_GSS_PROC_INIT which I guess is not happening.
I am continuing to debug this further.

As per the RFC, Ganesha is supposed to be throwing AUTH_REJECTEDCRED
instead of RPCSEC_GSS_CREDPROBLEM when it doesn't find credentials in
the cache.

However, the NFS client handles AUTH_REJECTEDCRED and
RPCSEC_GSS_CREDPROBLEM similarly, so I am not hopeful that this change
will help, but I can give it a try.

> I don't know, but I'll take a look.  Now that we always have a server
> for a client, perhaps the cache can be moved into a shared structure?

>Sorry, thought that was our ntirpc client.  Looking back, that's the
>kernel client.  Not much we can do about the kernel client other than
>report a bug.

William,
I want to be sure it's a client bug and not a Ganesha bug before
putting it on the kernel mailing list. Given that the issue is
reproducible two or three times a week, I wonder how it could have gone
unreported so far.

Regards,
Satya.




Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-09 Thread Satya Prakash GS
It looks like the gen field in svc_rpc_gss_data is used to check the
freshness of a context. However, it is not initialized to axp->gen in
authgss_ctx_hash_set.
Will this not result in entries being evicted early, or am I missing
something?

Thanks,
Satya.



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-10 Thread Satya Prakash GS
Is this a possibility:

The server first rejects a client op with CREDPROBLEM/REJECTEDCRED.
The client does an upcall and gssd initializes the context with the
server. However, the server recycles it immediately, before the
operation is retried. (It looks like there is a bug in the LRU
implementation in Ganesha. To make things worse, I enabled the server
debugs, which slowed down the client operations and made eviction of
the entry easier.) This happens thrice, failing the client op.

Thanks,
Satya.



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-10 Thread William Allen Simpson
I'm not familiar with this code, so not likely to be much help.
Looks mostly written by Matt, but Malahal made the most recent
changes in July 2016.

On 3/10/17 9:35 AM, Satya Prakash GS wrote:
> Is this a possibility :
>
> Server first rejects a client op with CREDPROBLEM/REJECTEDCRED,
> Client does an upcall and gssd initializes the context with the server.
> However the server recycles it immediately before the operation was
> retried (looks like there is a bug in the LRU implementation on
> Ganesha. To make things worse, I enabled the server debugs and it
> slowed down the client operations making the eviction of the entry
> easier). This happens thrice failing the client op.
>
Problem is not obvious.

axp->gen is initialized to zero with the rest of *axp -- mem_zalloc().

gd->gen is initialized to zero by alloc_svc_rpc_gss_data().

axp->gen is bumped by one (++) each time it is handled by LRU code in
authgss_ctx_hash_get().

atomic_inc_uint32_t(&gd->gen) is immediately after that.

You think gd->gen also needs to be set to axp->gen in _set()?

I'm not sure they are related.  There are many gd per axp, so
axp->gen could be much higher than gd->gen.

Both _get and _set are only called in svc_auth_gss.c _svcauth_gss().

Admittedly, it is hard to track that there are 2 fields both called gen.
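
The relationship between the two counters, as described in this message,
can be illustrated with a small model. This is a sketch based only on the
description above (both counters start at zero and each lookup bumps both
by one); the toy_ names are hypothetical and the real libntirpc code may
differ.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_CTX 64

struct toy_partition {
    uint32_t gen;                    /* stands in for axp->gen */
    uint32_t ctx_gen[TOY_MAX_CTX];   /* stands in for each gd->gen */
};

/* One lookup of context `i`: both counters are bumped by one. */
static void toy_get(struct toy_partition *p, unsigned i)
{
    assert(i < TOY_MAX_CTX);
    p->gen++;
    p->ctx_gen[i]++;
}

/* Look up `nctx` contexts round-robin for `rounds` rounds and return the
 * gap between the partition counter and context 0's counter. */
static uint32_t toy_gap_after(unsigned nctx, unsigned rounds)
{
    struct toy_partition p = {0};

    assert(nctx >= 1 && nctx <= TOY_MAX_CTX);
    for (unsigned r = 0; r < rounds; r++)
        for (unsigned i = 0; i < nctx; i++)
            toy_get(&p, i);
    return p.gen - p.ctx_gen[0];
}
```

Under this model a partition holding a single context never opens a gap,
while four contexts looked up in turn 500 times each leave the partition
counter 1500 ahead of any one context, which is more than the default
max_idle_gen of 1024 mentioned elsewhere in the thread.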



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-10 Thread Satya Prakash GS
On Sat, Mar 11, 2017 at 12:37 AM, William Allen Simpson
 wrote:
> I'm not familiar with this code, so not likely to be much help.
> Looks mostly written by Matt, but Malahal made the most recent
> changes in July 2016.
>
> On 3/10/17 9:35 AM, Satya Prakash GS wrote:
>>
>> Is this a possibility :
>>
>> Server first rejects a client op with CREDPROBLEM/REJECTEDCRED,
>> Client does an upcall and gssd initializes the context with the server.
>> However the server recycles it immediately before the operation was
>> retried (looks like there is a bug in the LRU implementation on
>> Ganesha. To make things worse, I enabled the server debugs and it
>> slowed down the client operations making the eviction of the entry
>> easier). This happens thrice failing the client op.
>>
> Problem is not obvious.
>
> axp->gen is initialized to zero with the rest of *axp -- mem_zalloc().
>
> gd->gen is initialized to zero by alloc_svc_rpc_gss_data().
>
> axp->gen is bumped by one (++) each time it is handled by LRU code in
> authgss_ctx_hash_get().
>

If a node's gen isn't getting incremented, it means that node is not
being looked up often.

> atomic_inc_uint32_t(&gd->gen) is immediately after that.
>
> You think gd->gen also needs to be set to axp->gen in _set()?
>

> I'm not sure they are related.  There are many gd per axp, so
> axp->gen could be much higher than gd->gen.
>

From authgss_ctx_gc_idle:

if (abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen) {
        /* remove the entry from the tree; gd is no longer in the
         * cache after this */
}

This translates to: gd hasn't been looked up in quite some time, so
let's clean it up.

(gss.max_idle_gen is set to 1024 by default.)

If the tree's gen is 5000 and a new node gets inserted into the tree,
the node's gen shouldn't start at 0, or it might satisfy the above
condition in the next authgss_ctx_gc_idle call.
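
The scenario above can be checked with a short sketch. It assumes the
idle test has the shape quoted above and the default max_idle_gen of
1024; toy_is_idle and toy_new_node_reaped are hypothetical helpers, not
libntirpc functions, and the subtraction is done in a wider signed type
so the example does not depend on how the real code handles unsigned
counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Idle test in the shape quoted above: true when the tree's gen and the
 * node's gen are more than max_idle_gen apart. */
static bool toy_is_idle(uint32_t tree_gen, uint32_t node_gen,
                        uint32_t max_idle_gen)
{
    int64_t diff = (int64_t)tree_gen - (int64_t)node_gen;

    if (diff < 0)
        diff = -diff;
    return diff > (int64_t)max_idle_gen;
}

/* Would a node inserted with `seed_gen` be reaped by a gc pass that runs
 * before the node is looked up again? */
static bool toy_new_node_reaped(uint32_t tree_gen, uint32_t seed_gen,
                                uint32_t max_idle_gen)
{
    assert(max_idle_gen > 0);
    return toy_is_idle(tree_gen, seed_gen, max_idle_gen);
}
```

With a tree gen of 5000, a node seeded at 0 satisfies the idle test
immediately, while a node seeded with the tree's current gen does not.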

> Both _get and _set are only called in svc_auth_gss.c _svcauth_gss().
>
> Admittedly, it is hard to track that there are 2 fields both called gen.

Thanks,
Satya.



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-11 Thread Malahal Naineni
gd->gen is not used in the latest code. If I remember correctly, there
was a bug that removed recently cached entries, resulting in permission
errors. What version are you using? Try using V2.5.

Regards, Malahal.

On Sat, Mar 11, 2017 at 12:54 AM, Satya Prakash GS
 wrote:
> On Sat, Mar 11, 2017 at 12:37 AM, William Allen Simpson
>  wrote:
>> I'm not familiar with this code, so not likely to be much help.
>> Looks mostly written by Matt, but Malahal made the most recent
>> changes in July 2016.
>>
>> On 3/10/17 9:35 AM, Satya Prakash GS wrote:
>>>
>>> Is this a possibility :
>>>
>>> Server first rejects a client op with CREDPROBLEM/REJECTEDCRED,
>>> Client does an upcall and gssd initializes the context with the server.
>>> However the server recycles it immediately before the operation was
>>> retried (looks like there is a bug in the LRU implementation on
>>> Ganesha. To make things worse, I enabled the server debugs and it
>>> slowed down the client operations making the eviction of the entry
>>> easier). This happens thrice failing the client op.
>>>
>> Problem is not obvious.
>>
>> axp->gen is initialized to zero with the rest of *axp -- mem_zalloc().
>>
>> gd->gen is initialized to zero by alloc_svc_rpc_gss_data().
>>
>> axp->gen is bumped by one (++) each time it is handled by LRU code in
>> authgss_ctx_hash_get().
>>
>
> If a node gen isn't getting incremented it means that node is not
> being looked up often.
>
>> atomic_inc_uint32_t(&gd->gen) is immediately after that.
>>
>> You think gd->gen also needs to be set to axp->gen in _set()?
>>
>
>> I'm not sure they are related.  There are many gd per axp, so
>> axp->gen could be much higher than gd->gen.
>>
>
> From authgss_ctx_gc_idle ->
>
> if (abs(axp->gen - gd->gen) > __svc_params->gss.max_idle_gen) {
> Remove the entry from the tree; //gd is no longer in the cache after this
> }
>
> This translates to: gd wasn't looked up in quite some time, so let's
> clean it up.
>
> //gss.max_idle_gen -> by default set to 1024
>
> If the tree's gen is 5000 and a new node gets inserted into the tree, the
> node's gen shouldn't start at 0, or it might satisfy the above condition
> in the next authgss_ctx_gc_idle call.
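
A minimal sketch of the eviction concern described above, not the libntirpc code itself. The struct names, MAX_IDLE_GEN, and the two insert helpers are hypothetical stand-ins; only the abs() comparison and the 1024 default come from the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the tree partition (axp) and a cached
 * context (gd); only the gen counters matter for this sketch. */
struct partition { uint32_t gen; };
struct ctx_entry { uint32_t gen; };

/* Default for gss.max_idle_gen quoted in the thread. */
#define MAX_IDLE_GEN 1024

/* Mirrors the quoted check: abs(axp->gen - gd->gen) > max_idle_gen. */
bool is_idle(const struct partition *axp, const struct ctx_entry *gd)
{
	int64_t lag = (int64_t)axp->gen - (int64_t)gd->gen;

	if (lag < 0)
		lag = -lag;
	return lag > MAX_IDLE_GEN;
}

/* Insert as reported: the entry keeps its zero-initialised gen. */
void insert_unseeded(const struct partition *axp, struct ctx_entry *gd)
{
	(void)axp;
	gd->gen = 0;
}

/* Insert with the proposed change: start at the partition's current gen. */
void insert_seeded(const struct partition *axp, struct ctx_entry *gd)
{
	gd->gen = axp->gen;
}
```

With the partition at gen 5000, an entry inserted through insert_unseeded() is already considered idle by is_idle(), while one inserted through insert_seeded() is not.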
>
>> Both _get and _set are only called in svc_auth_gss.c _svcauth_gss().
>>
>> Admittedly, it is hard to track that there are 2 fields both called gen.
>>
>>> Thanks,
>>> Satya.
>>>
>>> On Thu, Mar 9, 2017 at 8:07 PM, Satya Prakash GS
>>>  wrote:

>>>> Looks like the gen field in svc_rpc_gss_data is used to check the
>>>> freshness of a context. However it is not initialized to axp->gen in
>>>> authgss_ctx_hash_set.
>>>> Will this not result in evicting the entries out early or am I missing
>>>> something ?
>>>>
>>>> Thanks,
>>>> Satya.

> Thanks,
> Satya.


Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-11 Thread Satya Prakash GS
We are using 2.3-stable. Given that most of our testing has been done,
it's a bit difficult for us to move to 2.5 now, but we can take fixes
from 2.5.

I applied a fix similar to the existing one in 2.5, but I am running into
another issue around the ticket renew time. Operations are failing on
the client with an auth check failed error, -13 (Permission denied). I see
this after every ticket renewal:

27510445:442377371:Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 gss_validate
27510447:442377373:Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 gss_validate: gss_verify_mic returned error 0x000c
27510448:442377374:Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 gss_validate failed ret -13.
27510449:442377375:Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 rpc_verify_header: auth check failed with -13
27510450:442377376:Mar 10 08:37:21 atsqa8c43 kernel: RPC: 60114 rpc_verify_header: retrying

gss_verify_mic failed on the client with the error code set to
GSS_S_CONTEXT_EXPIRED. Either the server or the client is using the wrong
context while wrapping/unwrapping.
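
For reference, a small sketch of how a GSS-API major status maps to GSS_S_CONTEXT_EXPIRED, using the field layout from RFC 2744 (routine error in bits 16 to 23, CONTEXT_EXPIRED = 12). It assumes the "0x000c" in the log above is the upper half of the full status word 0x000c0000; the function and macro names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* RFC 2744 layout: the routine error is an 8-bit field at bit 16. */
#define ROUTINE_ERROR_OFFSET 16
#define ROUTINE_ERROR_MASK   0xffu

/* RFC 2744 routine error numbers relevant here. */
#define ROUTINE_CREDENTIALS_EXPIRED 11u
#define ROUTINE_CONTEXT_EXPIRED     12u

/* Extract the routine error field from a major status word. */
uint32_t routine_error(uint32_t major_status)
{
	return (major_status >> ROUTINE_ERROR_OFFSET) & ROUTINE_ERROR_MASK;
}

/* True when the status carries GSS_S_CONTEXT_EXPIRED. */
bool is_context_expired(uint32_t major_status)
{
	return routine_error(major_status) == ROUTINE_CONTEXT_EXPIRED;
}
```

Under that assumption, is_context_expired(0x000c0000) holds, which matches the -13 (EACCES) that gss_validate reports.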

Do you remember fixing a bug like that?

Thanks,
Satya.



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-12 Thread William Allen Simpson
On 3/11/17 8:15 AM, Satya Prakash GS wrote:
> We are using 2.3-stable. Given that most of our testing has been done
> it's a bit difficult for us to move to 2.5 now but we can take fixes
> from 2.5.
>
Sorry, I should have asked long ago what version you were using.

On this list, I always assume that you are using the most recent -dev
release.  There are an awful lot of bug fixes since 2.3.  Indeed, 2.4
was mostly a bug fix release, and 2.5 is supposed to be a performance
release (but has a fair number of bug fixes, too).



Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-12 Thread Malahal Naineni
>>  Indeed, 2.4 was mostly a bug fix release

Actually, 2.4 has a couple of big features as far as the ganesha project
is concerned, but Bill is probably indicating that the libntirpc
corresponding to ganesha 2.4 is mostly a bug fix release.

Regards, Malahal.




Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-13 Thread Satya Prakash GS
My bad, I should have mentioned the version in the original post.

Malahal was kind enough to share a list of relevant commits. Even with
the patches I continued to see the issue. I suspect the client code is not
handling GSS_S_CONTEXT_EXPIRED correctly on a call to gss_verify_mic.
Instead, I changed the server code to time out the ticket 5 minutes before
the actual timeout (Ganesha already times out the ticket 5 seconds
early).
So far the issue has not reproduced, but I will continue running
the test for a day or two before confirming whether the fix works. Do you
see any issue with this fix?
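
A rough sketch of the workaround described above, assuming the server compares the current time against the context end time minus a safety margin. The function name and the two margin macros are hypothetical; only the 5-second and 5-minute figures come from this message.

```c
#include <stdbool.h>

/* Margin the message says Ganesha already applies. */
#define EXISTING_MARGIN_SEC   5L
/* Margin used in the workaround described above. */
#define WORKAROUND_MARGIN_SEC (5L * 60L)

/* Treat the context as expired once we are within margin_sec of its
 * real end time. All values are in seconds on the same clock. */
bool ctx_expired(long now, long ctx_endtime, long margin_sec)
{
	return now >= ctx_endtime - margin_sec;
}
```

For a context ending at t=1000, a request at t=800 is still served with the 5-second margin but is treated as expired with the 5-minute margin, so the client is asked to refresh earlier.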

Thanks,
Satya.




Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-16 Thread Satya Prakash GS
Has anyone seen client ops failing with error -13 because of context
expiry on the client (gss_verify_mic fails)?
Surprisingly, even with little load, it's consistently reproducible on my setup.
Can someone point me to the relevant commits if this has already been fixed?

Thanks,
Satya.




Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-19 Thread Malahal Naineni
If I understand correctly, you have a renewable ticket and commands fail
when the ticket expires? I will let our folks test it. Do you have any
more details on reproducing this issue?




Re: [Nfs-ganesha-devel] Permission denied error with Kerberos enabled

2017-03-21 Thread Satya Prakash GS
Here are the reproduction steps:

I have 3 different servers hosting the NFS client, the server, and the KDC.
I set the ticket lifetime to 10 minutes on the client and server (in krb5.conf).
When adding a principal I specified "-maxlife "10 minutes"
-maxrenew 2017-04-30".
I set max_life to 10 minutes in the kdc.conf file.
I am using machine credentials on the client (running the operations as the root user).

Run iozone or bonnie from 2 different clients and you should see the
issue within an hour.

The issue seems to be with the clock skew, which is set to 5 minutes by default.
The server sees a context timeout of 15 minutes, while it should have
been 10 minutes (taking the clock skew into account).
The client rejects the server's messages once the context has been in use
for more than 10 minutes (on the server). This happens thrice and the user
operation fails.
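
The arithmetic above can be sketched as follows, assuming the server's 15-minute figure is the 10-minute ticket lifetime plus the 5-minute default clock skew. The enum and function names are illustrative only.

```c
#include <stdint.h>

/* Figures from the reproduction setup described above. */
#define TICKET_LIFETIME_SEC (10 * 60)
#define CLOCK_SKEW_SEC      (5 * 60)

enum reply_outcome {
	REPLY_ACCEPTED,            /* both sides consider the context valid */
	REPLY_REJECTED_BY_CLIENT,  /* server still uses it, client has expired it */
	CONTEXT_EXPIRED_ON_SERVER  /* server finally times the context out */
};

/* Classify a server reply by the age of the context it was signed with. */
enum reply_outcome classify_reply(uint32_t ctx_age_sec)
{
	uint32_t client_limit = TICKET_LIFETIME_SEC;
	uint32_t server_limit = TICKET_LIFETIME_SEC + CLOCK_SKEW_SEC;

	if (ctx_age_sec < client_limit)
		return REPLY_ACCEPTED;
	if (ctx_age_sec < server_limit)
		return REPLY_REJECTED_BY_CLIENT;
	return CONTEXT_EXPIRED_ON_SERVER;
}
```

Under these assumptions there is a 5-minute window, between minute 10 and minute 15 of the context's life, in which every reply falls into REPLY_REJECTED_BY_CLIENT.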

Please let me know if you need any other details.

Thanks,
Satya.


