Your NFS server is trying to connect to rpcbind on the NFS client machine
and this fails (times out, to be precise); the ::stacks output you posted
shows the NLM service thread sleeping in rpcbind_getaddr() while servicing
the lock request.  There might be various reasons for that.  One might be
that the NFS/NLM client didn't pass the proper client name in the NLM lock
request.  You could confirm that by running the following DTrace one-liner:

dtrace -n 'nlm_host_findcreate:entry {printf("NLM client: %s\n", stringof(arg1))}'

and try to reproduce again.
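
If you want a bit more context, here is a slightly extended variant (a
sketch only; it assumes the nlm_host_findcreate(g, name, netid, addr)
argument order in the illumos klmmod, so arg2 would be the netid string):

dtrace -n 'fbt::nlm_host_findcreate:entry
{
        /* arg1 = client name, arg2 = netid string (assumed order) */
        printf("NLM client: %s via %s\n", stringof(arg1), stringof(arg2));
}'

If the printed client name is empty or not resolvable from the server,
that would explain why the server then gets stuck trying to contact the
client's rpcbind.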

The other possible reason is that outgoing communication from the NFS server
to the NFS client is blocked, for example by a firewall.
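
You can quickly test that from the NFS server by probing the client's
rpcbind and NLM service directly (172.27.72.26 is just the client address
taken from your snoop output below; substitute your own):

rpcinfo -p 172.27.72.26
rpcinfo -T tcp 172.27.72.26 nlockmgr

If these hang or time out in the same way, the problem is in the network
path (or a packet filter) between the server and the client rather than in
NLM itself.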


HTH.


On Wed, Jan 28, 2015 at 10:29:53AM -0800, Joe Little wrote:
> Just bumped up the threads from 80 to 1024 (didn't help). Here are the
> details:
> 
> root@miele:/root# echo "::svc_pool nlm" | mdb -k
> mdb: failed to add kvm_pte_chain walker: walk name already in use
> mdb: failed to add kvm_rmap_desc walker: walk name already in use
> mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> mdb: failed to add kvm_pte_chain walker: walk name already in use
> mdb: failed to add kvm_rmap_desc walker: walk name already in use
> mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> SVCPOOL = ffffff07fc281aa8 -> POOL ID = NLM(2)
> Non detached threads    = 1
> Detached threads        = 0
> Max threads             = 1024
> `redline'               = 1
> Reserved threads        = 0
> Thread lock     = mutex not held
> Asleep threads          = 0
> Request lock    = mutex not held
> Pending requests        = 0
> Walking threads         = 0
> Max requests from xprt  = 8
> Stack size for svc_run  = 0
> Creator lock    = mutex not held
> No of Master xprt's     = 4
> rwlock for the mxprtlist= owner 0
> master xprt list ptr    = ffffff079cbc3800
> 
> root@miele:/root# echo "::stacks -m klmmod" | mdb -k
> mdb: failed to add kvm_pte_chain walker: walk name already in use
> mdb: failed to add kvm_rmap_desc walker: walk name already in use
> mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> mdb: failed to add kvm_pte_chain walker: walk name already in use
> mdb: failed to add kvm_rmap_desc walker: walk name already in use
> mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> THREAD           STATE    SOBJ                COUNT
> ffffff09adf52880 SLEEP    CV                      1
>                  swtch+0x141
>                  cv_timedwait_hires+0xec
>                  cv_reltimedwait+0x51
>                  waitforack+0x5c
>                  connmgr_connect+0x131
>                  connmgr_wrapconnect+0x138
>                  connmgr_get+0x9dc
>                  connmgr_wrapget+0x63
>                  clnt_cots_kcallit+0x18f
>                  rpcbind_getaddr+0x245
>                  update_host_rpcbinding+0x4f
>                  nlm_host_get_rpc+0x6d
>                  nlm_do_lock+0x10d
>                  nlm4_lock_4_svc+0x2a
>                  nlm_dispatch+0xe6
>                  nlm_prog_4+0x34
>                  svc_getreq+0x1c1
>                  svc_run+0x146
>                  svc_do_run+0x8e
>                  nfssys+0xf1
>                  _sys_sysenter_post_swapgs+0x149
> 
> ffffff002e1fdc40 SLEEP    CV                      1
>                  swtch+0x141
>                  cv_timedwait_hires+0xec
>                  cv_timedwait+0x5c
>                  nlm_gc+0x54
>                  thread_start+8
> 
> The file server is using VLANs and from the snoop result it seems to be
> responding with NFS4 locks for a v3 request?
> 
> 
> root@miele:/root# snoop -r -i snoop.out
> 
>   1   0.00000 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3712 FH=97AA PID=2 Region=0:0
>   2   9.99901 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3712 FH=97AA PID=2 Region=0:0 (retransmit)
>   3  19.99987 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3712 FH=97AA PID=2 Region=0:0 (retransmit)
>   4  15.00451 VLAN#3812: 172.27.72.15 -> 172.27.72.26 NLM R LOCK4 OH=3712 denied (no locks)
>   5   1.70920 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3812 FH=97AA PID=3 Region=0:0
>   6   9.99927 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3812 FH=97AA PID=3 Region=0:0 (retransmit)
>   7  20.00018 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3812 FH=97AA PID=3 Region=0:0 (retransmit)
> 
> On Wed, Jan 28, 2015 at 10:22 AM, Marcel Telka <[email protected]> wrote:
> 
> > Please show me the output from the following commands on the affected
> > machine:
> >
> > echo "::svc_pool nlm" | mdb -k
> > echo "::stacks -m klmmod" | mdb -k
> >
> > Then run the following command:
> >
> > snoop -o snoop.out rpc nlockmgr
> >
> > Then reproduce the problem (to see the ENOLCK error) and then Ctrl+C the
> > snoop.
> > Make sure there is something in the snoop.out file and send the file to me.
> >
> >
> > Thanks.
> >
> >
> > On Wed, Jan 28, 2015 at 10:06:39AM -0800, Joe Little via illumos-discuss
> > wrote:
> > > Forwarded this as requested by OmniTI
> > >
> > > ---------- Forwarded message ----------
> > > From: Joe Little <[email protected]>
> > > Date: Wed, Jan 28, 2015 at 8:49 AM
> > > Subject: NFS v3 locking broken in latest OmniOS r151012 and updates
> > > To: [email protected]
> > >
> > >
> > > I recently switched one file server from Nexenta 3 and then Nexenta 4
> > > Community (which still uses the closed NLM, I believe) to OmniOS r151012.
> > >
> > > Immediately, users started to complain from various Linux clients that
> > > locking was failing. Most of those clients explicitly set their NFS
> > > version to 3. I finally isolated that the locking does not fail on NFS
> > > v4 and have worked on transitioning where possible. But presently, no
> > > NFS v3 client can successfully lock against the OmniOS NFS v3 locking
> > > service. I've confirmed that the locking service is running and present
> > > using rpcinfo, matching one for one the services from previous
> > > OpenSolaris and illumos variants. One example from a user:
> > >
> > > $ strace /bin/tcsh
> > >
> > > [...]
> > >
> > > open("/home/REDACTED/.history", O_RDWR|O_CREAT, 0600) = 0
> > > dup(0)                                  = 1
> > > dup(1)                                  = 2
> > > dup(2)                                  = 3
> > > dup(3)                                  = 4
> > > dup(4)                                  = 5
> > > dup(5)                                  = 6
> > > close(5)                                = 0
> > > close(4)                                = 0
> > > close(3)                                = 0
> > > close(2)                                = 0
> > > close(1)                                = 0
> > > close(0)                                = 0
> > > fcntl(6, F_SETFD, FD_CLOEXEC)           = 0
> > > fcntl(6, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0})
> > >
> > > HERE fcntl hangs for 1-2 min and finally returns with -1 ENOLCK (No
> > > locks available)
> >
> > --
> > +-------------------------------------------+
> > | Marcel Telka   e-mail:   [email protected]  |
> > |                homepage: http://telka.sk/ |
> > |                jabber:   [email protected] |
> > +-------------------------------------------+
> >



-- 
+-------------------------------------------+
| Marcel Telka   e-mail:   [email protected]  |
|                homepage: http://telka.sk/ |
|                jabber:   [email protected] |
+-------------------------------------------+

