It appears that two of the affected hosts had iptables rules blocking the
return connection; the third host does not. I'm investigating further. The
behavior did change, though, going from the old OpenSolaris-based NLM and
the NetApp filers to the newer illumos NLM, even though those firewall
rules have been in place for quite some time.
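
If the firewall is indeed the culprit, a rule along these lines on each
affected Linux client should let the OmniOS server reach the client's
rpcbind for the NLM callback (the addresses come from the snoop output
quoted below, where 172.27.72.15 is the server; this is only a sketch):

# allow the NFS server to reach rpcbind on the client
iptables -I INPUT -p tcp -s 172.27.72.15 --dport 111 -j ACCEPT
iptables -I INPUT -p udp -s 172.27.72.15 --dport 111 -j ACCEPT

The client's lockd itself listens on a dynamic port unless it is pinned
down with the lockd module options (nlm_tcpport/nlm_udpport), so that port
may need opening as well.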


On Wed, Jan 28, 2015 at 10:52 AM, Marcel Telka <[email protected]> wrote:

> Your NFS server is trying to connect to the rpcbind on the NFS client
> machine and this fails (or times out, to be precise).  There might be
> various reasons for that.  One might be that the NFS/NLM client didn't
> pass the proper client name in the NLM lock request.  You could confirm
> that by running the following dtrace one-liner:
>
> dtrace -n 'nlm_host_findcreate:entry {printf("NLM client: %s\n", stringof(arg1))}'
>
> and try to reproduce again.
>
> The other reason might be that outgoing communication from the NFS server
> to the NFS client is blocked, or something along those lines.
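>
> A quick way to check that from the NFS server side is to query the
> client's rpcbind directly, for example:
>
> rpcinfo -p <client>
> rpcinfo -T tcp <client> nlockmgr
>
> If either of these hangs or times out, the path from the server back to
> the client is blocked somewhere.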
>
>
> HTH.
>
>
> On Wed, Jan 28, 2015 at 10:29:53AM -0800, Joe Little wrote:
> > Just bumped the threads up from 80 to 1024 (it didn't help).  Here are
> > the details you asked for:
> >
> > root@miele:/root# echo "::svc_pool nlm" | mdb -k
> >
> > mdb: failed to add kvm_pte_chain walker: walk name already in use
> >
> > mdb: failed to add kvm_rmap_desc walker: walk name already in use
> >
> > mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> >
> > mdb: failed to add kvm_pte_chain walker: walk name already in use
> >
> > mdb: failed to add kvm_rmap_desc walker: walk name already in use
> >
> > mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> >
> > SVCPOOL = ffffff07fc281aa8 -> POOL ID = NLM(2)
> >
> > Non detached threads    = 1
> >
> > Detached threads        = 0
> >
> > Max threads             = 1024
> >
> > `redline'               = 1
> >
> > Reserved threads        = 0
> >
> > Thread lock     = mutex not held
> >
> > Asleep threads          = 0
> >
> > Request lock    = mutex not held
> >
> > Pending requests        = 0
> >
> > Walking threads         = 0
> >
> > Max requests from xprt  = 8
> >
> > Stack size for svc_run  = 0
> >
> > Creator lock    = mutex not held
> >
> > No of Master xprt's     = 4
> >
> > rwlock for the mxprtlist= owner 0
> >
> > master xprt list ptr    = ffffff079cbc3800
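> >
> > (For reference, the NLM thread limit on illumos can be raised with
> > sharectl, e.g.:
> >
> > sharectl set -p lockd_servers=1024 nfs
> >
> > though I'm not certain that's the exact knob that was changed here.)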
> >
> >
> > root@miele:/root# echo "::stacks -m klmmod" | mdb -k
> >
> > mdb: failed to add kvm_pte_chain walker: walk name already in use
> >
> > mdb: failed to add kvm_rmap_desc walker: walk name already in use
> >
> > mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> >
> > mdb: failed to add kvm_pte_chain walker: walk name already in use
> >
> > mdb: failed to add kvm_rmap_desc walker: walk name already in use
> >
> > mdb: failed to add kvm_mmu_page_header walker: walk name already in use
> >
> > THREAD           STATE    SOBJ                COUNT
> >
> > ffffff09adf52880 SLEEP    CV                      1
> >
> >                  swtch+0x141
> >
> >                  cv_timedwait_hires+0xec
> >
> >                  cv_reltimedwait+0x51
> >
> >                  waitforack+0x5c
> >
> >                  connmgr_connect+0x131
> >
> >                  connmgr_wrapconnect+0x138
> >
> >                  connmgr_get+0x9dc
> >
> >                  connmgr_wrapget+0x63
> >
> >                  clnt_cots_kcallit+0x18f
> >
> >                  rpcbind_getaddr+0x245
> >
> >                  update_host_rpcbinding+0x4f
> >
> >                  nlm_host_get_rpc+0x6d
> >
> >                  nlm_do_lock+0x10d
> >
> >                  nlm4_lock_4_svc+0x2a
> >
> >                  nlm_dispatch+0xe6
> >
> >                  nlm_prog_4+0x34
> >
> >                  svc_getreq+0x1c1
> >
> >                  svc_run+0x146
> >
> >                  svc_do_run+0x8e
> >
> >                  nfssys+0xf1
> >
> >                  _sys_sysenter_post_swapgs+0x149
> >
> >
> > ffffff002e1fdc40 SLEEP    CV                      1
> >
> >                  swtch+0x141
> >
> >                  cv_timedwait_hires+0xec
> >
> >                  cv_timedwait+0x5c
> >
> >                  nlm_gc+0x54
> >
> >                  thread_start+8
> >
> >
> > The file server is using VLANs, and from the snoop output it looks like
> > it is responding with NFS4 locks to a v3 request?
> >
> >
> > root@miele:/root# snoop -r -i snoop.out
> >   1   0.00000 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3712 FH=97AA PID=2 Region=0:0
> >   2   9.99901 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3712 FH=97AA PID=2 Region=0:0 (retransmit)
> >   3  19.99987 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3712 FH=97AA PID=2 Region=0:0 (retransmit)
> >   4  15.00451 VLAN#3812: 172.27.72.15 -> 172.27.72.26 NLM R LOCK4 OH=3712 denied (no locks)
> >   5   1.70920 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3812 FH=97AA PID=3 Region=0:0
> >   6   9.99927 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3812 FH=97AA PID=3 Region=0:0 (retransmit)
> >   7  20.00018 VLAN#3812: 172.27.72.26 -> 172.27.72.15 NLM C LOCK4 OH=3812 FH=97AA PID=3 Region=0:0 (retransmit)
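> >
> > The full decode of any of these packets (e.g. the denied reply, packet 4)
> > can be pulled from the capture with snoop's verbose mode:
> >
> > snoop -v -i snoop.out -p 4,4
> >
> > which shows the RPC program and version fields, so it should settle
> > whether this really is NLM v4 on the wire.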
> >
> >
> >
> >
> >
> > On Wed, Jan 28, 2015 at 10:22 AM, Marcel Telka <[email protected]> wrote:
> >
> > > Please show me the output from the following commands on the affected
> > > machine:
> > >
> > > echo "::svc_pool nlm" | mdb -k
> > > echo "::stacks -m klmmod" | mdb -k
> > >
> > > Then run the following command:
> > >
> > > snoop -o snoop.out rpc nlockmgr
> > >
> > > Then reproduce the problem (to see the ENOLCK error) and then Ctrl+C the
> > > snoop.  Make sure there is something in the snoop.out file and send the
> > > file to me.
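> > >
> > > (A quick sanity check that the capture actually caught something:
> > >
> > > snoop -i snoop.out | wc -l
> > >
> > > should print a non-zero packet count.)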
> > >
> > >
> > > Thanks.
> > >
> > >
> > > On Wed, Jan 28, 2015 at 10:06:39AM -0800, Joe Little via illumos-discuss wrote:
> > > > Forwarded this as requested by OmniTI
> > > >
> > > > ---------- Forwarded message ----------
> > > > From: Joe Little <[email protected]>
> > > > Date: Wed, Jan 28, 2015 at 8:49 AM
> > > > Subject: NFS v3 locking broken in latest OmniOS r151012 and updates
> > > > To: [email protected]
> > > >
> > > >
> > > > I recently switched one file server from Nexenta 3, and then Nexenta 4
> > > > Community (which I believe still uses the closed NLM), to OmniOS r151012.
> > > >
> > > > Users on various Linux clients immediately started to complain that
> > > > locking was failing.  Most of those clients explicitly set their NFS
> > > > version to 3.  I eventually isolated the problem: locking does not fail
> > > > over NFS v4, and I have been transitioning clients to v4 where possible.
> > > > But at present, no NFS v3 client can successfully lock against the
> > > > OmniOS NFS v3 locking service.  Using rpcinfo, I've confirmed that the
> > > > locking service is running and registered, matching one-for-one the
> > > > services on previous OpenSolaris and illumos variants.  One example
> > > > from a user:
> > > >
> > > > $ strace /bin/tcsh
> > > > [...]
> > > > open("/home/REDACTED/.history", O_RDWR|O_CREAT, 0600) = 0
> > > > dup(0)                                  = 1
> > > > dup(1)                                  = 2
> > > > dup(2)                                  = 3
> > > > dup(3)                                  = 4
> > > > dup(4)                                  = 5
> > > > dup(5)                                  = 6
> > > > close(5)                                = 0
> > > > close(4)                                = 0
> > > > close(3)                                = 0
> > > > close(2)                                = 0
> > > > close(1)                                = 0
> > > > close(0)                                = 0
> > > > fcntl(6, F_SETFD, FD_CLOEXEC)           = 0
> > > > fcntl(6, F_SETLKW, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0})
> > > >
> > > > HERE fcntl hangs for 1-2 min and finally returns with "-1 ENOLCK (No
> > > > locks available)"
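> > > >
> > > > (For comparison, the same clients lock fine when the mount is forced to
> > > > NFSv4, e.g. something like:
> > > >
> > > > mount -t nfs -o vers=4 server:/export /home
> > > >
> > > > where the server and export names are placeholders; only vers=3 mounts
> > > > show the ENOLCK behavior.)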
> > > >
> > > >
> > > >
> > >
> > > --
> > > +-------------------------------------------+
> > > | Marcel Telka   e-mail:   [email protected]  |
> > > |                homepage: http://telka.sk/ |
> > > |                jabber:   [email protected] |
> > > +-------------------------------------------+
> > >
>
>
>
> --
> +-------------------------------------------+
> | Marcel Telka   e-mail:   [email protected]  |
> |                homepage: http://telka.sk/ |
> |                jabber:   [email protected] |
> +-------------------------------------------+
>


