I'm observing some strange 3 minute slow response times for
utimes() (setattr) calls on an NFSv4 filesystem, using an
amd64 solaris express nfs4 server box running snv_77 (and
bfu'ed to recent [=mid/end december 2007] opensolaris bits).
nfs4 clients are running either snv_77 x86, or snv_72 sparc,
or s10u2 x86.
It seems the issue is not 100% reproducible, but at least on
two different nfs4 servers I'm observing the problem quite often.
I can reproduce the issue like this:
1. On the snv_77 nfs4 server, create a new file on a local nfs shared
filesystem (I tested both with zfs and ufs):
nfs4server% rm foobar
nfs4server% date > foobar
2. On a nfs4 client, first read the "foobar" file from the nfs4 server
that was created in step 1., then change the timestamps on the file:
nfs4client% cat foobar
Thu Jan 3 17:24:29 CET 2008
nfs4client% time touch -t 01011200 foobar
0.00u 0.00s 3:00.10 0.0%
Note that the touch command is stuck for 3 minutes until the
command prompt returns. While touch is stuck, the nfs4 client
gets "NFS server nfs4server not responding; still trying" messages
logged to /var/adm/messages.
snoop has traced this:
1 0.00000 nfs4client -> nfs4server NFS C 4 (lookup valid) PUTFH FH=BCDC
NVERIFY GETATTR 10011a b0a23a ACCESS rd,lk,mo,ext,dl LOOKUP tmp GETFH GETATTR
10011...
2 0.00031 nfs4server -> nfs4client NFS R 4 (lookup valid) NFS4ERR_SAME
PUTFH NFS4_OK NVERIFY NFS4ERR_SAME
3 0.00009 nfs4client -> nfs4server NFS C 4 (lookup valid) PUTFH FH=A8C7
NVERIFY GETATTR 10011a b0a23a ACCESS rd,lk,mo,ext,dl LOOKUP foobar GETFH
GETATTR 10...
4 0.00032 nfs4server -> nfs4client NFS R 4 (lookup valid) NFS4_OK PUTFH
NFS4_OK NVERIFY NFS4_OK GETATTR NFS4_OK ACCESS NFS4_OK Supp=rd,lk,mo,ext,dl
Allow=rd,lk,mo,...
5 0.00308 nfs4client -> nfs4server NFS C 4 (access ) PUTFH FH=C2F9
ACCESS rd,mo,ext,exc GETATTR 10011a b0a23a
6 0.00014 nfs4server -> nfs4client NFS R 4 (access ) NFS4_OK PUTFH
NFS4_OK ACCESS NFS4_OK Supp=rd,mo,ext,exc Allow=rd,mo,ext GETATTR NFS4_OK
7 0.00010 nfs4client -> nfs4server NFS C 4 (open ) PUTFH FH=A8C7
OPEN foobar OT=NC SQ=4 CT=N AC=R DN=N OO=0090 GETFH GETATTR 10011a b0a23a
8 0.00010 nfs4server -> nfs4client NFS R 4 (open ) NFS4ERR_EXPIRED
PUTFH NFS4_OK OPEN NFS4ERR_EXPIRED
9 0.00021 nfs4client -> nfs4server NFS C 4 (setclientid ) PUTROOTFH GETATTR
400 0 SETCLIENTID Prog=1073741825 ID=tcp Addr=172.20.0.11.182.32
CBID=1073741825
10 0.00012 nfs4server -> nfs4client NFS R 4 (setclientid ) NFS4_OK PUTROOTFH
NFS4_OK GETATTR NFS4_OK SETCLIENTID NFS4_OK CL=8f477b7ccf CFV=000000000000008F
11 0.00005 nfs4client -> nfs4server NFS C 4 (sclntid_conf)
SETCLIENTID_CONFIRM CL=8f477b7ccf CFV=000000000000008F
12 0.03031 nfs4server -> nfs4client NFS R 4 (sclntid_conf) NFS4_OK
SETCLIENTID_CONFIRM NFS4_OK
13 0.00024 nfs4server -> nfs4client NFS C CB_NULL
14 0.00004 nfs4client -> nfs4server TCP D=57992 S=46624 Ack=3639677124
Seq=1965079672 Len=0 Win=49640
15 0.00019 nfs4client -> nfs4server NFS R CB_NULL
16 0.00009 nfs4server -> nfs4client TCP D=46624 S=57992 Ack=1965079700
Seq=3639677124 Len=0 Win=49640
17 0.00393 nfs4client -> nfs4server NFS C 4 (open ) PUTFH FH=A8C7
OPEN foobar OT=NC SQ=5 CT=N AC=R DN=N OO=0090 GETFH GETATTR 10011a b0a23a
18 0.00021 nfs4server -> nfs4client NFS R 4 (open ) NFS4_OK PUTFH
NFS4_OK OPEN NFS4_OK ST=1840:1 RF=CF,PL DT=R DST=18B9:0 GETFH NFS4_OK FH=C2F9
GETATTR NFS4_OK
19 0.00008 nfs4client -> nfs4server NFS C 4 (open_confirm) PUTFH FH=C2F9
OPEN_CONFIRM SQ=6 OST=1840:1
20 0.00011 nfs4server -> nfs4client NFS R 4 (open_confirm) NFS4_OK PUTFH
NFS4_OK OPEN_CONFIRM NFS4_OK OST=1840:2
21 0.00014 nfs4client -> nfs4server NFS C 4 (read ) PUTFH FH=C2F9
READ ST=18B9:0 at 0 for 4096
22 0.00015 nfs4server -> nfs4client NFS R 4 (read ) NFS4_OK PUTFH
NFS4_OK READ NFS4_OK (29 bytes) EOF
23 0.00014 nfs4client -> nfs4server NFS C 4 (close ) PUTFH FH=C2F9
GETATTR 10011a b0a23a CLOSE SQ=7 OST=1840:2
24 0.00011 nfs4server -> nfs4client NFS R 4 (close ) NFS4_OK PUTFH
NFS4_OK GETATTR NFS4_OK CLOSE OST=1840:3
25 0.01700 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639480833
Seq=1964998148 Len=0 Win=49640
26 2.31082 nfs4client -> nfs4server NFS C 4 (lookup ) PUTFH FH=BCDC
SAVEFH LOOKUP tmp GETFH GETATTR 10011a b0a23a RESTOREFH NVERIFY GETATTR 10011a
b0a2...
27 0.00019 nfs4server -> nfs4client NFS R 4 (lookup ) NFS4ERR_SAME
PUTFH NFS4_OK SAVEFH NFS4_OK LOOKUP NFS4_OK GETFH NFS4_OK FH=A8C7 GETATTR
NFS4_OK RESTOREFH NFS4_...
28 0.00017 nfs4client -> nfs4server NFS C 4 (setattr ) PUTFH FH=C2F9
SETATTR ST=SPC0 GETATTR 10011a b0a23a
29 0.00026 nfs4server -> nfs4client NFS C CB4 (cb_recall) CBID=1073741825
CB_RECALL FH=C2F9 ST=18B9:0 TR=F
30 0.00014 nfs4client -> nfs4server NFS R CB4 (cb_recall) NFS4_OK CB_RECALL
NFS4_OK
31 0.05435 nfs4server -> nfs4client TCP D=997 S=2049 Ack=1964998676
Seq=3639481153 Len=0 Win=49640
32 0.00004 nfs4server -> nfs4client TCP D=46624 S=57992 Ack=1965079760
Seq=3639677288 Len=0 Win=49640
33 40.03157 nfs4client -> nfs4server NFS C 4 (renew ) RENEW
CL=8f477b7ccf
34 0.00030 nfs4server -> nfs4client NFS R 4 (renew ) NFS4_OK RENEW
NFS4_OK
35 0.05962 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639481213
Seq=1964998784 Len=0 Win=49640
36 41.93759 nfs4client -> nfs4server NFS C 4 (renew ) RENEW
CL=8f477b7ccf
37 0.00029 nfs4server -> nfs4client NFS R 4 (renew ) NFS4_OK RENEW
NFS4_OK
38 0.05965 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639481273
Seq=1964998892 Len=0 Win=49640
39 7.94707 nfs4server -> nfs4client NFS C CB4 (cb_recall) CBID=1073741825
CB_RECALL FH=C2F9 ST=18B9:0 TR=F
40 0.00022 nfs4client -> nfs4server NFS R CB4 (cb_recall) NFS4_OK CB_RECALL
NFS4_OK
41 0.05962 nfs4server -> nfs4client TCP D=46624 S=57992 Ack=1965079820
Seq=3639677452 Len=0 Win=49640
42 31.30150 nfs4client -> nfs4server NIS C MATCH www.opensolaris.org in
ipnodes.byname
43 0.00086 nfs4server -> nfs4client NIS R MATCH No more entries
44 0.00590 nfs4server -> nfs4client RPC R XID=1198815295 Success
45 1.67215 nfs4client -> nfs4server NIS C MATCH sunopensolaris.112.2o7.net
in ipnodes.byname
46 0.00476 nfs4server -> nfs4client NIS R MATCH No such key
47 0.94554 nfs4client -> nfs4server NFS C 4 (renew ) RENEW
CL=8f477b7ccf
48 0.00031 nfs4server -> nfs4client NFS R 4 (renew ) NFS4_OK RENEW
NFS4_OK
49 0.05960 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639481333
Seq=1964999000 Len=0 Win=49640
50 39.93862 nfs4client -> nfs4server NIS C MATCH imap in ipnodes.byname
51 0.00483 nfs4server -> nfs4client NIS R MATCH No such key
52 0.00046 nfs4client -> nfs4server NIS C MATCH imap in ipnodes.byname
53 0.00037 nfs4server -> nfs4client NIS R MATCH No such key
54 0.00014 nfs4client -> nfs4server NIS C MATCH imap in hosts.byname
55 0.00030 nfs4server -> nfs4client NIS R MATCH OK
56 1.99283 nfs4client -> nfs4server NFS C 4 (renew ) RENEW
CL=8f477b7ccf
57 0.00031 nfs4server -> nfs4client NFS R 4 (renew ) NFS4_OK RENEW
NFS4_OK
58 0.05963 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639481393
Seq=1964999108 Len=0 Win=49640
59 4.61004 nfs4client -> nfs4server NIS C MATCH jk in passwd.byname
60 0.00051 nfs4server -> nfs4client NIS R MATCH OK
61 9.23876 nfs4client -> nfs4server NFS C 4 (setattr ) PUTFH FH=C2F9
SETATTR ST=SPC0 GETATTR 10011a b0a23a (retransmit)
62 0.05833 nfs4server -> nfs4client TCP D=997 S=2049 Ack=1964999368
Seq=3639481393 Len=0 Win=49640
63 0.06279 nfs4server -> nfs4client NFS R 4 (setattr ) NFS4_OK PUTFH
NFS4_OK SETATTR 0 410000 GETATTR NFS4_OK
64 0.00020 nfs4client -> nfs4server NFS C 4 (delegreturn ) PUTFH FH=C2F9
GETATTR 10011a b0a23a DELEGRETURN DST=18B9:0
65 0.00021 nfs4server -> nfs4client NFS R 4 (delegreturn )
NFS4ERR_BAD_STATEID PUTFH NFS4_OK GETATTR NFS4_OK DELEGRETURN
NFS4ERR_BAD_STATEID
66 0.00103 nfs4client -> nfs4server NIS C MATCH inbox in auto.home
67 0.00057 nfs4server -> nfs4client NIS R MATCH OK
68 0.00094 nfs4client -> nfs4server NIS C MATCH imap in ipnodes.byname
69 0.00324 nfs4server -> nfs4client NIS R MATCH No such key
70 0.00100 nfs4client -> nfs4server NIS C MATCH imap in ipnodes.byname
71 0.00038 nfs4server -> nfs4client NIS R MATCH No such key
72 0.02302 nfs4client -> nfs4server NIS C MATCH tiger2 in hosts.byname
73 0.00028 nfs4server -> nfs4client NIS R MATCH OK
74 0.02782 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639481893
Seq=1964999580 Len=0 Win=49640
75 27.90841 nfs4client -> nfs4server NFS C 4 (renew ) RENEW
CL=8f477b7ccf
76 0.00023 nfs4server -> nfs4client NFS R 4 (renew ) NFS4_OK RENEW
NFS4_OK
77 0.05969 nfs4client -> nfs4server TCP D=2049 S=997 Ack=3639481953
Seq=1964999688 Len=0 Win=49640
Problem seems to start with the (setattr) packet #28, which doesn't
get a reply from the server until the client retransmits (packet #61/#63).
Instead of a reply for packet#28, the nfs4 server is sending a
(cb_recall) packet #29 to the nfs4 client (to revoke the read delegation?),
but the client doesn't send the (delegreturn) until packet #64
The (delegreturn) seems to be delayed on the client, because the
thread executing the nfs4delegreturn_impl() function is stuck in
usr/src/uts/common/fs/nfs/nfs4_callback.c, with one reader blocking
the r_deleg_recall_lock:
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/nfs/nfs4_callback.c#1633
1633 /*
1634 * Take r_deleg_recall_lock in WRITE mode, this will prevent
1635 * nfs4_is_otw_open_necessary from trying to use the delegation
1636 * while the DELEGRETURN is in progress.
1637 */
1638 (void) nfs_rw_enter_sig(&rp->r_deleg_recall_lock, RW_WRITER,
FALSE);
Can anyone reproduce this?
This message posted from opensolaris.org