Re: Why is NFSv4 so slow?

2010-09-13 Thread Oliver Fromme
Rick Macklem wrote:
  Btw, if anyone who didn't see the posting on freebsd-fs and would
  like to run a quick test, it would be appreciated.
  Basically do both kinds of mount using a FreeBSD 8.1 or later client
  and then read a file larger than 100 Mbytes with dd.
  
  # mount -t nfs -o nfsv3 server:/path /mnt-path
  - cd anywhere in the mount that has a >100Mbyte file
  # dd if=file of=/dev/null bs=1m
  # umount /mnt-path
  
  Then repeat with
  # mount -t newnfs -o nfsv3 server:/path /mnt-path
  
  and post the results along with the client machine's info
  (machine arch/# of cores/memory/net interface used for NFS traffic).
  
  Thanks in advance to anyone who runs the test, rick

Ok ...

NFS server:
 - FreeBSD 8.1-PRERELEASE-20100620 i386
 - Intel Atom 330 (1.6 GHz dual-core with HT -- 4-way SMP)
 - 4 GB RAM
 - re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet

NFS client:
 - FreeBSD 8.1-STABLE-20100908 i386
 - AMD Phenom II X6 1055T (2.8 GHz + Turbo Core, six-core)
 - 4 GB RAM
 - re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet

The machines are connected through a Netgear GS108T
gigabit ethernet switch.

I unmounted and re-mounted the NFS path after every single
dd(1) command, so the data actually comes from the server
instead of from the local cache.  I also made sure that
the file was in the cache on the server, so the server's
disk speed is irrelevant.

Testing with mount -t nfs:

183649990 bytes transferred in 2.596677 secs (70725002 bytes/sec)
183649990 bytes transferred in 2.578746 secs (71216779 bytes/sec)
183649990 bytes transferred in 2.561857 secs (71686277 bytes/sec)
183649990 bytes transferred in 2.629028 secs (69854708 bytes/sec)
183649990 bytes transferred in 2.535422 secs (72433702 bytes/sec)

Testing with mount -t newnfs:

183649990 bytes transferred in 5.361544 secs (34253192 bytes/sec)
183649990 bytes transferred in 5.401471 secs (3396 bytes/sec)
183649990 bytes transferred in 5.052138 secs (36350946 bytes/sec)
183649990 bytes transferred in 5.311821 secs (34573829 bytes/sec)
183649990 bytes transferred in 5.537337 secs (33165760 bytes/sec)

So, nfs is roughly twice as fast as newnfs, indeed.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

A language that doesn't have everything is actually easier
to program in than some that do.
-- Dennis M. Ritchie


Re: Why is NFSv4 so slow?

2010-09-13 Thread Rick Macklem
 
 Ok ...
 
 NFS server:
 - FreeBSD 8.1-PRERELEASE-20100620 i386
 - Intel Atom 330 (1.6 GHz dual-core with HT -- 4-way SMP)
 - 4 GB RAM
 - re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet
 
 NFS client:
 - FreeBSD 8.1-STABLE-20100908 i386
 - AMD Phenom II X6 1055T (2.8 GHz + Turbo Core, six-core)
 - 4 GB RAM
 - re0: RealTek 8168/8111 B/C/CP/D/DP/E PCIe Gigabit Ethernet
 
 The machines are connected through a Netgear GS108T
 gigabit ethernet switch.
 
 I unmounted and re-mounted the NFS path after every single
 dd(1) command, so the data actually comes from the server
 instead of from the local cache. I also made sure that
 the file was in the cache on the server, so the server's
 disk speed is irrelevant.
 
 Testing with mount -t nfs:
 
 183649990 bytes transferred in 2.596677 secs (70725002 bytes/sec)
 183649990 bytes transferred in 2.578746 secs (71216779 bytes/sec)
 183649990 bytes transferred in 2.561857 secs (71686277 bytes/sec)
 183649990 bytes transferred in 2.629028 secs (69854708 bytes/sec)
 183649990 bytes transferred in 2.535422 secs (72433702 bytes/sec)
 
 Testing with mount -t newnfs:
 
 183649990 bytes transferred in 5.361544 secs (34253192 bytes/sec)
 183649990 bytes transferred in 5.401471 secs (3396 bytes/sec)
 183649990 bytes transferred in 5.052138 secs (36350946 bytes/sec)
 183649990 bytes transferred in 5.311821 secs (34573829 bytes/sec)
 183649990 bytes transferred in 5.537337 secs (33165760 bytes/sec)
 
 So, nfs is roughly twice as fast as newnfs, indeed.
 
 Best regards
 Oliver
 
Thanks for doing the test. I think I can find out what causes the
factor of 2 someday. What is really weird is that some people see
several orders of magnitude slower (a few Mbytes/sec).

Your case was also useful, because you are using the same net
interface/driver as the original report of a few Mbytes/sec, so it
doesn't appear to be an re problem.

Have a good week, rick



Re: Why is NFSv4 so slow?

2010-09-13 Thread Rick C. Petty
On Mon, Sep 13, 2010 at 11:15:34AM -0400, Rick Macklem wrote:
  
  instead of from the local cache. I also made sure that
  the file was in the cache on the server, so the server's
  disk speed is irrelevant.
  

snip

  So, nfs is roughly twice as fast as newnfs, indeed.

Hmm, I have the same network switch as Oliver, and I wasn't caching the
file on the server before.  When I cache the file on the server, I get
about 1 MiB/s faster throughput, so that doesn't seem to make the
difference to me (but with higher throughputs, I would imagine it would).

 Thanks for doing the test. I think I can find out what causes the
 factor of 2 someday. What is really weird is that some people see
 several orders of magnitude slower (a few Mbytes/sec).
 
 Your case was also useful, because you are using the same net
 interface/driver as the original report of a few Mbytes/sec, so it
 doesn't appear to be an re problem.

I believe I said something to that effect.  :-P

The problem I have is that the magnitude of throughput varies randomly.
Sometimes I can repeat the test and see 3-4 MB/s.  Then my server's
motherboard failed last week so I swapped things around and now I have 9-10
MB/s on the same client (but using 100Mbit interface instead of gigabit, so
those speeds make sense).

One thing I noticed is the lag seems to have disappeared after the reboots.
Another thing I had to change was that I was using an NFSv3 mount for /home
(with the v3 client, not the experimental v3/v4 client) and now I'm using
NFSv4 mounts exclusively.  Too much hardware changed because of that board
failing (AHCI was randomly dropping disks, and it got to the point that it
wouldn't pick up drives after a cold start and then the board failed to
POST 11 of 12 times), so I haven't been able to reliably reproduce any
problems.  I also had to reboot the bad client because of the broken
NFSv3 mountpoints, and the server was auto-upgraded to a newer 8.1-stable
(I run "make buildworld kernel" regularly, so any reboots will
automatically have a newer kernel).

There's definite evidence that the newnfs mounts are slower than plain nfs,
and sometimes orders of magnitude slower (as others have shown).  But the
old nfs is so broken in other ways that I'd prefer slower yet more stable.
Thanks again for all your help, Rick!

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-09-13 Thread Goran Lowkrantz
--On September 12, 2010 11:44:40 -0400 Rick Macklem rmack...@uoguelph.ca 
wrote:

On Wed, Sep 01, 2010 at 11:46:30AM -0400, Rick Macklem wrote:

snip

My results seem to confirm a factor of two (or 1.5), but it's stable:
new nfs nfsv4
369792 bytes transferred in 71.932692 secs (55607119 bytes/sec)
369792 bytes transferred in 66.806218 secs (59874214 bytes/sec)
369792 bytes transferred in 65.127972 secs (61417079 bytes/sec)
369792 bytes transferred in 64.493585 secs (62021204 bytes/sec)

old nfs nfsv3
369792 bytes transferred in 42.290365 secs (94583478 bytes/sec)
369792 bytes transferred in 42.135682 secs (94930700 bytes/sec)
369792 bytes transferred in 41.404841 secs (96606332 bytes/sec)
369792 bytes transferred in 41.461210 secs (96474989 bytes/sec)

new nfs nfsv3
369792 bytes transferred in 63.172592 secs (63318121 bytes/sec)
369792 bytes transferred in 64.149324 secs (62354044 bytes/sec)
369792 bytes transferred in 62.447537 secs (64053284 bytes/sec)
369792 bytes transferred in 57.203868 secs (69924813 bytes/sec)

Client:
FreeBSD 8.1-STABLE #200: Sun Sep 12 12:03:25 CEST 2010
   r...@skade.glz.hidden-powers.com:/usr/obj/usr/src/sys/GENERIC amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 5000+ (2600.26-MHz K8-class 
CPU)
  Origin = "AuthenticAMD"  Id = 0x60fb2  Family = f  Model = 6b  Stepping = 2


em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
	ether 00:1b:21:2e:7d:3c
	inet 10.255.253.3 netmask 0xffffff00 broadcast 10.255.253.255
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active

Server:
FreeBSD 8.1-STABLE #74: Sun Sep  5 18:47:12 CEST 2010
   r...@midgard.glz.hidden-powers.com:/usr/obj/usr/src/sys/SERVER amd64
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Phenom(tm) 9550 Quad-Core Processor (2210.08-MHz K8-class CPU)
  Origin = "AuthenticAMD"  Id = 0x100f23  Family = 10  Model = 2  Stepping = 3


re0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=3898<VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
	ether 00:1f:d0:59:d8:e2
	inet 10.255.253.1 netmask 0xffffff00 broadcast 10.255.253.255
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active

Network:
Systems connected via two Netgear GS108T, one system to each switch, the 
switches connected via TP cable.


Patches:
stable-8-v15.patch
zfs_metaslab_v2.patch
zfs_abe_stat_rrwlock.patch
arc.c.9.patch
r211970.patch

Cheers,
Göran

---
There is hopeful symbolism in the fact that flags do not wave in a vacuum.
   -- Arthur C. Clarke


Re: Why is NFSv4 so slow?

2010-09-12 Thread Rick Macklem
 On Wed, Sep 01, 2010 at 11:46:30AM -0400, Rick Macklem wrote:
  
   I am experiencing similar issues with newnfs:
  
   1) I have two clients that each get around 0.5MiB/s to 2.6MiB/s
   reading
   from the NFS4-share on Gbit-Lan
  
   2) Mounting with -t newnfs -o nfsv3 results in no performance gain
   whatsoever.
  
   3) Mounting with -t nfs results in 58MiB/s ! (Netcat has similar
   performance) → not a hardware/driver issue from my pov
 
  Ok, so it does sound like an issue in the experimental client and
  not NFSv4. For the most part, the read code is the same as
  the regular client, but it hasn't been brought up-to-date
  with recent changes.
 
 Do you (or will you soon) have some patches I/we could test? I'm
 willing to try anything to avoid mounting ten or so subdirectories in
 each of my mount points.
 
  One thing you could try is building a kernel without SMP enabled
  and see if that helps? (I only have single core hardware, so I won't
  see any SMP races.) If that helps, I can compare the regular vs
  experimental client for smp locking in the read stuff.
 
 I can try disabling SMP too. Should that really matter, if you're not
 even pegging one CPU? The locks shouldn't have *that* much overhead...
 
 -- Rick C. Petty

Just fyi, I asked folks to run read tests on the clients (over on
freebsd-fs). So far, I've only gotten one response, but they didn't
see the problem you are (they did see a factor of 2 slower, but it
is still 50Mbytes/sec). Maybe you can take a look at their email
and compare his client with yours? His message is:
   http://docs.FreeBSD.org/cgi/mid.cgi?01NRSE7GZJEC0022AD

Btw, if anyone who didn't see the posting on freebsd-fs and would
like to run a quick test, it would be appreciated.
Basically do both kinds of mount using a FreeBSD 8.1 or later client
and then read a file larger than 100 Mbytes with dd.

# mount -t nfs -o nfsv3 server:/path /mnt-path
- cd anywhere in the mount that has a >100Mbyte file
# dd if=file of=/dev/null bs=1m
# umount /mnt-path

Then repeat with
# mount -t newnfs -o nfsv3 server:/path /mnt-path

and post the results along with the client machine's info
(machine arch/# of cores/memory/net interface used for NFS traffic).

Thanks in advance to anyone who runs the test, rick
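For anyone who wants to script the comparison, something like the
following should do (an untested sketch; server:/path, /mnt-path and
the file name "bigfile" are placeholders):

#!/bin/sh
# run the same dd read through both clients, unmounting in between
# so the data really comes across the wire each time
for T in nfs newnfs; do
	echo "== mount -t $T =="
	mount -t $T -o nfsv3 server:/path /mnt-path
	dd if=/mnt-path/bigfile of=/dev/null bs=1m
	umount /mnt-path
done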



Re: Why is NFSv4 so slow?

2010-09-04 Thread Rick Macklem
 
 Do you (or will you soon) have some patches I/we could test? I'm
 willing to try anything to avoid mounting ten or so subdirectories in
 each of my mount points.
 

Attached is a small patch for the only difference I can spot in the read
code between the regular and experimental NFS client.

I have asked jhb@ to try and do some testing, to see if he can reproduce
it. If he does reproduce it, maybe he can figure out what is going on.
(I don't think I'll have any further patches to try, unless he spots
something.)

  One thing you could try is building a kernel without SMP enabled
  and see if that helps? (I only have single core hardware, so I won't
  see any SMP races.) If that helps, I can compare the regular vs
  experimental client for smp locking in the read stuff.
 
 I can try disabling SMP too. Should that really matter, if you're not
 even pegging one CPU? The locks shouldn't have *that* much overhead...
 
 -- Rick C. Petty

If running UP (non-SMP) fixes the problem, it is most likely a missing lock that
allows a race to put things in a weird state. But for these things, it
is often something I'd never expect that turns out to be the culprit.

rick

--- fs/nfsclient/nfs_clbio.c.sav	2010-09-04 10:46:06.000000000 -0400
+++ fs/nfsclient/nfs_clbio.c	2010-09-04 10:47:06.000000000 -0400
@@ -510,10 +510,7 @@
 			rabp = nfs_getcacheblk(vp, rabn, biosize, td);
 			if (!rabp) {
 				error = newnfs_sigintr(nmp, td);
-				if (error)
-					return (error);
-				else
-					break;
+				return (error ? error : EINTR);
 			}
 			if ((rabp->b_flags & (B_CACHE|B_DELWRI)) == 0) {
 				rabp->b_flags |= B_ASYNC;
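If anyone wants to try it: saved as, say, nfs_clbio.patch (the name is
arbitrary), the diff should apply from the top of the kernel source
tree and go in with a normal kernel rebuild, e.g.:

cd /usr/src/sys
patch < ~/nfs_clbio.patch
cd /usr/src && make buildkernel && make installkernel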

Re: Why is NFSv4 so slow?

2010-09-04 Thread Rick Macklem


----- Original Message -----
 On Wed, Sep 01, 2010 at 11:46:30AM -0400, Rick Macklem wrote:
  
   I am experiencing similar issues with newnfs:
  
   1) I have two clients that each get around 0.5MiB/s to 2.6MiB/s
   reading
   from the NFS4-share on Gbit-Lan
  
   2) Mounting with -t newnfs -o nfsv3 results in no performance gain
   whatsoever.
  
   3) Mounting with -t nfs results in 58MiB/s ! (Netcat has similar
   performance) → not a hardware/driver issue from my pov
 
  Ok, so it does sound like an issue in the experimental client and
  not NFSv4. For the most part, the read code is the same as
  the regular client, but it hasn't been brought up-to-date
  with recent changes.
 
 Do you (or will you soon) have some patches I/we could test? I'm
 willing to try anything to avoid mounting ten or so subdirectories in
 each of my mount points.
 
One other thing you could do is run this in a loop while you have a
slow read running. The client threads must be blocked somewhere a
lot if the read rate is so slow. (Then take a look at xxx and please
email it to me too.)

ps axHl > xxx
sleep 1
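Wrapped up as a loop, with >> so successive samples accumulate in xxx
instead of overwriting it (a minimal sketch of what Rick describes):

while :; do
	ps axHl >> xxx
	sleep 1
done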

rick



Re: Why is NFSv4 so slow?

2010-09-03 Thread Rick C. Petty
On Mon, Aug 30, 2010 at 09:59:38PM -0400, Rick Macklem wrote:
 
 I don't tune anything with sysctl, I just use what I get from an
 install from CD onto i386 hardware. (I don't even bother to increase
 kern.ipc.maxsockbuf although I suggest that in the mount message.)

Sure.  But maybe you don't have server mount points with 34k+ files in
them?  I notice when I increase maxsockbuf, the problem of disappearing
files goes away, mostly.  Often a find /mnt fixes the problem
temporarily, until I unmount and mount again.

 The only thing I can suggest is trying:
 # mount -t newnfs -o nfsv3 server:/path /mnt
 and seeing if that performs like the regular NFSv3 or has
 the perf. issue you see for NFSv4?

Yes, that has the same exact problem.  However, if I use:
mount -t nfs server:/path /mnt
The problem does indeed go away!  But it means I have to mount all the
subdirectories independently, which I'm trying to avoid and is the
reason I went to NFSv4.

 If this does have the perf. issue, then the exp. client
 is most likely the cause and may get better in a few months
 when I bring it up-to-date.

Then that settles it-- the newnfs client seems to be the problem.  Just
to recap...  These two are *terribly* slow (e.g. a VBR mp3 avg 192kbps
cannot be played without skips):
mount -t newnfs -o nfsv4 server:/path /mnt
mount -t newnfs -o nfsv3 server:/path /mnt
But this one works just fine (H.264 1080p video does not skip):
mount -t nfs server:/path /mnt

I guess I will have to wait for you to bring the v4 client up to date.
Thanks again for all of your contributions and for porting NFSv4 to
FreeBSD!

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-09-03 Thread Rick C. Petty
On Wed, Sep 01, 2010 at 11:46:30AM -0400, Rick Macklem wrote:
  
  I am experiencing similar issues with newnfs:
  
  1) I have two clients that each get around 0.5MiB/s to 2.6MiB/s
  reading
  from the NFS4-share on Gbit-Lan
  
  2) Mounting with -t newnfs -o nfsv3 results in no performance gain
  whatsoever.
  
  3) Mounting with -t nfs results in 58MiB/s ! (Netcat has similar
  performance) → not a hardware/driver issue from my pov
 
 Ok, so it does sound like an issue in the experimental client and
 not NFSv4. For the most part, the read code is the same as
 the regular client, but it hasn't been brought up-to-date
 with recent changes.

Do you (or will you soon) have some patches I/we could test?  I'm
willing to try anything to avoid mounting ten or so subdirectories in
each of my mount points.

 One thing you could try is building a kernel without SMP enabled
 and see if that helps? (I only have single core hardware, so I won't
 see any SMP races.) If that helps, I can compare the regular vs
 experimental client for smp locking in the read stuff.

I can try disabling SMP too.  Should that really matter, if you're not
even pegging one CPU?  The locks shouldn't have *that* much overhead...

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-09-01 Thread Rick Macklem
 Hi everyone,
 
 I am experiencing similar issues with newnfs:
 
 1) I have two clients that each get around 0.5MiB/s to 2.6MiB/s
 reading
 from the NFS4-share on Gbit-Lan
 
 2) Mounting with -t newnfs -o nfsv3 results in no performance gain
 whatsoever.
 
 3) Mounting with -t nfs results in 58MiB/s ! (Netcat has similar
 performance) → not a hardware/driver issue from my pov
 

The experimental client does reads in larger MAXBSIZE chunks,
which did cause a similar problem in Mac OS X until
rsize=32768,wsize=32768 was specified. Rick already tried that,
but you might want to try it for your case.

 Is there anything I can do to help fix this?
 
Ok, so it does sound like an issue in the experimental client and
not NFSv4. For the most part, the read code is the same as
the regular client, but it hasn't been brought up-to-date
with recent changes.

One thing you could try is building a kernel without SMP enabled
and see if that helps? (I only have single core hardware, so I won't
see any SMP races.) If that helps, I can compare the regular vs
experimental client for smp locking in the read stuff.
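One way to build such a test kernel from a stock 8.x source tree (a
sketch; the config name NOSMP is arbitrary):

cd /usr/src/sys/`uname -m`/conf
sed '/^options[[:space:]]*SMP/d' GENERIC > NOSMP
cd /usr/src
make buildkernel installkernel KERNCONF=NOSMP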

rick



Re: Why is NFSv4 so slow?

2010-08-30 Thread Rick C. Petty
On Sun, Aug 29, 2010 at 11:44:06AM -0400, Rick Macklem wrote:
  Hi. I'm still having problems with NFSv4 being very laggy on one
  client.
  When the NFSv4 server is at 50% idle CPU and the disks are < 1% busy,
  I am
  getting horrible throughput on an idle client. Using dd(1) with 1 MB
  block
  size, when I try to read a > 100 MB file from the client, I'm getting
  around 300-500 KiB/s. On another client, I see upwards of 20 MiB/s
  with
  the same test (on a different file). On the broken client:
 
 Since other client(s) are working well, that seems to suggest that it
 is a network related problem and not a bug in the NFS code.

Well I wouldn't say "well".  Every client I've set up has had this issue,
and somehow through tweaking various settings and restarting nfs a bunch of
times, I've been able to make it tolerable for most clients.  Only one
client is behaving well, and that happens to be the only machine I haven't
rebooted since I enabled NFSv4.  Other clients are seeing 2-3 MiB/s on my
dd(1) test.

I should point out that caching is an issue.  The second time I run a dd on
the same input file, I get upwards of 20-35 MiB/s on the bad client.  But
I can invalidate the cache by unmounting and remounting the file system
so it looks like client-side caching.

I'm not sure how you can say it's network-related and not NFS.  Things
worked just fine with NFSv3 (in fact NFSv3 client using the same NFSv4
server doesn't have this problem).  Using rsync over ssh I get around 15-20
MiB/s throughput, and dd piped through ssh gets almost 40 MiB/s (neither
one is using compression)!
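For reference, a dd-piped-through-ssh test of that sort looks like
this (the path is a placeholder), and it bypasses NFS entirely:

ssh server 'dd if=/path/to/file bs=1m' | dd of=/dev/null bs=1m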

 First off, the obvious question: How does this client differ from the
 one that performs much better?

Different hardware (CPU, board, memory).  I'm also hoping it was some
sysctl tweak I did, but I can't seem to determine what it was.

 Do they both use the same re network interface for the NFS traffic?
 (If the answer is no, I'd be suspicious that the re hardware or
 device driver is the culprit.)

That's the same thing you and others said about the *other* NFSv4 clients
I set up.  How is v4 that much different than v3 in terms of network
traffic?  The other clients are all using re0 and exactly the same
ifconfig options and flags, including the client that's behaving fine.

 Things that I might try in an effort to isolate the problem:
 - switch the NFS traffic to use the nfe0 net interface.

I'll consider it.  I'm not convinced it's a NIC problem yet.

 - put a net interface identical to the one on the client that
   works well in the machine and use that for the NFS traffic.

It's already close enough.  Bad client:

re0@pci0:1:7:0:	class=0x020000 card=0x816910ec chip=0x816910ec rev=0x10 hdr=0x00
    vendor     = 'Realtek Semiconductor'
    device     = 'Single Gigabit LOM Ethernet Controller (RTL8110)'
    class      = network
    subclass   = ethernet

Good client:

re0@pci0:1:0:0:	class=0x020000 card=0x84321043 chip=0x816810ec rev=0x06 hdr=0x00
    vendor     = 'Realtek Semiconductor'
    device     = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
    class      = network
    subclass   = ethernet

Mediocre client:

re0@pci0:1:0:0:	class=0x020000 card=0x84321043 chip=0x816810ec rev=0x06 hdr=0x00
    vendor     = 'Realtek Semiconductor'
    device     = 'Gigabit Ethernet NIC(NDIS 6.0) (RTL8168/8111/8111c)'
    class      = network
    subclass   = ethernet

The mediocre and good clients have exactly identical hardware, and often
I'll witness the slow client behavior on the mediocre client, and rarely
on the good client although in previous emails to you, it was the good
client which was behaving the worst of all.

Other differences:
good client = 8.1 GENERIC r210227M amd64 12GB RAM Athlon II X2 255
med. client = 8.1 GENERIC r209555M i386 4GB RAM Athlon II X2 255
bad client = 8.1 GENERIC r211534M i386 2GB RAM Athlon 64 X2 5200+

 - turn off TXCSUM and RXCSUM on re0

Tried that, didn't help although it seemed to slow things down a little.

 - reduce the read/write data size, using rsize=N,wsize=N on the
   mount. (It will default to MAXBSIZE and some net interfaces don't
   handle large bursts of received data well. If you drop it to
   rsize=8192,wsize=8192 and things improve, then increase N until it
   screws up.)

8k didn't improve things at all.

 - check the port configuration on the switch end, to make sure it
   is also 1000Mbps full-duplex.

It is, and has been.

 - move the client to a different net port on the switch or even a
   different switch (and change the cable, while you're at it).

I've tried that too.  The switches are great and my cables are fine.
Like I said, NFSv3 on the same moint point works just fine (dd does
around 30-35 MiB/s).

 - Look at netstat -s and see if there are a lot of retransmits
   going on in TCP.

2 of 40k TCP packets retransmitted, 7k of 40k duplicate acks received.
I don't see anything else in netstat -s with numbers larger than 10.

 If none of the above seems to help, you could look at a packet trace
 and see what is going on.

Re: Why is NFSv4 so slow?

2010-08-30 Thread Rick Macklem
 
 Well I wouldn't say "well". Every client I've set up has had this
 issue,
 and somehow through tweaking various settings and restarting nfs a
 bunch of
 times, I've been able to make it tolerable for most clients. Only one
 client is behaving well, and that happens to be the only machine I
 haven't
 rebooted since I enabled NFSv4. Other clients are seeing 2-3 MiB/s on
 my
 dd(1) test.
 
All I can tell you is that, for my old hardware (100Mbps networking)
I see 10Mbytes/sec (all you can hope for) using the regular NFSv3
client. I see about 10% slower for NFSv3 and NFSv4 using the experimental
client (NFSv3 and NFSv4 about identical). The 10% doesn't surprise me,
since the experimental client is based on a FreeBSD6 client and,
although I plan on carrying all the newer client changes over to
it, I haven't gotten around to doing that. If it is still 10% slower
after the changes are carried over, I will be looking at why.

I don't tune anything with sysctl, I just use what I get from an
install from CD onto i386 hardware. (I don't even bother to increase
kern.ipc.maxsockbuf although I suggest that in the mount message.)

I also do not specify any mount options other than the protocol
version. My mount commands look like:
# mount -t nfs -o nfsv3 server:/path /mnt
# mount -t newnfs -o nfsv3 server:/path /mnt
# mount -t nfs -o nfsv4 server:/path /mnt

So, I don't see dramatically slower NFSv4 and expect to get the 10%
perf. reduction fixed when I bring the exp. client in line with
the current one, but can't be sure.

So, I have no idea what you are seeing. It might be an issue
that will be fixed when I bring the exp. client up to date,
but I have no idea if that's the case? (It will be a few
months before the client update happens.)

The only thing I can suggest is trying:
# mount -t newnfs -o nfsv3 server:/path /mnt
and seeing if that performs like the regular NFSv3 or has
the perf. issue you see for NFSv4?

If this does have the perf. issue, then the exp. client
is most likely the cause and may get better in a few months
when I bring it up-to-date.

rick



Re: Why is NFSv4 so slow?

2010-08-30 Thread Rick Macklem
 On Sun, Aug 29, 2010 at 11:44:06AM -0400, Rick Macklem wrote:
   Hi. I'm still having problems with NFSv4 being very laggy on one
   client.
   When the NFSv4 server is at 50% idle CPU and the disks are < 1%
   busy,
   I am
   getting horrible throughput on an idle client. Using dd(1) with 1
   MB
   block
   size, when I try to read a > 100 MB file from the client, I'm
   getting
   around 300-500 KiB/s. On another client, I see upwards of 20 MiB/s
   with
   the same test (on a different file). On the broken client:
 
  Since other client(s) are working well, that seems to suggest that
  it
  is a network related problem and not a bug in the NFS code.
 

Oh, one more thing... Make sure that the user and group name/number
space is consistent across all machines and nfsuserd is working on
them all. (Look at "ls -lg" on the clients and see that the
correct user/group names are showing up.) If this mapping isn't
working correctly, it will do an upcall to the userland nfsuserd for
every RPC, and that would make NFSv4 run very slowly. It will also
use the domain part (after the first '.') of each machine's hostname,
so make sure that all the hostnames (all clients and server) share
the same domain, ie: server.cis.uoguelph.ca, client1.cis.uoguelph.ca, ...
are all in .cis.uoguelph.ca.

If that is the problem
# mount -t newnfs -o nfsv3 server:/path /mnt
will work fine, since NFSv3 doesn't use the mapping daemon.
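A few quick sanity checks for this, run on each client and on the
server (a sketch; /mnt is a placeholder for an NFSv4 mount):

hostname                     # domain part should match everywhere
ps ax | grep nfsuserd        # the mapping daemons should be running
ls -ln /mnt | head           # numeric ids...
ls -lg /mnt | head           # ...should resolve to the right names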

 
 Is that something easily scriptable with tcpdump? I'd rather not look
 for such things manually.
 
I've always done this manually and, although tcpdump can be used
to do the packet capture, wireshark actually understands NFS packets
and, as such, is much better for looking at the packets.
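The capture itself can still be scripted with tcpdump and the file
opened in wireshark afterwards, e.g. (interface name and server host
are placeholders):

tcpdump -i re0 -s 0 -w nfs.pcap host server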

rick




Re: Why is NFSv4 so slow?

2010-08-29 Thread Rick Macklem
 Hi. I'm still having problems with NFSv4 being very laggy on one
 client.
 When the NFSv4 server is at 50% idle CPU and the disks are < 1% busy,
 I am
 getting horrible throughput on an idle client. Using dd(1) with 1 MB
 block
 size, when I try to read a > 100 MB file from the client, I'm getting
 around 300-500 KiB/s. On another client, I see upwards of 20 MiB/s
 with
 the same test (on a different file). On the broken client:
 

Since other client(s) are working well, that seems to suggest that it
is a network related problem and not a bug in the NFS code.

First off, the obvious question: How does this client differ from the
one that performs much better?
Do they both use the same re network interface for the NFS traffic?
(If the answer is no, I'd be suspicious that the re hardware or
device driver is the culprit.)

Things that I might try in an effort to isolate the problem:
- switch the NFS traffic to use the nfe0 net interface.
- put a net interface identical to the one on the client that
  works well in the machine and use that for the NFS traffic.
- turn off TXCSUM and RXCSUM on re0
- reduce the read/write data size, using rsize=N,wsize=N on the
  mount. (It will default to MAXBSIZE and some net interfaces don't
  handle large bursts of received data well. If you drop it to
  rsize=8192,wsize=8192 and things improve, then increase N until it
  screws up. See the example mount after this list.)
- check the port configuration on the switch end, to make sure it
  is also 1000Mbps full-duplex.
- move the client to a different net port on the switch or even a
  different switch (and change the cable, while you're at it).
- Look at netstat -s and see if there are a lot of retransmits
  going on in TCP.
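For the rsize/wsize item above, the mount line would look something
like this (server path is a placeholder; walk N up from 8192):

mount -t newnfs -o nfsv4,rsize=8192,wsize=8192 server:/path /mnt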

If none of the above seems to help, you could look at a packet trace
and see what is going on. Look for TCP reconnects (SYN, SYN-ACK...)
or places where there is a large time delay/retransmit of a TCP
segment.

Hopefully others who are more familiar with the networking side
can suggest other things to try, rick



Re: Why is NFSv4 so slow?

2010-08-28 Thread Rick C. Petty
Hi.  I'm still having problems with NFSv4 being very laggy on one client.
When the NFSv4 server is at 50% idle CPU and the disks are < 1% busy, I am
getting horrible throughput on an idle client.  Using dd(1) with 1 MB block
size, when I try to read a > 100 MB file from the client, I'm getting
around 300-500 KiB/s.  On another client, I see upwards of 20 MiB/s with
the same test (on a different file).  On the broken client:

# uname -mv
FreeBSD 8.1-STABLE #5 r211534M: Sat Aug 28 15:53:10 CDT 2010 
u...@example.com:/usr/obj/usr/src/sys/GENERIC  i386

# ifconfig re0
re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
	ether 00:e0:4c:xx:yy:zz
	inet xx.yy.zz.3 netmask 0xffffff00 broadcast xx.yy.zz.255
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active

# netstat -m
267/768/1035 mbufs in use (current/cache/total)
263/389/652/25600 mbuf clusters in use (current/cache/total/max)
263/377 mbuf+clusters out of packet secondary zone in use (current/cache)
0/20/20/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
592K/1050K/1642K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/5/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

# netstat -idn
Name    Mtu Network       Address             Ipkts Ierrs Idrop   Opkts Oerrs  Coll Drop
re0    1500 Link#1        00:e0:4c:xx:yy:zz  232135     0     0   68984     0     0    0
re0    1500 xx.yy.zz.0/24 xx.yy.zz.3         232127     -     -   68979     -     -    -
nfe0*  1500 Link#2        00:22:15:xx:yy:zz       0     0     0       0     0     0    0
plip0  1500 Link#3                                0     0     0       0     0     0    0
lo0   16384 Link#4                               42     0     0      42     0     0    0
lo0   16384 fe80:4::1/64  fe80:4::1               0     -     -       0     -     -    -
lo0   16384 ::1/128       ::1                     0     -     -       0     -     -    -
lo0   16384 127.0.0.0/8   127.0.0.1              42     -     -      42     -     -    -

# sysctl kern.ipc.maxsockbuf
kern.ipc.maxsockbuf: 1048576
# sysctl net.inet.tcp.sendbuf_max
net.inet.tcp.sendbuf_max: 16777216
# sysctl net.inet.tcp.recvbuf_max
net.inet.tcp.recvbuf_max: 16777216
# sysctl net.inet.tcp.sendspace
net.inet.tcp.sendspace: 65536
# sysctl net.inet.tcp.recvspace
net.inet.tcp.recvspace: 131072

# sysctl hw.pci | grep msi
hw.pci.honor_msi_blacklist: 1
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1

# vmstat -i
interrupt                          total       rate
irq14: ata0                           47          0
irq16: re0                        219278        191
irq21: ohci0+                       5939          5
irq22: vgapci0+                    77990         67
cpu0: timer                      2294451       1998
irq256: hdac0                      44069         38
cpu1: timer                      2293983       1998
Total                            4935757       4299

Any ideas?

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-07-05 Thread Rick Macklem



On Sun, 27 Jun 2010, Rick C. Petty wrote:


First off, many thanks to Rick Macklem for making NFSv4 possible in
FreeBSD!

I recently updated my NFS server and clients to v4, but have since noticed
significant performance penalties.  For instance, when I try ls a b c (if
a, b, and c are empty directories) on the client, it takes up to 1.87
seconds (wall time) whereas before it always finished in under 0.1 seconds.
If I repeat the test, it takes the same amount of time in v4 (in v3, wall
time was always under 0.01 seconds for subsequent requests, as if the
directory listing was cached).

If I try to play an h264 video file on the filesystem using mplayer, it
often jitters and skipping around in time introduces up to a second or so
pause.  With NFSv3 it behaved more like the file was on local disk (no
noticable pauses or jitters).


I just came across a case where things get really slow during testing
of some experimental caching stuff. (It was caused by the experimental
stuff not in head, but...)

It turns out that if numvnodes > desiredvnodes, it sleeps for 1sec before
allocating a new vnode. This might explain your approx. 1sec delays.

When this happens, ps axlH will probably show a process sleeping on
vlruwk and desiredvnodes can be increased by setting a larger value
for kern.maxvnodes. (numvnodes can be seen as vfs.numvnodes)
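For example (the second value is only illustrative; put the setting in
/etc/sysctl.conf to make it persistent):

sysctl vfs.numvnodes kern.maxvnodes   # current count vs. the limit
sysctl kern.maxvnodes=200000          # raise the limit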

I don't think there is actually a vnode leak, but you might find that
the experimental nfs subsystem is a vnode hog.

rick




Re: Why is NFSv4 so slow?

2010-07-01 Thread Rick Macklem



On Mon, 28 Jun 2010, Rick C. Petty wrote:


On Mon, Jun 28, 2010 at 12:35:14AM -0400, Rick Macklem wrote:


Being stuck in newnfsreq means that it is trying to establish a TCP
connection with the server (again smells like some networking issue).
snip
Disabling delegations is the next step. (They aren't
required for correct behaviour and are disabled by default because
they are the greenest part of the implementation.)


After disabling delegations, I was able to build world and kernel on two
different clients, and my port build problems went away as well.


I was able to reproduce a problem when delegations are enabled and the
rdirplus option was used on a mount. Since I haven't done non-trivial
testing with rdirplus set, but have done quite a bit with delegations
enabled for mounts without rdirplus, I suspect the problem is related
to using rdirplus on NFSv4 mounts.

So, I'd recommend against using rdirplus on NFSv4 mounts until the
problem gets resolved.

You could try re-enabling delegations and then try mounts without 
rdirplus and see if the problems during builds still show up?


Thanks for your help with testing, rick



Re: Why is NFSv4 so slow? (root/toor)

2010-06-30 Thread Rick Macklem



On Wed, 30 Jun 2010, Ian Smith wrote:



I wondered whether this might be a Linux thing.  On my 7.2 system,

% find /usr/src -name *.[ch] -exec grep -Hw getpwuid {} \; > file

returns 195 lines, many in the form getpwuid(getuid()), in many base and
contrib components - including id(1), bind, sendmail etc - that could be
nondeterministic if getpwuid(0) ever returned other than root.

Just one mention of 'toor' in /usr/src/usr.sbin/makefs/compat/pwcache.c

Not claiming to know how the lookups in /usr/src/lib/libc/gen/getpwent.c
work under the hood, but this does seem likely a non-issue on FreeBSD.


I remember it causing some confusion while testing, but I can't remember
when or where. It might have been Linux or I might have been logged in as
toor or 

I think I will hardcode the root case in nfsuserd, just to be safe.
(I also might have edited /etc/passwd and reordered entries without
paying attention to it.)

rick



Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Rick Macklem



On Tue, 29 Jun 2010, Ian Smith wrote:



Not wanting to hijack this (interesting) thread, but ..

I have to concur with Rick P - that's rather an odd requirement when each
FreeBSD install since at least 2.2 has come with root and toor (in that
order) in /etc/passwd.  I don't use toor, but often enough read about
folks who do, and don't recall it ever being an issue with NFSv3.  Are
you sure this is a problem that cannot be coded around in NFSv4?


Currently when the nfsuserd needs to translate a uid (such as 0) into a
name (NFSv4 uses names instead of the numbers used by NFSv3), it calls
getpwuid() and uses whatever name is returned. If there is more than
one name for the uid (such as the above case for 0), then you get one
of them, and that causes confusion.
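An easy way to see both sides of this on a given box (a sketch;
getent(1) is in the base system on 7.x and later):

awk -F: '$3 == 0 { print $1 }' /etc/passwd    # every name sharing uid 0
getent passwd 0 | cut -d: -f1                 # the one getpwuid(0) returns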

I suppose if the FreeBSD world feels that root and toor must both
exist in the password database, then nfsuserd could be hacked to handle
the case of translating uid 0 to root without calling getpwuid(). It
seems ugly, but if deleting toor from the password database upsets
people, I can do that.

rick



Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Adam Vande More
On Tue, Jun 29, 2010 at 9:58 AM, Rick Macklem rmack...@uoguelph.ca wrote:

 I suppose if the FreeBSD world feels that root and toor must both
 exist in the password database, then nfsuserd could be hacked to handle
 the case of translating uid 0 to root without calling getpwuid(). It
 seems ugly, but if deleting toor from the password database upsets
 people, I can do that.


I agree with Ian on this.  I don't use toor either, but have seen people use
it, and sometimes it will get recommended here for various reasons e.g.
running a root account with a different default shell.  It wouldn't bother
me having to do this provided it was documented, but having to do so would
be a POLA violation to many users I think.

-- 
Adam Vande More


Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Rick C. Petty
On Tue, Jun 29, 2010 at 10:20:57AM -0500, Adam Vande More wrote:
 On Tue, Jun 29, 2010 at 9:58 AM, Rick Macklem rmack...@uoguelph.ca wrote:
 
  I suppose if the FreeBSD world feels that root and toor must both
  exist in the password database, then nfsuserd could be hacked to handle
  the case of translating uid 0 to root without calling getpwuid(). It
  seems ugly, but if deleting toor from the password database upsets
  people, I can do that.
 
 I agree with Ian on this.  I don't use toor either, but have seen people use
 it, and sometimes it will get recommended here for various reasons e.g.
 running a root account with a different default shell.  It wouldn't bother
 me having to do this provided it was documented, but having to do so would
 be a POLA violation to many users I think.

To be fair, I'm not sure this is even a problem.  Rick M. only suggested it
as a possibility.  I would think that getpwuid() would return the first
match which has always been root.  At least that's what it does when
scanning the passwd file; I'm not sure about NIS.  If someone can prove
that this will cause a problem with NFSv4, we could consider hacking it.
Otherwise I don't think we should change this behavior yet.

-- Rick C. Petty


Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Rick Macklem



On Tue, 29 Jun 2010, Rick C. Petty wrote:



To be fair, I'm not sure this is even a problem.  Rick M. only suggested it
as a possibility.  I would think that getpwuid() would return the first
match which has always been root.  At least that's what it does when
scanning the passwd file; I'm not sure about NIS.  If someone can prove
that this will cause a problem with NFSv4, we could consider hacking it.
Otherwise I don't think we should change this behavior yet.


I do know that it causes problems from my testing. I think getpwuid() gets
toor because of the way /etc/passwd gets stored in the database created
from it via vipw.

I have no problem coding it as a special case for nfsuserd and documenting
it. I just won't guarantee how soon it will happen:-)

rick



Re: Why is NFSv4 so slow?

2010-06-29 Thread Rick Macklem



On Mon, 28 Jun 2010, Rick C. Petty wrote:




It would be interesting to see if the performance problem exists for
NFSv3 mounts against the experimental (nfsv4) server.


Hmm, I couldn't reproduce the problem.  Once I unmounted the nfsv4 client
and tried v3, the jittering stopped.  Then I unmounted v3 and tried v4
again, no jitters.  I played with a couple of combinations back and forth
(toggling the presence of nfsv4 in the options) and sometimes I saw
jittering but only with v4, but nothing like what I was seeing before.
Perhaps this is a result of Jeremy's TCP tuning tweaks.

This is also a difficult thing to test, because the server and client have
so much memory, they cache the data blocks.  So if I try my stutter test
on the same video a second time, I only notice stutters if I skip to parts
I haven't skipped to before.  I can comment that it seemed like more of a
latency issue than a throughput issue to me.  But the disks aren't ever
under a high load.  But it's hard to determine accurate load when the
disks are seeking.  Oh, I'm using the AHCI controller mode/driver on those
disks instead of ATA, if that matters.



I basically don't have a clue what might be the source of the problem. I
do agree that it sounds like an intermittent latency issue.

The only thing I can think of that you might try is simply increasing
the number of nfsd threads on the server. They don't add much overhead
and the default of '4' is pretty small. (It's just the -n N option on
nfsd, just in case you weren't aware of it.)
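In rc.conf terms on the server that would be something like the line
below (16 is just an example; the stock default is "-u -t -n 4"):

nfs_server_flags="-u -t -n 16"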


One time when I mounted the v4 again, it broke subdirectories like I was
talking about before.  Essentially it would give me a readout of all the
top-level directories but wouldn't descend into subdirectories which
reflect different mountpoints on the server.  An unmount and a remount
(without changes to /etc/fstab) fixed the problem.  I'm wondering if there
isn't some race condition that seems to affect crossing mountpoints on the
server.  When the situation happens, it affects all mountpoints equally
and persists for the duration of that mount.  And of course, I can't
reproduce the problem when I try.



If it happened for a hard mount (no soft,intr mount options) then it
is a real bug. The server mount point crossings are detected via a change
in the value of the fsid attribute. I suspect that under some 
circumstances, the wrong value of fsid is getting cached in the client.

(I just remembered that you use rdirplus and it might not be caching
the server's notion of fsid in the right place.)

If you were really keen (if you ever look up "keen" in Webster's, it's
not what we tend to use it for at all. It was actually a wail for the
dead, and a "keener" was a professional wailer for the dead, hired for
funerals of important but maybe not that well liked individuals. But I
digress...), you could try a bunch of mounts/dismounts without rdirplus
and see if you can even get it to fail without the option.


I saw the broken mountpoint crossing on another client (without any TCP
tuning) but each time it happened I saw this in the logs:

nfscl: consider increasing kern.ipc.maxsockbuf

Once I doubled that value, the problem went away... at least with this
particular v4 server mountpoint.
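For anyone else hitting that console message, the tweak is a one-liner
(2097152 doubles the 1048576 value shown earlier in the thread):

sysctl kern.ipc.maxsockbuf=2097152
echo 'kern.ipc.maxsockbuf=2097152' >> /etc/sysctl.conf   # persist it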



If this had any effect, it was probably timing/latency related to a bug
w.r.t. caching of the server's notion of fsid. I'll poke around and see
if I can spot where this might be broken.


At the moment, things are behaving as expected.  The v4 file system seems
just as fast as v3 did, and I don't need a dozen mountpoints specified
on each client thanks to v4.  Once again, I thank you, Rick, for all your
hard work!


Btw, if the mountpoint crossing bug gets too irritating, you can do the
multiple mounts for NFSv4 just like NFSv3. (That's what you have to
do for the Solaris10 NFSv4 client, because it's completely broken w.r.t.
mountpoint crossings.)

rick




Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Dan Nelson
In the last episode (Jun 29), Rick C. Petty said:
 On Tue, Jun 29, 2010 at 10:20:57AM -0500, Adam Vande More wrote:
  On Tue, Jun 29, 2010 at 9:58 AM, Rick Macklem rmack...@uoguelph.ca wrote:
  
   I suppose if the FreeBSD world feels that root and toor must both
   exist in the password database, then nfsuserd could be hacked to
   handle the case of translating uid 0 to root without calling
   getpwuid().  It seems ugly, but if deleting toor from the password
   database upsets people, I can do that.
  
  I agree with Ian on this.  I don't use toor either, but have seen people
  use it, and sometimes it will get recommended here for various reasons
  e.g.  running a root account with a different default shell.  It
  wouldn't bother me having to do this provided it was documented, but
  having to do so would be a POLA violation to many users I think.
 
 To be fair, I'm not sure this is even a problem.  Rick M. only suggested
 it as a possibility.  I would think that getpwuid() would return the first
 match which has always been root.  At least that's what it does when
 scanning the passwd file; I'm not sure about NIS.  If someone can prove
 that this will cause a problem with NFSv4, we could consider hacking it.
 Otherwise I don't think we should change this behavior yet.

If there are multiple users that map to the same userid, nscd on Linux will
select one name at random and return it for getpwuid() calls.  I haven't
seen this behaviour on FreeBSD or Solaris, though.  They always seem to
return the first entry in the passwd file.

-- 
Dan Nelson
dnel...@allantgroup.com


Re: Why is NFSv4 so slow? (root/toor)

2010-06-29 Thread Ian Smith
On Tue, 29 Jun 2010, Dan Nelson wrote:
  In the last episode (Jun 29), Rick C. Petty said:
   On Tue, Jun 29, 2010 at 10:20:57AM -0500, Adam Vande More wrote:
On Tue, Jun 29, 2010 at 9:58 AM, Rick Macklem rmack...@uoguelph.ca 
wrote:

 I suppose if the FreeBSD world feels that root and toor must both
 exist in the password database, then nfsuserd could be hacked to
 handle the case of translating uid 0 to root without calling
 getpwuid().  It seems ugly, but if deleting toor from the password
 database upsets people, I can do that.

I agree with Ian on this.  I don't use toor either, but have seen people
use it, and sometimes it will get recommended here for various reasons
e.g.  running a root account with a different default shell.  It
wouldn't bother me having to do this provided it was documented, but
having to do so would be a POLA violation to many users I think.
   
   To be fair, I'm not sure this is even a problem.  Rick M. only suggested
   it as a possibility.  I would think that getpwuid() would return the first
   match which has always been root.  At least that's what it does when
   scanning the passwd file; I'm not sure about NIS.  If someone can prove
   that this will cause a problem with NFSv4, we could consider hacking it.
   Otherwise I don't think we should change this behavior yet.
  
  If there are multiple users that map to the same userid, nscd on Linux will
  select one name at random and return it for getpwuid() calls.  I haven't
  seen this behaviour on FreeBSD or Solaris, though.  They always seem to
  return the first entry in the passwd file.

I wondered whether this might be a Linux thing.  On my 7.2 system,

% find /usr/src -name *.[ch] -exec grep -Hw getpwuid {} \; > file

returns 195 lines, many in the form getpwuid(getuid()), in many base and 
contrib components - including id(1), bind, sendmail etc - that could be 
nondeterministic if getpwuid(0) ever returned other than root.

Just one mention of 'toor' in /usr/src/usr.sbin/makefs/compat/pwcache.c

Not claiming to know how the lookups in /usr/src/lib/libc/gen/getpwent.c 
work under the hood, but this does seem likely a non-issue on FreeBSD.

cheers, Ian


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 12:30:30AM -0400, Rick Macklem wrote:
 
 I can't explain the corruption, beyond the fact that soft,intr can
 cause all sorts of grief. If mounts without soft,intr still show
 corruption problems, try disabling delegations (either kill off the
 nfscbd daemons on the client or set vfs.newnfs.issue_delegations=0
 on the server). It is disabled by default because it is the greenest
 part of the subsystem.
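Concretely, using the sysctl name Rick gives above (killall is one way
to stop the client callback daemons):

sysctl vfs.newnfs.issue_delegations=0      # on the server
killall nfscbd                             # or, on the client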

I tried without soft,intr and make buildworld failed with what looks like
file corruption again.  I'm trying without delegations now.

 Make sure you don't have multiple entries for the same uid, such as root
 and toor both for uid 0 in your /etc/passwd. (ie. get rid of one of 
 them, if you have both)

Hmm, that's a strange requirement, since FreeBSD by default comes with
both.  That should probably be documented in the nfsv4 man page.

 When you specify nfs for an NFSv3 mount, you get the regular client.
 When you specify newnfs for an NFSv3 mount, you get the experimental
 client. When you specify nfsv4 you always get the experimental NFS
 client, and it doesn't matter which FStype you've specified.

Ok.  So my comparison was with the regular and experimental clients.

 If you are using UFS/FFS on the server, this should work and I don't know
 why the empty directories under /vol on the client confused it. If your
 server is using ZFS, everything from / including /vol need to be exported.

Nope, UFS2 only (on both clients and server).

  kernel: nfsv4 client/server protocol prob err=10020
 
 This error indicates that there wasn't a valid FH for the server. I
 suspect that the mount failed. (It does a loop of Lookups from / in
 the kernel during the mount and it somehow got confused part way through.)

If the mount failed, why would it allow me to ls /vol/a and see both b
and c directories as well as other files/directories on /vol/ ?

 I don't know why these empty dirs would confuse it. I'll try a test
 here, but I suspect the real problem was that the mount failed and
 then happened to succeed after you deleted the empty dirs.

It doesn't seem likely.  I spent an hour mounting and unmounting and each
mount looked successful in that there were files and directories besides
the two I was trying to descend into.

 It still smells like some sort of transport/net interface/... issue
 is at the bottom of this. (see response to your next post)

It's possible.  I just had another NFSv4 client (with the same server) lock
up:

load: 0.00  cmd: ls 17410 [nfsv4lck] 641.87r 0.00u 0.00s 0% 1512k

and:

load: 0.00  cmd: make 87546 [wait] 37095.09r 0.01u 0.01s 0% 844k

That make has been hung for hours, and the ls(1) was executed during that
lockup.  I wish there was a way I could unhang these processes and unmount
the NFS mount without panicking the kernel, but alas even this fails:

# umount -f /sw
load: 0.00  cmd: umount 17479 [nfsclumnt] 1.27r 0.00u 0.04s 0% 788k

A "shutdown -p now" resulted in a panic with the speaker beeping
constantly and no console output.

It's possible the NICs are all suspect, but all of this worked fine a
couple of days ago when I was only using NFSv3.

-- Rick C. Petty


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Sun, Jun 27, 2010 at 09:58:53PM -0700, Jeremy Chadwick wrote:
  
  Again, my ports tree is mounted as FSType nfs with option nfsv4.
  FreeBSD/amd64 8.1-PRERELEASE r208408M GENERIC kernel.
 
 This sounds like NFSv4 is tickling some kind of bug in your NIC driver
 but I'm not entirely sure.  Can you provide output from:
 
 1) ifconfig -a  (you can X out the IPs + MACs if you want)

On the NFSv4 server:

nfe0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
	options=8010b<RXCSUM,TXCSUM,VLAN_MTU,TSO4,LINKSTATE>
	ether 00:22:15:b4:2d:XX
	inet 172.XX.XX.4 netmask 0xffffff00 broadcast 172.XX.XX.255
	media: Ethernet autoselect (1000baseT <full-duplex>)
	status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
	options=3<RXCSUM,TXCSUM>
	inet6 fe80::1 prefixlen 64 scopeid 0x2
	inet6 ::1 prefixlen 128
	inet 127.0.0.1 netmask 0xff000000
	nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

On one of the clients:

re0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=389b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,WOL_UCAST,WOL_MCAST,WOL_MAGIC>
        ether e0:cb:4e:cd:d3:XX
        inet 172.XX.XX.9 netmask 0xffffff00 broadcast 172.XX.XX.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=3<RXCSUM,TXCSUM>
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
        inet6 ::1 prefixlen 128
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=3<PERFORMNUD,ACCEPT_RTADV>

 2) netstat -m

server:

1739/1666/3405 mbufs in use (current/cache/total)
257/1257/1514/25600 mbuf clusters in use (current/cache/total/max)
256/547 mbuf+clusters out of packet secondary zone in use (current/cache)
0/405/405/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
948K/4550K/5499K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

client:

264/2046/2310 mbufs in use (current/cache/total)
256/1034/1290/25600 mbuf clusters in use (current/cache/total/max)
256/640 mbuf+clusters out of packet secondary zone in use (current/cache)
3/372/375/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
590K/4067K/4657K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

 3) vmstat -i

Server:

interrupt                          total       rate
irq1: atkbd0                          24          0
irq18: atapci1                   1883933          0
irq20: nfe0 ohci1             1712603504        793
cpu0: timer                   4315963536       1999
irq256: hdac0                         12          0
irq257: ahci0                  139934363         64
cpu2: timer                   4315960172       1999
cpu1: timer                   4315960172       1999
cpu3: timer                   4315960172       1999
Total                        19118265888       8858

Client:

interrupt                          total       rate
irq1: atkbd0                     1063022          0
irq16: hdac0                    16013959          6
irq17: atapci0+++                      6          0
irq18: ohci0 ohci1*              5324486          2
irq19: atapci1                   7500968          2
irq20: ahc0                           19          0
irq21: ahc1                       112390          0
cpu0: timer                   5125670841       1999
irq256: hdac1                          2          0
irq257: re0                    742537149        289
cpu1: timer                   5125664297       1999
Total                        11023887139       4301

 4) prtconf -lvc  (only need the Ethernet-related entries)

I'll assume you meant to type pciconf, on the server:

n...@pci0:0:10:0:   class=0x020000 card=0x82f21043 chip=0x076010de rev=0xa2 hdr=0x00
vendor = 'NVIDIA Corporation'
device = 'NForce Network Controller (MCP78 NIC)'
class  = network
subclass   = ethernet
cap 01[44] = powerspec 2  supports D0 D1 D2 D3  current D0
cap 05[50] = MSI supports 16 messages, 64 bit, vector masks 
cap 08[6c] = HT MSI fixed address window disabled at 

Re: Why is NFSv4 so slow?

2010-06-28 Thread Jeremy Chadwick
On Mon, Jun 28, 2010 at 09:20:25AM -0500, Rick C. Petty wrote:
   
 [quoted ifconfig, netstat -m and vmstat -i output snipped; identical to
 the previous message]
  4) prtconf -lvc  (only need the Ethernet-related entries)
 
 I'll assume you meant to type pciconf, on the server:

Yes sorry -- I spend my days at work dealing with Solaris (which is
where prtconf comes from :-) ).

Three other things to provide output from if you could (you can X out IPs
and MACs too), from both client and server:

6) netstat -idn
7) sysctl hw.pci | grep msi
8) Contents of /etc/sysctl.conf

Thanks.

 server, immediately after restarting all of nfs scripts (rpcbind
 nfsclient nfsuserd nfsserver mountd nfsd statd lockd nfscbd):

 Jun 27 18:04:44 rpcbind: cannot get information for udp6
 Jun 27 18:04:44 rpcbind: cannot get information for tcp6

These two usually indicate you removed IPv6 support from the kernel,
except your ifconfig output (I've removed it) on the server shows you do
have IPv6 support.  I've been trying to get these warnings removed for
quite some time (PR kern/96242).  They're harmless, but the
inconsistency here is a little weird -- are you explicitly disabling
IPv6 on nfe0?

The remaining messages in your kernel 

Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 07:56:00AM -0700, Jeremy Chadwick wrote:
 
 Three other things to provide output from if you could (you can X out IPs
 and MACs too), from both client and server:
 
 6) netstat -idn

server:

Name    Mtu Network       Address                 Ipkts Ierrs Idrop      Opkts Oerrs  Coll Drop
nfe0   1500 Link#1        00:22:15:b4:2d:XX  1767890778     0     0  872169302     0     0    0
nfe0   1500 172.XX.XX.0/2 172.XX.XX.4        1767882158     -     - 1964274616     -     -    -
lo0   16384 Link#2                                 3728     0     0       3728     0     0    0
lo0   16384 (28)00:00:00:00:00:00:fe:80:00:02:00:00:00:00:00:00:00:00:00:00:00:01
                                                   3728     0     0       3728     0     0    0
lo0   16384 (28)00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:01
                                                   3728     0     0       3728     0     0    0
lo0   16384 127.0.0.0/8   127.0.0.1                3648     -     -       3664     -     -    -

client:

Name    Mtu Network       Address                Ipkts Ierrs Idrop      Opkts Oerrs  Coll Drop
re0    1500 Link#1        e0:cb:4e:cd:d3:XX  955288523     0     0  696819089     0     0    0
re0    1500 172.XX.XX.0/2 172.XX.XX.2        955279721     -     -  696814499     -     -    -
lo0   16384 Link#2                                3148     0     0       3148     0     0    0
lo0   16384 (28)00:00:00:00:00:00:fe:80:00:02:00:00:00:00:00:00:00:00:00:00:00:01
                                                  3148     0     0       3148     0     0    0
lo0   16384 (28)00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:01
                                                  3148     0     0       3148     0     0    0
lo0   16384 127.0.0.0/8   127.0.0.1               3112     -     -       3112     -     -    -

 7) sysctl hw.pci | grep msi

both server and client:

hw.pci.honor_msi_blacklist: 1
hw.pci.enable_msix: 1
hw.pci.enable_msi: 1

 8) Contents of /etc/sysctl.conf

server and client:

# 4 virtual channels
dev.pcm.0.play.vchans=4
# Read modules from /usr/local/modules
kern.module_path=/boot/kernel;/boot/modules;/usr/local/modules
# Remove those annoying ARP moved messages:
net.link.ether.inet.log_arp_movements=0
# 32MB write cache on disk controllers system-wide
vfs.hirunningspace=33554432
# Allow users to mount file systems
vfs.usermount=1
# misc
net.link.tap.user_open=1
net.inet.ip.forwarding=1
compat.linux.osrelease=2.6.16
debug.ddb.textdump.pending=1
# for NFSv4
kern.ipc.maxsockbuf=524288

  server, immediately after restarting all of nfs scripts (rpcbind
  nfsclient nfsuserd nfsserver mountd nfsd statd lockd nfscbd):
 
  Jun 27 18:04:44 rpcbind: cannot get information for udp6
  Jun 27 18:04:44 rpcbind: cannot get information for tcp6
 
 These two usually indicate you removed IPv6 support from the kernel,
 except your ifconfig output (I've removed it) on the server shows you do
 have IPv6 support.  I've been trying to get these warnings removed for
 quite some time (PR kern/96242).  They're harmless, but the
 inconsistency here is a little weird -- are you explicitly disabling
 IPv6 on nfe0?

I have WITHOUT_IPV6= in my make.conf on all my machines (otherwise I have
problems with jdk1.6) and WITHOUT_INET6= in my src.conf.  I'm not sure
why the rpcbind/ifconfig binaries have a different concept than the
kernel since I always "make buildworld kernel" and keep things in sync
with mergemaster when I reboot.  I'm building new worlds/kernels now
to see if that makes any difference.

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 12:35:14AM -0400, Rick Macklem wrote:
 
 Being stuck in newnfsreq means that it is trying to establish a TCP
 connection with the server (again smells like some networking issue).
 [snip]
 Disabling delegations is the next step. (They aren't
 required for correct behaviour and are disabled by default because
 they are the greenest part of the implementation.)

After disabling delegations, I was able to build world and kernel on two
different clients, and my port build problems went away as well.
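
For the archives, turning them off amounted to roughly this (per Rick's
earlier mail), plus removing the issue_delegations=1 line from my
/etc/sysctl.conf and not starting nfscbd on the clients:

server# sysctl vfs.newnfs.issue_delegations=0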

I'm still left with a performance problem, although not quite as bad as I
originally reported.  Directory listings are snappy once again, but playing
h264 video is choppy, particularly when seeking around: there's almost a
full second delay before it kicks in, no matter where I seek.  With NFSv3
the delay on seeks was less than 0.1 seconds and the playback was never
jittery.

I can try it again with v3 client and v4 server, if you think that's
worthy of pursuit.  If it makes any difference, the server's four CPUs are
pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
before I enabled v4 server too.

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Jeremy Chadwick
On Mon, Jun 28, 2010 at 10:18:35AM -0500, Rick C. Petty wrote:
  8) Contents of /etc/sysctl.conf
 
 server and client:
 
 # for NFSv4
 kern.ipc.maxsockbuf=524288

You might want to discuss this one with Rick a bit (I'm not sure of the
implications).  Regarding heavy network I/O (I don't use NFS but Samba),
I've found that the following tunables do in fact make a performance
difference -- you might try and see if these have some impact (or, try
forcing a specific protocol type for NFS, e.g. TCP-only; I'm not
familiar with NFSv4 though).  These are adjustable in sysctl.conf, thus
adjustable in real-time.

# Increase send/receive buffer maximums from 256KB to 16MB.
# FreeBSD 7.x and later will auto-tune the size, but only up to the max.
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216

# Double send/receive TCP datagram memory allocation.  This defines the
# amount of memory taken up by default *per socket*.
net.inet.tcp.sendspace=65536
net.inet.tcp.recvspace=131072
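
Since these are run-time sysctls, you can try them on a live system
before committing them to sysctl.conf, e.g.:

# sysctl net.inet.tcp.sendbuf_max=16777216
# sysctl net.inet.tcp.recvbuf_max=16777216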

That's about all I can comment on -- if NFSv3 works OK for you
(performance-wise), then I'm not sure where the bottleneck could be.

   Jun 27 18:04:44 rpcbind: cannot get information for udp6
   Jun 27 18:04:44 rpcbind: cannot get information for tcp6
  
  These two usually indicate you removed IPv6 support from the kernel,
  except your ifconfig output (I've removed it) on the server shows you do
  have IPv6 support.  I've been trying to get these warnings removed for
  quite some time (PR kern/96242).  They're harmless, but the
  inconsistency here is a little weird -- are you explicitly disabling
  IPv6 on nfe0?
 
 I have WITHOUT_IPV6= in my make.conf on all my machines (otherwise I have
 problems with jdk1.6) and WITHOUT_INET6= in my src.conf.  I'm not sure
 why the rpcbind/ifconfig binaries have a different concept than the
 kernel since I always "make buildworld kernel" and keep things in sync
 with mergemaster when I reboot.  I'm building new worlds/kernels now
 to see if that makes any difference.

make.conf WITHOUT_IPV6 would affect ports, src.conf WITHOUT_INET6 would
affect the base system (thus rpcbind).  The src.conf entry is what's
causing rpcbind to spit out the above "cannot get information" messages,
even though IPv6 is available in your kernel (see below).

However: your kernel configuration file must contain "options INET6" or
else you wouldn't have IPv6 addresses on lo0.  So even though your
kernel and world are synchronised, IPv6 capability-wise they probably
aren't.  This may be your intended desire though, and if so, no biggie.

If you wanted to work around the problem, you can supposedly comment out
the udp6 and tcp6 lines in /etc/netconfig.  I choose not to do this (put
up with the warning messages) since I'm not sure of the repercussions of
adjusting this file (e.g. will something else down the road break).
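
For reference, the commented-out lines would look something like this
(untested on my side; field layout per netconfig(5)):

#udp6       tpi_clts      v     inet6    udp     -       -
#tcp6       tpi_cots_ord  v     inet6    tcp     -       -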

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick Macklem

On Mon, 28 Jun 2010, Rick C. Petty wrote:

 On Mon, Jun 28, 2010 at 12:35:14AM -0400, Rick Macklem wrote:
 
  Being stuck in newnfsreq means that it is trying to establish a TCP
  connection with the server (again smells like some networking issue).
  [snip]
  Disabling delegations is the next step. (They aren't
  required for correct behaviour and are disabled by default because
  they are the greenest part of the implementation.)
 
 After disabling delegations, I was able to build world and kernel on two
 different clients, and my port build problems went away as well.

Ok, it sounds like you found some kind of race condition in the delegation
handling. (I'll see if I can reproduce it here. It could be fun to find:-)

 I'm still left with a performance problem, although not quite as bad as I
 originally reported.  Directory listings are snappy once again, but playing
 h264 video is choppy, particularly when seeking around: there's almost a
 full second delay before it kicks in, no matter where I seek.  With NFSv3
 the delay on seeks was less than 0.1 seconds and the playback was never
 jittery.

Hmm, see below w.r.t. 100% CPU.

 I can try it again with v3 client and v4 server, if you think that's
 worthy of pursuit.  If it makes any difference, the server's four CPUs are
 pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
 before I enabled v4 server too.

It would be interesting to see if the performance problem exists for
NFSv3 mounts against the experimental (nfsv4) server.

Since the CPUs are 100% busy, it might be a scheduling issue w.r.t.
the nfsd threads (ie. the ones in the experimental server don't have
as high a priority as for the regular server?). I've always tested
on a machine where the CPU (I only have single core) is nowhere near
100% busy. If this theory is correct, the performance issue should
still be noticeable for an NFSv3 mount to the experimental server.

I'll try running something compute bound on the server here and see
what happens.

rick
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick Macklem

On Mon, 28 Jun 2010, Rick C. Petty wrote:

  Make sure you don't have multiple entries for the same uid, such as root
  and toor both for uid 0 in your /etc/passwd. (ie. get rid of one of
  them, if you have both)
 
 Hmm, that's a strange requirement, since FreeBSD by default comes with
 both.  That should probably be documented in the nfsv4 man page.

Well, if the mapping from uid<->name is not unique, getpwuid() will just
return one of them and it probably won't be the expected one. Having
both root and toor only causes weird behaviour when root tries to
use a mount point. I had thought it was in the man pages, but I now
see it isn't mentioned. I'll try and remember to add it.
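
You can see the ambiguity from userland, since getent(1) does the same
sort of lookup and only one of the two entries comes back; which one you
get may vary:

# getent passwd 0
root:*:0:0:Charlie &:/root:/bin/csh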



  This error indicates that there wasn't a valid FH for the server. I
  suspect that the mount failed. (It does a loop of Lookups from / in
  the kernel during the mount and it somehow got confused part way through.)
 
 If the mount failed, why would it allow me to ls /vol/a and see both b
 and c directories as well as other files/directories on /vol/ ?
 
  I don't know why these empty dirs would confuse it. I'll try a test
  here, but I suspect the real problem was that the mount failed and
  then happened to succeed after you deleted the empty dirs.
 
 It doesn't seem likely.  I spent an hour mounting and unmounting and each
 mount looked successful in that there were files and directories besides
 the two I was trying to descend into.

My theory was that, since you used soft, one of the Lookups during
the mounting process in the kernel failed with ETIMEDOUT. It isn't
coded to handle that. There are lots of things that will break in
the NFSv4 client if soft or intr are used. (That is in the mount_nfs
man page, but right at the end, so it could get missed.)
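
In other words, the same mount minus soft,intr, i.e. an fstab line
something like:

server:/vol  /vol  nfs  rw,nfsv4,rdirplus,bg  0  0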

Maybe "broken mount" would have been a better term than "failed mount".

If more recent mount attempts are without soft, then I would expect
them to work reliably. (If you feel daring, add the empty subdirs back
and see if it fails?)

I will try a case with empty subdirs on the client, to see if there is
a problem when I do it. (It should just cover them up until umount, but
it could certainly be broken:-)


  It still smells like some sort of transport/net interface/... issue
  is at the bottom of this. (see response to your next post)
 
 It's possible.  I just had another NFSv4 client (with the same server) lock
 up:
 
 load: 0.00  cmd: ls 17410 [nfsv4lck] 641.87r 0.00u 0.00s 0% 1512k
 
 and:
 
 load: 0.00  cmd: make 87546 [wait] 37095.09r 0.01u 0.01s 0% 844k
 
 That make has been hung for hours, and the ls(1) was executed during that
 lockup.  I wish there was a way I could unhang these processes and unmount
 the NFS mount without panicking the kernel, but alas even this fails:
 
 # umount -f /sw
 load: 0.00  cmd: umount 17479 [nfsclumnt] 1.27r 0.00u 0.04s 0% 788k



The plan is to implement a hard forced umount (something like -ff)
which will throw away data, but get the umount done, but it hasn't been
coded yet. (For 8.2 maybe?)


 A "shutdown -p now" resulted in a panic with the speaker beeping
 constantly and no console output.
 
 It's possible the NICs are all suspect, but all of this worked fine a
 couple of days ago when I was only using NFSv3.


Yea, if NFSv3 worked fine with the same kernel, it seems more likely
an experimental NFS server issue, possibly related to scheduling on the
busy CPUs. (If it was a NIC related problem, it is most likely related
to the driver, but if the NFSv3 case was using the same driver, that
doesn't seem likely.)

You are now using rsize=32768,wsize=32768, aren't you?
(If you aren't yet using that, try it, since larger bursts of
traffic can definitely "tickle" NIC driver problems, to borrow
Jeremy's term.)
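
i.e. something like:

# mount -t nfs -o nfsv4,rsize=32768,wsize=32768 server:/vol /vol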

rick
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick Macklem

On Mon, 28 Jun 2010, Rick C. Petty wrote:

 I can try it again with v3 client and v4 server, if you think that's
 worthy of pursuit.  If it makes any difference, the server's four CPUs are
 pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
 before I enabled v4 server too.

If it is practical, it would be interesting to see what effect killing
off the cpu-bound jobs has w.r.t. performance.
rick

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 10:09:21PM -0400, Rick Macklem wrote:
 
 
 On Mon, 28 Jun 2010, Rick C. Petty wrote:
 
  If it makes any difference, the server's four CPUs are
 pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
 before I enabled v4 server too.

 If it is practical, it would be interesting to see what effect killing
 off the cpu bound jobs has w.r.t. performance.

I sent SIGTSTP to all those processes and brought the CPUs to idle.  The
jittering/stuttering is still present when watching h264 video.  So that
rules out scheduling issues.  I'll be investigating Jeremy's TCP tuning
suggestions next.

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 09:29:11AM -0700, Jeremy Chadwick wrote:
 
 # Increase send/receive buffer maximums from 256KB to 16MB.
 # FreeBSD 7.x and later will auto-tune the size, but only up to the max.
 net.inet.tcp.sendbuf_max=16777216
 net.inet.tcp.recvbuf_max=16777216
 
 # Double send/receive TCP datagram memory allocation.  This defines the
 # amount of memory taken up by default *per socket*.
 net.inet.tcp.sendspace=65536
 net.inet.tcp.recvspace=131072

I tried adjusting to these settings, on both the client and the server.
I still see the same jittery/stuttery video behavior.  Thanks for your
suggestions though, these are probably good settings to have around anyway
since I have 12 GB of RAM on the client and 8 GB of RAM on the server.

 make.conf WITHOUT_IPV6 would affect ports, src.conf WITHOUT_INET6 would
 affect the base system (thus rpcbind).  The src.conf entry is what's
 causing rpcbind to spit out the above cannot get information messages,
 even though IPv6 is available in your kernel (see below).
 
 However: your kernel configuration file must contain options INET6 or
 else you wouldn't have IPv6 addresses on lo0.  So even though your
 kernel and world are synchronised, IPv6 capability-wise they probably
 aren't.  This may be your intended desire though, and if so, no biggie.

Oh, I forgot about that.  I'll have to add the nooptions since I like to
build as close to GENERIC as possible.  Mostly the WITHOUT_* stuff in
/etc/src.conf is to reduce my overall build times, since I don't need some
of those tools.
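
I assume a minimal config along these lines (untested) would keep me as
close to GENERIC as possible while dropping INET6:

include GENERIC
ident   MYKERNEL
nooptions INET6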

I'm okay with the messages though; I'll probably comment out WITHOUT_INET6.

Thanks again for your suggestions,

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-28 Thread Rick C. Petty
On Mon, Jun 28, 2010 at 07:48:59PM -0400, Rick Macklem wrote:
 
 Ok, it sounds like you found some kind of race condition in the delegation
 handling. (I'll see if I can reproduce it here. It could be fun to find:-)

Good luck with that!  =)

 I can try it again with v3 client and v4 server, if you think that's
 worthy of pursuit.  If it makes any difference, the server's four CPUs are
 pegged at 100% (running nice +4 cpu-bound jobs).  But that was the case
 before I enabled v4 server too.
 
 It would be interesting to see if the performance problem exists for
 NFSv3 mounts against the experimental (nfsv4) server.

Hmm, I couldn't reproduce the problem.  Once I unmounted the nfsv4 client
and tried v3, the jittering stopped.  Then I unmounted v3 and tried v4
again, no jitters.  I played with a couple of combinations back and forth
(toggling the presence of nfsv4 in the options) and sometimes I saw
jittering but only with v4, but nothing like what I was seeing before.
Perhaps this is a result of Jeremy's TCP tuning tweaks.

This is also a difficult thing to test, because the server and client have
so much memory, they cache the data blocks.  So if I try my stutter test
on the same video a second time, I only notice stutters if I skip to parts
I haven't skipped to before.  I can comment that it seemed like more of a
latency issue than a throughput issue to me.  But the disks aren't ever
under a high load.  But it's hard to determine accurate load when the
disks are seeking.  Oh, I'm using the AHCI controller mode/driver on those
disks instead of ATA, if that matters.

One time when I mounted the v4 again, it broke subdirectories like I was
talking about before.  Essentially it would give me a readout of all the
top-level directories but wouldn't descend into subdirectories which
reflect different mountpoints on the server.  An unmount and a remount
(without changes to /etc/fstab) fixed the problem.  I'm wondering if there
isn't some race condition that seems to affect crossing mountpoints on the
server.  When the situation happens, it affects all mountpoints equally
and persists for the duration of that mount.  And of course, I can't
reproduce the problem when I try.

I saw the broken mountpoint crossing on another client (without any TCP
tuning) but each time it happened I saw this in the logs:

nfscl: consider increasing kern.ipc.maxsockbuf

Once I doubled that value, the problem went away, at least with this
particular v4 server mountpoint.
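
For reference, that's double what I had been running with:

# sysctl kern.ipc.maxsockbuf=1048576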

At the moment, things are behaving as expected.  The v4 file system seems
just as fast as v3 did, and I don't need a dozen mountpoints specified
on each client thanks to v4.  Once again, I thank you, Rick, for all your
hard work!

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow? (root/toor)

2010-06-28 Thread Ian Smith
On Mon, 28 Jun 2010, Rick Macklem wrote:
  On Mon, 28 Jun 2010, Rick C. Petty wrote:
  
   
    Make sure you don't have multiple entries for the same uid, such as
    root and toor both for uid 0 in your /etc/passwd. (ie. get rid of
    one of them, if you have both)
   
   Hmm, that's a strange requirement, since FreeBSD by default comes with
   both.  That should probably be documented in the nfsv4 man page.
   
  
  Well, if the mapping from uid-name is not unique, getpwuid() will just
  return one of them and it probably won't be the expected one. Having
  both root and toor only cause weird behaviour when root tries to
  use a mount point. I had thought it was in the man pages, but I now
  see it isn't mentioned. I'll try and remember to add it.

Not wanting to hijack this (interesting) thread, but ..

I have to concur with Rick P - that's rather an odd requirement when each 
FreeBSD install since at least 2.2 has come with root and toor (in that 
order) in /etc/passwd.  I don't use toor, but often enough read about 
folks who do, and don't recall it ever being an issue with NFSv3.  Are 
you sure this is a problem that cannot be coded around in NFSv4?

cheers, Ian
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick Macklem

On Sun, 27 Jun 2010, Rick C. Petty wrote:

 First off, many thanks to Rick Macklem for making NFSv4 possible in
 FreeBSD!
 
 I recently updated my NFS server and clients to v4, but have since noticed
 significant performance penalties.  For instance, when I try "ls a b c" (if
 a, b, and c are empty directories) on the client, it takes up to 1.87
 seconds (wall time) whereas before it always finished in under 0.1 seconds.
 If I repeat the test, it takes the same amount of time in v4 (in v3, wall
 time was always under 0.01 seconds for subsequent requests, as if the
 directory listing was cached).

Weird, I don't see that here. The only thing I can think of is that the
experimental client/server will try to do I/O at the size of MAXBSIZE
by default, which might be causing a burst of traffic your net interface
can't keep up with. (This can be turned down to 32K via the
rsize=32768,wsize=32768 mount options. I found this necessary to avoid
abysmal performance on some Macs for the Mac OS X port.)

The other thing that can really slow it down is if the uid<->login-name
(and/or gid<->group-name) mapping is messed up, but this would normally only
show up for things like "ls -l". (Beware having multiple password database
entries for the same uid, such as root and toor.)


 If I try to play an h264 video file on the filesystem using mplayer, it
 often jitters and skipping around in time introduces up to a second or so
 pause.  With NFSv3 it behaved more like the file was on local disk (no
 noticeable pauses or jitters).
 
 Has anyone seen this behavior upon switching to v4 or does anyone have any
 suggestions for tuning?
 
 Both client and server are running the same GENERIC kernel, 8.1-PRERELEASE
 as of 2010-May-29.  They are connected via gigabit.  Both v3 and v4 tests
 were performed on the exact same hardware and I/O, CPU, network loads.
 All I did was toggle nfsv4_server_enable (and nfsuserd/nfscbd of course).
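
For anyone following along, that toggle is roughly these rc.conf knobs
(I'm going from memory, so check the names against rc.conf(5)):

nfs_server_enable="YES"
nfsv4_server_enable="YES"
nfsuserd_enable="YES"
nfscbd_enable="YES"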

 It seems like a server-side issue, because if I try an nfs3 client mount
 to the nfs4 server and run the same tests, I see only a slight improvement
 in performance.  In both cases, my mount options were
 rdirplus,bg,intr,soft (and nfsv4 added in the one case, obviously).

I don't recommend the use of intr or soft for NFSv4 mounts, but they
wouldn't affect performance for trivial tests. You might want to try:
nfsv4,rsize=32768,wsize=32768 and see how that works.

When you did the nfs3 mount did you specify "newnfs" or "nfs" for the
file system type? (I'm wondering if you still saw the problem with the
regular nfs client against the server? Others have had good luck using
the server for NFSv3 mounts.)


 On the server, I have these tunables explicitly set:
 
 kern.ipc.maxsockbuf=524288
 vfs.newnfs.issue_delegations=1
 
 On the client, I just have the maxsockbuf setting (this is twice the
 default value).  I'm open to trying other tunables or patches.  TIA,

When I see abysmal NFS perf. it is usually an issue with the underlying
transport. Looking at things like netstat -i or netstat -s might
give you a hint?
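
e.g. something quick like:

# netstat -i          (non-zero Ierrs/Oerrs/Coll point at the NIC or cabling)
# netstat -s -p tcp   (look at retransmits and dropped segments)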

Having said that, the only difference I can think of between the two
NFS subsystems that might affect the transport layer is the default
I/O size, as noted above.

rick
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick C. Petty
On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
 
 Weird, I don't see that here. The only thing I can think of is that the
 experimental client/server will try to do I/O at the size of MAXBSIZE
 by default, which might be causing a burst of traffic your net interface
 can't keep up with. (This can be turned down to 32K via the
 rsize=32768,wsize=32768 mount options. I found this necessary to avoid
 abysmal performance on some Macs for the Mac OS X port.)

Hmm.  When I mounted the same filesystem with nfs3 from a different client,
everything started working at almost normal speed (still a little slower
though).

Now on that same host I saw a file get corrupted.  On the server, I see
the following:

% hd testfile | tail -4
00677fd0  2a 24 cc 43 03 90 ad e2  9a 4a 01 d9 c4 6a f7 14  |*$.C.....J...j..|
00677fe0  3f ba 01 77 28 4f 0f 58  1a 21 67 c5 73 1e 4f 54  |?..w(O.X.!g.s.OT|
00677ff0  bf 75 59 05 52 54 07 6f  db 62 d6 4a 78 e8 3e 2b  |.uY.RT.o.b.Jx.>+|
00678000

But on the client I see this:

% hd testfile | tail -4
00011ff0  1e af dc 8e d6 73 67 a2  cd 93 fe cb 7e a4 dd 83  |.....sg.....~...|
00012000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00678000

The only thing I could do to fix it was to copy the file on the server,
delete the original file on the client, and move the copied file back.

Not only is it affecting random file reads, but started breaking src
and ports builds in random places.  In one situation, portmaster failed
because of a port checksum.  It then tried to refetch and failed with the
same checksum problem.  I manually deleted the file, tried again and it
built just fine.  The ports tree and distfiles are nfs4 mounted.

 The other thing that can really slow it down is if the uid-login-name
 (and/or gid-group-name) is messed up, but this would normally only
 show up for things like ls -l. (Beware having multiple password database
 entries for the same uid, such as root and toor.)

I use the same UIDs/GIDs on all my boxes, so that can't be it.  But thanks
for the idea.

 I don't recommend the use of intr or soft for NFSv4 mounts, but they
 wouldn't affect performance for trivial tests. You might want to try:
 nfsv4,rsize=32768,wsize=32768 and see how that works.

I'm trying that right now (with rdirplus also) on one host.  If I start to
the delays again, I'll compare between hosts.

 When you did the nfs3 mount did you specify newnfs or nfs for the
 file system type? (I'm wondering if you still saw the problem with the
 regular nfs client against the server? Others have had good luck using
 the server for NFSv3 mounts.)

I used "nfs" for FStype.  So I should be using "newnfs"?  This wasn't very
clear in the man pages.  In fact "newnfs" wasn't mentioned in
"man mount_newnfs".

 When I see abissmal NFS perf. it is usually an issue with the underlying
 transport. Looking at things like netstat -i or netstat -s might
 give you a hint?

I suspected it might be transport-related.  I didn't see anything out of
the ordinary from netstat, but then again I don't know what's ordinary
with NFS.  =)

~~

One other thing I noticed but I'm not sure if it's a bug or expected
behavior (unrelated to the delays or corruption), is I have the following
filesystems on the server:

/vol/a
/vol/a/b
/vol/a/c

I export all three volumes and set my NFS V4 root to /.  On the client,
I'll "mount ... server:vol /vol" and the b and c directories show up
but when I try "ls /vol/a/b /vol/a/c", they show up empty.  In dmesg I see:

kernel: nfsv4 client/server protocol prob err=10020

After unmounting /vol, I discovered that my client already had /vol/a/b and
/vol/a/c directories (because pre-NFSv4, I had to mount each filesystem
separately).  Once I removed those empty dirs and remounted, the problem
went away.  But it did drive me crazy for a few hours.
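
For reference, my exports entries look roughly like this (network details
changed, and I'm going from memory):

V4: /
/vol/a -network 172.XX.XX.0 -mask 255.255.255.0
/vol/a/b -network 172.XX.XX.0 -mask 255.255.255.0
/vol/a/c -network 172.XX.XX.0 -mask 255.255.255.0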

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick C. Petty
On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
 
 Weird, I don't see that here. The only thing I can think of is that the
 experimental client/server will try to do I/O at the size of MAXBSIZE
 by default, which might be causing a burst of traffic your net interface
 can't keep up with. (This can be turned down to 32K via the
 rsize=32768,wsize=32768 mount options. I found this necessary to avoid
 abysmal performance on some Macs for the Mac OS X port.)

I just ran into the speed problem again after remounting.  This time
I tried to do a make buildworld and make got stuck on [newnfsreq] for
ten minutes, with no other filesystem activity on either client or server.

The file system corruption is still pretty bad.  I can no longer build any
ports on one machine, because after the port is extracted, the config.sub
files are being filled with all zeros.  It took me awhile to track this
down while trying to build devel/libtool22:

+ ac_build_alias=amd64-portbld-freebsd8.1
+ test xamd64-portbld-freebsd8.1 = x
+ test xamd64-portbld-freebsd8.1 = x
+ /bin/sh libltdl/config/config.sub amd64-portbld-freebsd8.1
+ ac_cv_build=''
+ printf '%s\n' 'configure:4596: result: '
+ printf '%s\n' ''

+ as_fn_error 'invalid value of canonical build' 4600 5
+ as_status=0
+ test 0 -eq 0
+ as_status=1
+ test 5

And although my work dir is on local disk,

% hd work/libtool-2.2.6b/libltdl/config/config.sub:

00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00007660

Again, my ports tree is mounted as FSType nfs with option nfsv4.
FreeBSD/amd64 8.1-PRERELEASE r208408M GENERIC kernel.

-- Rick C. Petty
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick Macklem

On Sun, 27 Jun 2010, Rick C. Petty wrote:

 Hmm.  When I mounted the same filesystem with nfs3 from a different client,
 everything started working at almost normal speed (still a little slower
 though).
 
 Now on that same host I saw a file get corrupted.  On the server, I see
 the following:
 
 % hd testfile | tail -4
 00677fd0  2a 24 cc 43 03 90 ad e2  9a 4a 01 d9 c4 6a f7 14  |*$.C.....J...j..|
 00677fe0  3f ba 01 77 28 4f 0f 58  1a 21 67 c5 73 1e 4f 54  |?..w(O.X.!g.s.OT|
 00677ff0  bf 75 59 05 52 54 07 6f  db 62 d6 4a 78 e8 3e 2b  |.uY.RT.o.b.Jx.>+|
 00678000
 
 But on the client I see this:
 
 % hd testfile | tail -4
 00011ff0  1e af dc 8e d6 73 67 a2  cd 93 fe cb 7e a4 dd 83  |.....sg.....~...|
 00012000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00678000
 
 The only thing I could do to fix it was to copy the file on the server,
 delete the original file on the client, and move the copied file back.
 
 Not only is it affecting random file reads, but started breaking src
 and ports builds in random places.  In one situation, portmaster failed
 because of a port checksum.  It then tried to refetch and failed with the
 same checksum problem.  I manually deleted the file, tried again and it
 built just fine.  The ports tree and distfiles are nfs4 mounted.

I can't explain the corruption, beyond the fact that soft,intr can
cause all sorts of grief. If mounts without soft,intr still show
corruption problems, try disabling delegations (either kill off the
nfscbd daemons on the client or set vfs.newnfs.issue_delegations=0
on the server). It is disabled by default because it is the greenest
part of the subsystem.

  The other thing that can really slow it down is if the uid<->login-name
  (and/or gid<->group-name) mapping is messed up, but this would normally
  only show up for things like "ls -l". (Beware having multiple password
  database entries for the same uid, such as root and toor.)
 
 I use the same UIDs/GIDs on all my boxes, so that can't be it.  But thanks
 for the idea.

Make sure you don't have multiple entries for the same uid, such as root
and toor both for uid 0 in your /etc/passwd. (ie. get rid of one of
them, if you have both)

  When you did the nfs3 mount did you specify "newnfs" or "nfs" for the
  file system type? (I'm wondering if you still saw the problem with the
  regular nfs client against the server? Others have had good luck using
  the server for NFSv3 mounts.)
 
 I used "nfs" for FStype.  So I should be using "newnfs"?  This wasn't very
 clear in the man pages.  In fact "newnfs" wasn't mentioned in
 "man mount_newnfs".

When you specify "nfs" for an NFSv3 mount, you get the regular client.
When you specify "newnfs" for an NFSv3 mount, you get the experimental
client. When you specify "nfsv4" you always get the experimental NFS
client, and it doesn't matter which FStype you've specified.

 One other thing I noticed but I'm not sure if it's a bug or expected
 behavior (unrelated to the delays or corruption), is I have the following
 filesystems on the server:
 
 /vol/a
 /vol/a/b
 /vol/a/c
 
 I export all three volumes and set my NFS V4 root to /.  On the client,
 I'll "mount ... server:vol /vol" and the b and c directories show up
 but when I try "ls /vol/a/b /vol/a/c", they show up empty.  In dmesg I see:

If you are using UFS/FFS on the server, this should work and I don't know
why the empty directories under /vol on the client confused it. If your
server is using ZFS, everything from / including /vol needs to be exported.

 kernel: nfsv4 client/server protocol prob err=10020

This error indicates that there wasn't a valid FH for the server. I
suspect that the mount failed. (It does a loop of Lookups from / in
the kernel during the mount and it somehow got confused part way through.)

 After unmounting /vol, I discovered that my client already had /vol/a/b and
 /vol/a/c directories (because pre-NFSv4, I had to mount each filesystem
 separately).  Once I removed those empty dirs and remounted, the problem
 went away.  But it did drive me crazy for a few hours.

I don't know why these empty dirs would confuse it. I'll try a test
here, but I suspect the real problem was that the mount failed and
then happened to succeed after you deleted the empty dirs.

It still smells like some sort of transport/net interface/... issue
is at the bottom of this. (see response to your next post)

rick

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-27 Thread Rick Macklem

On Sun, 27 Jun 2010, Rick C. Petty wrote:

 On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
 
  Weird, I don't see that here. The only thing I can think of is that the
  experimental client/server will try to do I/O at the size of MAXBSIZE
  by default, which might be causing a burst of traffic your net interface
  can't keep up with. (This can be turned down to 32K via the
  rsize=32768,wsize=32768 mount options. I found this necessary to avoid
  abysmal performance on some Macs for the Mac OS X port.)
 
 I just ran into the speed problem again after remounting.  This time
 I tried to do a make buildworld and make got stuck on [newnfsreq] for
 ten minutes, with no other filesystem activity on either client or server.

Being stuck in newnfsreq means that it is trying to establish a TCP
connection with the server (again smells like some networking issue).

 The file system corruption is still pretty bad.  I can no longer build any
 ports on one machine, because after the port is extracted, the config.sub
 files are being filled with all zeros.  It took me awhile to track this
 down while trying to build devel/libtool22:

Assuming your mounts are not using soft,intr, I can't explain the
corruption. Disabling delegations is the next step. (They aren't
required for correct behaviour and are disabled by default because
they are the greenest part of the implementation.)

rick

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: Why is NFSv4 so slow?

2010-06-27 Thread Jeremy Chadwick
On Sun, Jun 27, 2010 at 10:47:41PM -0500, Rick C. Petty wrote:
 On Sun, Jun 27, 2010 at 08:04:28PM -0400, Rick Macklem wrote:
  
  Weird, I don't see that here. The only thing I can think of is that the
  experimental client/server will try to do I/O at the size of MAXBSIZE
  by default, which might be causing a burst of traffic your net interface
  can't keep up with. (This can be turned down to 32K via the
  rsize=32768,wsize=32768 mount options. I found this necessary to avoid
  abysmal performance on some Macs for the Mac OS X port.)
 
 I just ran into the speed problem again after remounting.  This time
 I tried to do a make buildworld and make got stuck on [newnfsreq] for
 ten minutes, with no other filesystem activity on either client or server.
 
 The file system corruption is still pretty bad.  I can no longer build any
 ports on one machine, because after the port is extracted, the config.sub
 files are being filled with all zeros.  It took me awhile to track this
 down while trying to build devel/libtool22:
 
 + ac_build_alias=amd64-portbld-freebsd8.1
 + test xamd64-portbld-freebsd8.1 = x
 + test xamd64-portbld-freebsd8.1 = x
 + /bin/sh libltdl/config/config.sub amd64-portbld-freebsd8.1
 + ac_cv_build=''
 + printf '%s\n' 'configure:4596: result: '
 + printf '%s\n' ''
 
 + as_fn_error 'invalid value of canonical build' 4600 5
 + as_status=0
 + test 0 -eq 0
 + as_status=1
 + test 5
 
 And although my work dir is on local disk,
 
 % hd work/libtool-2.2.6b/libltdl/config/config.sub:
 
 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
 *
 00007660
 
 Again, my ports tree is mounted as FSType nfs with option nfsv4.
 FreeBSD/amd64 8.1-PRERELEASE r208408M GENERIC kernel.

This sounds like NFSv4 is tickling some kind of bug in your NIC driver
but I'm not entirely sure.  Can you provide output from:

1) ifconfig -a  (you can X out the IPs + MACs if you want)
2) netstat -m
3) vmstat -i
4) prtconf -lvc  (only need the Ethernet-related entries)
5) sysctl dev.XXX.N  (ex. for em0, XXX=em, N=0)

And also check dmesg to see if there's any messages the kernel has
been spitting out which look relevant?  Thanks.

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org