Re: Low nfs write throughput

2011-12-01 Thread John Baldwin
On Thursday, December 01, 2011 12:35:23 am Jeremy Chadwick wrote:
 On Tue, Nov 29, 2011 at 10:36:44AM -0500, John Baldwin wrote:
  On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
Bengt == Bengt Ahlgren ben...@sics.se writes:
   
Daryl Sayers da...@ci.com.au writes:
Can anyone suggest why I am getting poor write performance from my nfs setup?
I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
onboard Gb network cards connected to an idle network. The results below show
that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
improves if I use async, but a smbfs mount still beats it. I am using the same
file, source and destinations for all tests. I have tried alternate network
cards with no resulting benefit.
   
[...]
   
Looking at a systat -v on the destination I see that the nfs test does not
exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
For the record I get reads of 22MB/s without and 77MB/s with async turned on
for the nfs mount.
   
On a UFS filesystem you get NFS writes with the same size as the
filesystem blocksize.  So an easy way to improve performance is to
create a filesystem with larger blocks.  I accidentally found this out
when I had two NFS exported filesystems from the same box with 16K and
64K blocksizes respectively.
   
(Larger blocksize also tremendously improves the performance of UFS
snapshots!)
   
   Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
   no reportable change in performance. We are using a UFS2 filesystem, so the
   zfs command was not required. I did not try the patch, as we would like to
   stay as standard as possible, but we will upgrade if the patch is released
   in a new kernel.
  
  If you can test the patch then it is something I will likely put into the
  next release.  I have already tested it as far as robustness locally; what
  I don't have are good performance tests.  It would really be helpful if you
  were able to test it.
 
 John,
 
 We'd like to test this patch[1], but need to know if it needs to be
 applied to just the system acting as the NFS server, or the NFS clients
 as well.
 
 [1]: http://www.freebsd.org/~jhb/patches/nfs_server_cluster.patch

Just the NFS server.  I'm going to commit it to HEAD later today.

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-30 Thread John Baldwin
On Tuesday, November 29, 2011 6:56:27 pm Daryl Sayers wrote:
  John == John Baldwin j...@freebsd.org writes:
 
  On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
   Bengt == Bengt Ahlgren ben...@sics.se writes:
  
   Daryl Sayers da...@ci.com.au writes:
   Can anyone suggest why I am getting poor write performance from my nfs setup?
   I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
   4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
   onboard Gb network cards connected to an idle network. The results below show
   that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
   improves if I use async, but a smbfs mount still beats it. I am using the same
   file, source and destinations for all tests. I have tried alternate network
   cards with no resulting benefit.
  
   [...]
  
   Looking at a systat -v on the destination I see that the nfs test does not
   exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
   For the record I get reads of 22MB/s without and 77MB/s with async turned on
   for the nfs mount.
  
   On a UFS filesystem you get NFS writes with the same size as the
   filesystem blocksize.  So an easy way to improve performance is to
   create a filesystem with larger blocks.  I accidentally found this out
   when I had two NFS exported filesystems from the same box with 16K and
   64K blocksizes respectively.
  
   (Larger blocksize also tremendously improves the performance of UFS
   snapshots!)
  
  Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
  no reportable change in performance. We are using a UFS2 filesystem, so the
  zfs command was not required. I did not try the patch, as we would like to
  stay as standard as possible, but we will upgrade if the patch is released
  in a new kernel.
 
  If you can test the patch then it is something I will likely put into the
  next release.  I have already tested it as far as robustness locally; what
  I don't have are good performance tests.  It would really be helpful if you
  were able to test it.
 
  Thanks Bengt for the suggestion of block size. Increasing the block size to
  64k made a significant improvement to performance.
 
  In theory the patch might have given you similar gains.  During my simple
  tests I was able to raise the average I/O size in iostat from 16k to 70-80k.
 
 OK, I downloaded and installed the patch and did some basic testing, and I can
 report that the patch does improve performance. I can also see that my KB/t
 now exceeds the 16KB/t that seemed to be the limiting factor before.

Ok, thanks.  Does it give similar performance results to using 64k block size?
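
For a rough server-side comparison (a sketch only; it assumes the exported
filesystem lives on da0, so substitute your disk), watching the per-transfer
statistics while the same dd runs against each export should answer that:

  gemini# iostat da0 1

The KB/t column is the number to watch; the patch should push it well above
the old 16KB ceiling, much as the 64k-block filesystem does.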

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-30 Thread Daryl Sayers
 John == John Baldwin j...@freebsd.org writes:

 On Tuesday, November 29, 2011 6:56:27 pm Daryl Sayers wrote:
  John == John Baldwin j...@freebsd.org writes:
 
  On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
   Bengt == Bengt Ahlgren ben...@sics.se writes:
  
   Daryl Sayers da...@ci.com.au writes:
   Can anyone suggest why I am getting poor write performance from my nfs setup?
   I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
   4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
   onboard Gb network cards connected to an idle network. The results below show
   that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
   improves if I use async, but a smbfs mount still beats it. I am using the same
   file, source and destinations for all tests. I have tried alternate network
   cards with no resulting benefit.
  
   [...]
  
   Looking at a systat -v on the destination I see that the nfs test does not
   exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
   For the record I get reads of 22MB/s without and 77MB/s with async turned on
   for the nfs mount.
  
   On a UFS filesystem you get NFS writes with the same size as the
   filesystem blocksize.  So an easy way to improve performance is to
   create a filesystem with larger blocks.  I accidentally found this out
   when I had two NFS exported filesystems from the same box with 16K and
   64K blocksizes respectively.
  
   (Larger blocksize also tremendously improves the performance of UFS
   snapshots!)
  
  Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
  no reportable change in performance. We are using a UFS2 filesystem, so the
  zfs command was not required. I did not try the patch, as we would like to
  stay as standard as possible, but we will upgrade if the patch is released
  in a new kernel.
 
  If you can test the patch then it is something I will likely put into the
  next release.  I have already tested it as far as robustness locally; what
  I don't have are good performance tests.  It would really be helpful if you
  were able to test it.
 
  Thanks Bengt for the suggestion of block size. Increasing the block size to
  64k made a significant improvement to performance.
 
  In theory the patch might have given you similar gains.  During my simple
  tests I was able to raise the average I/O size in iostat from 16k to 70-80k.
 
 OK, I downloaded and installed the patch and did some basic testing, and I can
 report that the patch does improve performance. I can also see that my KB/t
 now exceeds the 16KB/t that seemed to be the limiting factor before.

 Ok, thanks.  Does it give similar performance results to using 64k block size?
From the tests I have done I get similar results to the block size change.


-- 
Daryl Sayers Direct: +612 95525510
Corinthian Engineering   Office: +612 95525500
Suite 54, Jones Bay Wharf   Fax: +612 95525549
26-32 Pirrama Rd  email: da...@ci.com.au
Pyrmont NSW 2009 Australia  www: http://www.ci.com.au


Re: Low nfs write throughput

2011-11-30 Thread Jeremy Chadwick
On Tue, Nov 29, 2011 at 10:36:44AM -0500, John Baldwin wrote:
 On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
   Bengt == Bengt Ahlgren ben...@sics.se writes:
  
   Daryl Sayers da...@ci.com.au writes:
   Can anyone suggest why I am getting poor write performance from my nfs setup?
   I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
   4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
   onboard Gb network cards connected to an idle network. The results below show
   that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
   improves if I use async, but a smbfs mount still beats it. I am using the same
   file, source and destinations for all tests. I have tried alternate network
   cards with no resulting benefit.
  
   [...]
  
   Looking at a systat -v on the destination I see that the nfs test does not
   exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
   For the record I get reads of 22MB/s without and 77MB/s with async turned on
   for the nfs mount.
  
   On a UFS filesystem you get NFS writes with the same size as the
   filesystem blocksize.  So an easy way to improve performance is to
   create a filesystem with larger blocks.  I accidentally found this out
   when I had two NFS exported filesystems from the same box with 16K and
   64K blocksizes respectively.
  
   (Larger blocksize also tremendously improves the performance of UFS
   snapshots!)
  
  Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
  no reportable change in performance. We are using a UFS2 filesystem, so the
  zfs command was not required. I did not try the patch, as we would like to
  stay as standard as possible, but we will upgrade if the patch is released
  in a new kernel.
 
 If you can test the patch then it is something I will likely put into the
 next release.  I have already tested it as far as robustness locally; what
 I don't have are good performance tests.  It would really be helpful if you
 were able to test it.

John,

We'd like to test this patch[1], but need to know if it needs to be
applied to just the system acting as the NFS server, or the NFS clients
as well.

[1]: http://www.freebsd.org/~jhb/patches/nfs_server_cluster.patch

-- 
| Jeremy Chadwickjdc at parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, US |
| Making life hard for others since 1977.   PGP 4BD6C0CB |



Re: Low nfs write throughput

2011-11-29 Thread John Baldwin
On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
  Bengt == Bengt Ahlgren ben...@sics.se writes:
 
  Daryl Sayers da...@ci.com.au writes:
  Can anyone suggest why I am getting poor write performance from my nfs setup?
  I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
  4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
  onboard Gb network cards connected to an idle network. The results below show
  that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
  improves if I use async, but a smbfs mount still beats it. I am using the same
  file, source and destinations for all tests. I have tried alternate network
  cards with no resulting benefit.
 
  [...]
 
  Looking at a systat -v on the destination I see that the nfs test does not
  exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
  For the record I get reads of 22MB/s without and 77MB/s with async turned on
  for the nfs mount.
 
  On a UFS filesystem you get NFS writes with the same size as the
  filesystem blocksize.  So an easy way to improve performance is to
  create a filesystem with larger blocks.  I accidentally found this out
  when I had two NFS exported filesystems from the same box with 16K and
  64K blocksizes respectively.
 
  (Larger blocksize also tremendously improves the performance of UFS
  snapshots!)
 
 Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
 no reportable change in performance. We are using a UFS2 filesystem, so the
 zfs command was not required. I did not try the patch, as we would like to
 stay as standard as possible, but we will upgrade if the patch is released
 in a new kernel.

If you can test the patch then it is something I will likely put into the
next release.  I have already tested it as far as robustness locally; what
I don't have are good performance tests.  It would really be helpful if you
were able to test it.

 Thanks Bengt for the suggestion of block size. Increasing the block size to
 64k made a significant improvement to performance.

In theory the patch might have given you similar gains.  During my simple
tests I was able to raise the average I/O size in iostat from 16k to 70-80k.

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-29 Thread Daryl Sayers
 John == John Baldwin j...@freebsd.org writes:

 On Monday, November 28, 2011 7:12:39 pm Daryl Sayers wrote:
  Bengt == Bengt Ahlgren ben...@sics.se writes:
 
  Daryl Sayers da...@ci.com.au writes:
  Can anyone suggest why I am getting poor write performance from my nfs setup?
  I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
  4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
  onboard Gb network cards connected to an idle network. The results below show
  that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
  improves if I use async, but a smbfs mount still beats it. I am using the same
  file, source and destinations for all tests. I have tried alternate network
  cards with no resulting benefit.
 
  [...]
 
  Looking at a systat -v on the destination I see that the nfs test does not
  exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
  For the record I get reads of 22MB/s without and 77MB/s with async turned on
  for the nfs mount.
 
  On a UFS filesystem you get NFS writes with the same size as the
  filesystem blocksize.  So an easy way to improve performance is to
  create a filesystem with larger blocks.  I accidentally found this out
  when I had two NFS exported filesystems from the same box with 16K and
  64K blocksizes respectively.
 
  (Larger blocksize also tremendously improves the performance of UFS
  snapshots!)
 
 Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
 no reportable change in performance. We are using a UFS2 filesystem, so the
 zfs command was not required. I did not try the patch, as we would like to
 stay as standard as possible, but we will upgrade if the patch is released
 in a new kernel.

 If you can test the patch then it is something I will likely put into the
 next release.  I have already tested it as far as robustness locally; what
 I don't have are good performance tests.  It would really be helpful if you
 were able to test it.

 Thanks Bengt for the suggestion of block size. Increasing the block size to
 64k made a significant improvement to performance.

 In theory the patch might have given you similar gains.  During my simple
 tests I was able to raise the average I/O size in iostat from 16k to 70-80k.

OK, I downloaded and installed the patch and did some basic testing, and I can
report that the patch does improve performance. I can also see that my KB/t
now exceeds the 16KB/t that seemed to be the limiting factor before.

-- 
Daryl Sayers Direct: +612 95525510
Corinthian Engineering   Office: +612 95525500
Suite 54, Jones Bay Wharf   Fax: +612 95525549
26-32 Pirrama Rd  email: da...@ci.com.au
Pyrmont NSW 2009 Australia  www: http://www.ci.com.au


Re: Low nfs write throughput

2011-11-28 Thread Bengt Ahlgren
Daryl Sayers da...@ci.com.au writes:

 Can anyone suggest why I am getting poor write performance from my nfs setup?
 I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
 4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
 onboard Gb network cards connected to an idle network. The results below show
 that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
 improves if I use async, but a smbfs mount still beats it. I am using the same
 file, source and destinations for all tests. I have tried alternate network
 cards with no resulting benefit.

[...]

 Looking at a systat -v on the destination I see that the nfs test does not
 exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
 For the record I get reads of 22MB/s without and 77MB/s with async turned on
 for the nfs mount.

On a UFS filesystem you get NFS writes with the same size as the
filesystem blocksize.  So an easy way to improve performance is to
create a filesystem with larger blocks.  I accidentally found this out
when I had two NFS exported filesystems from the same box with 16K and
64K blocksizes respectively.

(Larger blocksize also tremendously improves the performance of UFS
snapshots!)

Bengt


Re: Low nfs write throughput

2011-11-28 Thread Daryl Sayers
 Bengt == Bengt Ahlgren ben...@sics.se writes:

 Daryl Sayers da...@ci.com.au writes:
 Can anyone suggest why I am getting poor write performance from my nfs setup?
 I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
 4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
 onboard Gb network cards connected to an idle network. The results below show
 that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
 improves if I use async, but a smbfs mount still beats it. I am using the same
 file, source and destinations for all tests. I have tried alternate network
 cards with no resulting benefit.

 [...]

 Looking at a systat -v on the destination I see that the nfs test does not
 exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
 For the record I get reads of 22MB/s without and 77MB/s with async turned on
 for the nfs mount.

 On a UFS filesystem you get NFS writes with the same size as the
 filesystem blocksize.  So an easy way to improve performance is to
 create a filesystem with larger blocks.  I accidentally found this out
 when I had two NFS exported filesystems from the same box with 16K and
 64K blocksizes respectively.

 (Larger blocksize also tremendously improves the performance of UFS
 snapshots!)

Thanks to all that answered. I did try the 'sysctl -w vfs.nfsrv.async=1', with
no reportable change in performance. We are using a UFS2 filesystem, so the
zfs command was not required. I did not try the patch, as we would like to
stay as standard as possible, but we will upgrade if the patch is released
in a new kernel.
Thanks Bengt for the suggestion of block size. Increasing the block size to
64k made a significant improvement to performance.

-- 
Daryl Sayers Direct: +612 95525510
Corinthian Engineering   Office: +612 95525500
Suite 54, Jones Bay Wharf   Fax: +612 95525549
26-32 Pirrama Rd  email: da...@ci.com.au
Pyrmont NSW 2009 Australia  www: http://www.ci.com.au


Re: Low nfs write throughput

2011-11-21 Thread John Baldwin
On Friday, November 18, 2011 7:36:47 pm Xin LI wrote:
 Hi,
 
  I don't know if it will help with your performance, but I have some patches
  to allow the NFS server to cluster writes.  You can try
  www.freebsd.org/~jhb/patches/nfs_server_cluster.patch.  I've tested it on 8,
  but it should probably apply fine to 9.
 
 I think 9 would need some changes; I just made them, with minimal
 compile testing, though.

Oops, 8 has the same problems, and actually it needs more fixes than that, as
the uio isn't initialized at that point.  I've updated the patch at the URL, so
it should now work for the new server.  Sorry. :/

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-18 Thread Bane Ivosev
Did you try this?

sysctl -w vfs.nfsrv.async=1
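
To make the setting persist across reboots it can also go into
/etc/sysctl.conf, e.g.:

  echo vfs.nfsrv.async=1 >> /etc/sysctl.conf

Note the caveat elsewhere in this thread: with async serving, recently
written data can be lost if the server crashes or reboots.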

On 11/18/11 04:10, Daryl Sayers wrote:
 Can anyone suggest why I am getting poor write performance from my nfs setup?
 I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
 4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
 onboard Gb network cards connected to an idle network. The results below show
 that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
 improves if I use async, but a smbfs mount still beats it. I am using the same
 file, source and destinations for all tests. I have tried alternate network
 cards with no resulting benefit.
 
 oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
 1950511+1 records in
 1950511+1 records out
 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
 1950477+74 records in
 1950511+1 records out
 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


 oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


 oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


 oguido# mount -t smbfs //gemini/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)
 
 Looking at a systat -v on the destination I see that the nfs test does not
 exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
 For the record I get reads of 22MB/s without and 77MB/s with async turned on
 for the nfs mount.
 
 
 [quoted dmesg snipped]
 

Re: Low nfs write throughput

2011-11-18 Thread Bane Ivosev
And if you use ZFS, also try this on the exported dataset:

zfs set sync=disabled <pool/dataset>

On 11/18/11 04:10, Daryl Sayers wrote:
 Can anyone suggest why I am getting poor write performance from my nfs setup?
 I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
 4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
 onboard Gb network cards connected to an idle network. The results below show
 that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
 improves if I use async, but a smbfs mount still beats it. I am using the same
 file, source and destinations for all tests. I have tried alternate network
 cards with no resulting benefit.
 
 oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
 1950511+1 records in
 1950511+1 records out
 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
 1950477+74 records in
 1950511+1 records out
 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


 oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


 oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


 oguido# mount -t smbfs //gemini/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)
 
 Looking at a systat -v on the destination I see that the nfs test does not
 exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
 For the record I get reads of 22MB/s without and 77MB/s with async turned on
 for the nfs mount.
 
 
 [quoted dmesg snipped]
 

Re: Low nfs write throughput

2011-11-18 Thread John Baldwin
On Thursday, November 17, 2011 10:10:27 pm Daryl Sayers wrote:
 
 Can anyone suggest why I am getting poor write performance from my nfs setup?
 I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
 4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
 onboard Gb network cards connected to an idle network. The results below show
 that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
 improves if I use async, but a smbfs mount still beats it. I am using the same
 file, source and destinations for all tests. I have tried alternate network
 cards with no resulting benefit.
 
 oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
 1950511+1 records in
 1950511+1 records out
 998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
 1950477+74 records in
 1950511+1 records out
 998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


 oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


 oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


 oguido# mount -t smbfs //gemini/ufs /mnt
 oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
 7619+1 records in
 7619+1 records out
 998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)
 
 Looking at a systat -v on the destination I see that the nfs test does not
 exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
 For the record I get reads of 22MB/s without and 77MB/s with async turned on
 for the nfs mount.

I don't know if it will help with your performance, but I have some patches
to allow the NFS server to cluster writes.  You can try 
www.freebsd.org/~jhb/patches/nfs_server_cluster.patch.  I've tested it on 8, 
but it should probably apply fine to 9.
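
To try it on a stock source tree, the routine is roughly as follows (KERNCONF
and the paths are the usual defaults; adjust for a custom kernel config):

  # cd /usr/src
  # fetch http://www.freebsd.org/~jhb/patches/nfs_server_cluster.patch
  # patch < nfs_server_cluster.patch
  # make buildkernel installkernel KERNCONF=GENERIC
  # shutdown -r now

Only the machine acting as the NFS server needs the patched kernel.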

-- 
John Baldwin


Re: Low nfs write throughput

2011-11-18 Thread Rick Macklem
Bane Ivosev wrote:
 And if you use ZFS, also try this on the exported dataset:
 
 zfs set sync=disabled <pool/dataset>
 
I know diddly about zfs, but I believe some others have improved
zfs performance for NFS writing by moving the ZIL log to a dedicated
device, sometimes an SSD. Apparently (again, I'm not knowledgeable) you
do have to be careful what SSD you use and how full you make it, if you
want good write performance on the SSD.
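
For reference, adding a dedicated log device usually looks something like
this (pool and device names are placeholders for your setup):

  zpool add tank log /dev/ada1

With a separate log in place, the synchronous writes that NFS generates land
on the dedicated device instead of the main pool disks.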

I should also note that use of these options (vfs.nfsrv.async=1 and the
above for zfs) is risky in the sense that recently written data can be
lost when a server crashes/reboots because the NFS clients don't know
to hold onto the data and re-write it after a server crash/reboot.

rick
ps: NFS write performance has been an issue since SUN released their
first implementation of it in 1985. The big server vendors typically
solve the problem with lots of non-volatile RAM in the server boxes.
(This solution requires server code that specifically knows how to
 use this non-volatile RAM. Such code is not in the FreeBSD servers.)

 On 11/18/11 04:10, Daryl Sayers wrote:
  Can anyone suggest why I am getting poor write performance from my nfs setup?
  I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
  4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
  onboard Gb network cards connected to an idle network. The results below show
  that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
  improves if I use async, but a smbfs mount still beats it. I am using the same
  file, source and destinations for all tests. I have tried alternate network
  cards with no resulting benefit.
 
  oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
  1950511+1 records in
  1950511+1 records out
  998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
  1950477+74 records in
  1950511+1 records out
  998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


  oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
  oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
  7619+1 records in
  7619+1 records out
  998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


  oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
  oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
  7619+1 records in
  7619+1 records out
  998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


  oguido# mount -t smbfs //gemini/ufs /mnt
  oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
  7619+1 records in
  7619+1 records out
  998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)
 
  Looking at a systat -v on the destination I see that the nfs test does not
  exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
  For the record I get reads of 22MB/s without and 77MB/s with async turned on
  for the nfs mount.
 
 
  [quoted dmesg snipped]
  

Re: Low nfs write throughput

2011-11-18 Thread Xin LI
Hi,

 I don't know if it will help with your performance, but I have some patches
 to allow the NFS server to cluster writes.  You can try
 www.freebsd.org/~jhb/patches/nfs_server_cluster.patch.  I've tested it on 8,
 but it should probably apply fine to 9.

I think 9 would need some changes; I just made them, with minimal
compile testing, though.

Cheers,
-- 
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die
Index: sys/fs/nfsserver/nfs_nfsdport.c
===
--- sys/fs/nfsserver/nfs_nfsdport.c	(revision 227689)
+++ sys/fs/nfsserver/nfs_nfsdport.c	(working copy)
@@ -90,20 +90,78 @@ SYSCTL_INT(_vfs_nfsd, OID_AUTO, issue_delegations,
 SYSCTL_INT(_vfs_nfsd, OID_AUTO, enable_locallocks, CTLFLAG_RW,
     &nfsrv_dolocallocks, 0, "Enable nfsd to acquire local locks on files");
 
-#define	NUM_HEURISTIC		1017
+#define	MAX_REORDERED_RPC	16
+#define	NUM_HEURISTIC		1031
 #define	NHUSE_INIT		64
 #define	NHUSE_INC		16
 #define	NHUSE_MAX		2048
 
 static struct nfsheur {
 	struct vnode *nh_vp;	/* vp to match (unreferenced pointer) */
-	off_t nh_nextr;		/* next offset for sequential detection */
+	off_t nh_nextoff;	/* next offset for sequential detection */
 	int nh_use;		/* use count for selection */
 	int nh_seqcount;	/* heuristic */
 } nfsheur[NUM_HEURISTIC];
 
 
 /*
+ * Heuristic to detect sequential operation.
+ */
+static struct nfsheur *
+nfsrv_sequential_heuristic(struct uio *uio, struct vnode *vp)
+{
+	struct nfsheur *nh;
+	int hi, try;
+
+	/* Locate best candidate. */
+	try = 32;
+	hi = ((int)(vm_offset_t)vp / sizeof(struct vnode)) % NUM_HEURISTIC;
+	nh = &nfsheur[hi];
+	while (try--) {
+		if (nfsheur[hi].nh_vp == vp) {
+			nh = &nfsheur[hi];
+			break;
+		}
+		if (nfsheur[hi].nh_use > 0)
+			--nfsheur[hi].nh_use;
+		hi = (hi + 1) % NUM_HEURISTIC;
+		if (nfsheur[hi].nh_use < nh->nh_use)
+			nh = &nfsheur[hi];
+	}
+
+	/* Initialize hint if this is a new file. */
+	if (nh->nh_vp != vp) {
+		nh->nh_vp = vp;
+		nh->nh_nextoff = uio->uio_offset;
+		nh->nh_use = NHUSE_INIT;
+		if (uio->uio_offset == 0)
+			nh->nh_seqcount = 4;
+		else
+			nh->nh_seqcount = 1;
+	}
+
+	/* Calculate heuristic. */
+	if ((uio->uio_offset == 0 && nh->nh_seqcount > 0) ||
+	    uio->uio_offset == nh->nh_nextoff) {
+		/* See comments in vfs_vnops.c:sequential_heuristic(). */
+		nh->nh_seqcount += howmany(uio->uio_resid, 16384);
+		if (nh->nh_seqcount > IO_SEQMAX)
+			nh->nh_seqcount = IO_SEQMAX;
+	} else if (qabs(uio->uio_offset - nh->nh_nextoff) <= MAX_REORDERED_RPC *
+	    imax(vp->v_mount->mnt_stat.f_iosize, uio->uio_resid)) {
+		/* Probably a reordered RPC, leave seqcount alone. */
+	} else if (nh->nh_seqcount > 1) {
+		nh->nh_seqcount /= 2;
+	} else {
+		nh->nh_seqcount = 0;
+	}
+	nh->nh_use += NHUSE_INC;
+	if (nh->nh_use > NHUSE_MAX)
+		nh->nh_use = NHUSE_MAX;
+	return (nh);
+}
+
+/*
  * Get attributes into nfsvattr structure.
  */
 int
@@ -567,58 +625,12 @@ nfsvno_read(struct vnode *vp, off_t off, int cnt,
 	int i;
 	struct iovec *iv;
 	struct iovec *iv2;
-	int error = 0, len, left, siz, tlen, ioflag = 0, hi, try = 32;
+	int error = 0, len, left, siz, tlen, ioflag = 0;
 	struct mbuf *m2 = NULL, *m3;
 	struct uio io, *uiop = &io;
 	struct nfsheur *nh;
 
-	/*
-	 * Calculate seqcount for heuristic
-	 */
-	/*
-	 * Locate best candidate
-	 */
-
-	hi = ((int)(vm_offset_t)vp / sizeof(struct vnode)) % NUM_HEURISTIC;
-	nh = &nfsheur[hi];
-
-	while (try--) {
-		if (nfsheur[hi].nh_vp == vp) {
-			nh = &nfsheur[hi];
-			break;
-		}
-		if (nfsheur[hi].nh_use > 0)
-			--nfsheur[hi].nh_use;
-		hi = (hi + 1) % NUM_HEURISTIC;
-		if (nfsheur[hi].nh_use < nh->nh_use)
-			nh = &nfsheur[hi];
-	}
-
-	if (nh->nh_vp != vp) {
-		nh->nh_vp = vp;
-		nh->nh_nextr = off;
-		nh->nh_use = NHUSE_INIT;
-		if (off == 0)
-			nh->nh_seqcount = 4;
-		else
-			nh->nh_seqcount = 1;
-	}
-
-	/*
-	 * Calculate heuristic
-	 */
-
-	if ((off == 0 && nh->nh_seqcount > 0) || off == nh->nh_nextr) {
-		if (++nh->nh_seqcount > IO_SEQMAX)
-			nh->nh_seqcount = IO_SEQMAX;
-	} else if (nh->nh_seqcount > 1) {
-		nh->nh_seqcount = 1;
-	} else {
-		nh->nh_seqcount = 0;
-	}
-	nh->nh_use += NHUSE_INC;
-	if (nh->nh_use > NHUSE_MAX)
-		nh->nh_use = NHUSE_MAX;
+	nh = nfsrv_sequential_heuristic(uiop, vp);
 	ioflag |= nh->nh_seqcount << IO_SEQSHIFT;
 
 	len = left = NFSM_RNDUP(cnt);
@@ -672,6 +684,7 @@ nfsvno_read(struct vnode *vp, off_t off, int cnt,
 		*mpp = NULL;
 		goto out;
 	}
+	nh->nh_nextoff = uiop->uio_offset;
 	tlen = len - uiop->uio_resid;
 	cnt = cnt < tlen ? cnt : tlen;
 	tlen = NFSM_RNDUP(cnt);
@@ -700,6 +713,7 @@ nfsvno_write(struct vnode *vp, off_t off, int retl
 	struct iovec *iv;
 	int ioflags, error;
 	struct uio io, *uiop = &io;
+	struct nfsheur *nh;
 
 	MALLOC(ivp, struct iovec *, cnt * sizeof (struct iovec), M_TEMP,
 	    M_WAITOK);
@@ -733,7 +747,11 @@ nfsvno_write(struct vnode *vp, off_t off, int retl
 	uiop->uio_segflg = UIO_SYSSPACE;
 	NFSUIOPROC(uiop, p);
 	uiop->uio_offset = off;
+	nh = nfsrv_sequential_heuristic(uiop, vp);
+	ioflags |= nh->nh_seqcount << IO_SEQSHIFT;

Low nfs write throughput

2011-11-17 Thread Daryl Sayers

Can anyone suggest why I am getting poor write performance from my nfs setup?
I have 2 x FreeBSD 8.2-STABLE i386 machines with ASUS P5B-plus motherboards,
4G mem and a dual-core 3GHz processor, using 147G 15k Seagate SAS drives with
onboard Gb network cards connected to an idle network. The results below show
that I get nearly 100MB/s with a dd over rsh but only 15MB/s using nfs. It
improves if I use async, but a smbfs mount still beats it. I am using the same
file, source and destinations for all tests. I have tried alternate network
cards with no resulting benefit.

oguido# dd if=/u0/tmp/D2 | rsh castor dd of=/dsk/ufs/D2
1950511+1 records in
1950511+1 records out
998661755 bytes transferred in 10.402483 secs (96002246 bytes/sec)
1950477+74 records in
1950511+1 records out
998661755 bytes transferred in 10.115458 secs (98726301 bytes/sec) (98MB/s)


oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp gemini:/dsk/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 62.570260 secs (15960646 bytes/sec) (15MB/s)


oguido# mount -t nfs -o wsize=65536,rsize=65536,tcp,async gemini:/dsk/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 50.697024 secs (19698627 bytes/sec) (19MB/s)


oguido# mount -t smbfs //gemini/ufs /mnt
oguido# dd if=/u0/tmp/D2 of=/mnt/tmp/D2 bs=128k
7619+1 records in
7619+1 records out
998661755 bytes transferred in 29.787616 secs (33526072 bytes/sec) (33MB/s)

Looking at a systat -v on the destination I see that the nfs test does not
exceed 16KB/t at 100% busy, where the other tests reach up to 128KB/t.
For the record I get reads of 22MB/s without and 77MB/s with async turned on
for the nfs mount.


A copy of dmesg:


Copyright (c) 1992-2011 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.2-STABLE #0: Tue Jul 26 02:49:49 UTC 2011
root@fm32-8-1106:/usr/obj/usr/src/sys/LOCAL i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Core(TM)2 Duo CPU E6850  @ 3.00GHz (2995.21-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x2010<NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 4294967296 (4096 MB)
avail memory = 3141234688 (2995 MB)
ACPI APIC Table: <MSTEST OEMAPIC>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 <Version 2.0> irqs 0-23 on motherboard
kbd1 at kbdmux0
cryptosoft0: <software crypto> on motherboard
acpi0: <MSTEST TESTONLY> on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a (3) failed
acpi0: reservation of 10, bff0 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
cpu0: <ACPI CPU> on acpi0
ACPI Warning: Incorrect checksum in table [OEMB] - 0xBE, should be 0xB1 (20101013/tbutils-354)
cpu1: <ACPI CPU> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci0
pci1: <ACPI PCI bus> on pcib1
mpt0: <LSILogic SAS/SATA Adapter> port 0x7800-0x78ff mem 0xfd4fc000-0xfd4f,0xfd4e-0xfd4e irq 16 at device 0.0 on pci1
mpt0: [ITHREAD]
mpt0: MPI Version=1.5.18.0
mpt0: Capabilities: ( RAID-0 RAID-1E RAID-1 )
mpt0: 0 Active Volumes (2 Max)
mpt0: 0 Hidden Drive Members (14 Max)
uhci0: <Intel 82801H (ICH8) USB controller USB-D> port 0xdc00-0xdc1f irq 16 at device 26.0 on pci0
uhci0: [ITHREAD]
uhci0: LegSup = 0x2f00
usbus0: <Intel 82801H (ICH8) USB controller USB-D> on uhci0
uhci1: <Intel 82801H (ICH8) USB controller USB-E> port 0xe000-0xe01f irq 17 at device 26.1 on pci0
uhci1: [ITHREAD]
uhci1: LegSup = 0x2f00
usbus1: <Intel 82801H (ICH8) USB controller USB-E> on uhci1
ehci0: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> mem 0xfebffc00-0xfebf irq 18 at device 26.7 on pci0
ehci0: [ITHREAD]
usbus2: EHCI version 1.0
usbus2: <Intel 82801H (ICH8) USB 2.0 controller USB2-B> on ehci0
pci0: <multimedia, HDA> at device 27.0 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 28.0 on pci0
pci5: <ACPI PCI bus> on pcib2
atapci0: <SiI 3132 SATA300 controller> port 0xac00-0xac7f mem 0xfd9ffc00-0xfd9ffc7f,0xfd9f8000-0xfd9fbfff irq 16 at device 0.0 on pci5
atapci0: [ITHREAD]
ata2: <ATA channel 0> on atapci0
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci0
ata3: [ITHREAD]
pcib3: <ACPI PCI-PCI bridge> irq 17 at device 28.1 on pci0