I will throw this document into the ring as well. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.641.1965&rep=rep1&type=pdf#page=169
Even though it's a bit dated, it hits on a lot of the elements that go into getting the most performance out of your 10G network adapter.

As previously noted, the ability of your NFS server to deliver data anywhere close to filling a 10G pipe will depend heavily on its disk configuration and I/O access patterns. That said, you mentioned the performance is worse at 10G than over the 1G connection. Have you looked at your CPU load while your migration/reclamation processes are running? You stated you can get about 7 Gb/s throughput using iperf3 (what options? what target?) but only 900 Mb/s running migration/reclamation. Iperf's sole job is to hammer packets across the network, so it behaves differently on the local machine with respect to buffering, context switching, and offload processing than your production workload does. If the test tools show your network can move data at the desired speed, the bottleneck would seem to be in the local stack.

Based on the couple of observations you've provided, I would look into whether the buffering is sufficient across the board: TCP window, send/receive buffers, tx/rx queue lengths, etc. Remember, packets are theoretically arriving ten times faster than before; if your CPUs are busy doing TSM things, their caches aren't primed for packet processing. If a packet gets dropped anywhere in the pipeline due to overrun, that flow (TCP/NFS session) comes to a halt until the missing packet is resent, and that will really kill throughput. (That's why those offload settings can help: some workloads seem to benefit from them, others not so much. Testing with your production workload is the key.)

It's also helpful to look at a packet trace. Observe the TCP window sizes: if they are hitting zero, or if you're seeing retransmissions, that can clue you in to what's going on. Remember, too, that the problem could be on the remote end as well.

Good luck and let us know how you make out!

-Ken

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Lee, Gary
Sent: Wednesday, February 03, 2016 8:15 AM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: getting performance from nfs storage over 10 gb link

To Mike: Thanks for the link, hadn't seen that. Also, the NFS server is a dedicated SAN head end, so I have little to no control over its NFS parameters.

To Skylar: I will check into those options.

To all: I can say that TSM performance was better when the same storage was mounted with the same mount options using a 1 Gb adapter. Strange, I know, but that's the riddle I'm dealing with.

Thanks for everything so far. Will keep you posted on results.

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Skylar Thompson
Sent: Tuesday, February 02, 2016 4:43 PM
To: ADSM-L@VM.MARIST.EDU
Subject: Re: [ADSM-L] getting performance from nfs storage over 10 gb link

Hi Gary,

We don't use NFS for our TSM servers, but we have been struggling with NFS over 10GbE in other areas. While not a universal solution, we've gotten significant performance improvements by disabling the following NIC offload options:

gro
lro
rxvlan
txvlan
rxhash

For instance, you can disable gro with

    ethtool --offload eth0 gro off

(assuming eth0 is your NIC)

There's a bunch more we haven't had a chance to play with, but hopefully that's a starting point.
On Tue, Feb 02, 2016 at 05:54:46PM +0000, Lee, Gary wrote:
> TSM server 6.4.3
> RHEL 6.7
> 4 dual-core AMD Opteron(tm) Processor 8220 SE CPUs
> 128 GB memory
>
> I recently installed an Intel 10 Gb Ethernet card.
> The iperf3 test shown below gives around 7 Gb/s throughput.
> However, when running multiple migration and reclamation processes, and watching network traffic through an independent tool, I cannot get over the 900 Mb/s threshold.
>
> The TSM storage pools are on a file system mounted as follows:
>
> Nxback-Pool02.servers.bsu.edu:/volumes/Pool02/TSMBackup/tsm02 /tsminst1/storage nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys 0 0
>
> I am running out of options.
>
> I don't expect to see the full throughput, as disk speeds will have a good deal of impact.
>
> Any ideas would be helpful.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine
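[Editor's note] To make Ken's buffering and packet-trace suggestions concrete, a starting point on the TSM server might look like the following sketch. The interface name (eth0), capture filter, and file names are assumptions; adjust them for your environment.

    # Show the TCP autotuning ranges and global socket buffer ceilings
    sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem net.core.rmem_max net.core.wmem_max

    # Show the NIC ring buffer sizes; raise RX toward the pre-set maximum if it is small
    ethtool -g eth0

    # The transmit queue length appears as "qlen" in the link details
    ip link show eth0

    # Capture packet headers only, during a migration run (port 2049 = NFS)
    tcpdump -i eth0 -s 128 -w nfs-trace.pcap port 2049

    # Flag zero-window and retransmission events in the capture
    tshark -r nfs-trace.pcap -Y "tcp.analysis.zero_window || tcp.analysis.retransmission"

If the trace shows the receive window collapsing to zero on the TSM side, raising net.core.rmem_max and the tcp_rmem maximum (e.g. with sysctl -w) would be a reasonable next experiment.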