Hi Bruce

Bruce Allen wrote:
> I'd like to know the fastest that anyone has seen an NFS server run, over either a 10Gb/s ethernet link or a handful of link aggregated (channel-bonded) Gb/s ethernet lines.

If you allow us to go into the world of NFS-like things, the Panasas file system and server hit about 2 GB/s in some testing we did more than a year ago.

We ran the same problem with NFS on the same hardware (different code paths/file system name space) and it was suffering along at about 300 MB/s.


> This would be with a small number of clients making large file sequential reads from the same NFS host/server. Please assume that the NFS server has 'infinitely fast' disks.

This was ~32 compute nodes talking over a gigabit switch of some sort (Nortel I think).

> I am told by one vendor that "NFS can't run faster than 100MB/sec". I

Hmmmm....

Maybe theirs can't ...

...  or they are trying to sell you something ... :)

> don't understand or believe this. If the server's local disks can read/write at 300MB/s and the networking can run substantially faster than 100 MB/s, I don't see any constraint to faster operation. But perhaps someone on this list can provide real-world data (or say why it can't work).

.... ok, there are a number of different issues going on here:

a) The 300 MB/s (SATA II, right?) is the max theoretical interface speed. You are going to get close to this only in pure buffer-to-memory transactions in specialized cases. Normally you will see 50-70 MB/s from these disks for large-block sequential reads. SATA also does a fair bit of interrupting... you need a *good* SATA controller, or you will see your interrupt rate go up 10x under heavy disk load. Software RAID will add to this a bit as well.
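If you want to sanity-check that on your own box, here is a minimal sketch of a large-block sequential read test (assuming Python is installed and you point it at some multi-GB scratch file, so the page cache doesn't flatter the numbers):

#!/usr/bin/env python
# Rough large-block sequential read test.  Use a file bigger than RAM
# (or drop caches first) so you measure the disk, not the page cache.
import sys, time

path = sys.argv[1]            # e.g. a multi-GB scratch file of your choosing
chunk = 4 * 1024 * 1024       # 4 MB reads, large-block sequential
total = 0
start = time.time()
f = open(path, 'rb')
while True:
    buf = f.read(chunk)
    if not buf:
        break
    total += len(buf)
f.close()
elapsed = time.time() - start
print("read %d MB at %.1f MB/s" % (total / 1e6, total / elapsed / 1e6))

On a single SATA drive you will typically land in that 50-70 MB/s range, nowhere near the interface rate.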

b) If this is gigabit, you get about 110 MB/s max in best-case scenarios, with the wind at your packets' backs, a nice gravitational potential helping them along, and a good switch to direct packets by. If this is IB, you should be able to see quite a bit more, though your PCI bus is going to limit you. PCIe is better (and HTX is *awesome*).
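To put rough numbers behind that ~110 MB/s (back-of-envelope arithmetic with standard header sizes, not a measurement):

# TCP payload ceiling over gigabit Ethernet, standard vs jumbo frames.
# Per-frame wire overhead: preamble 8 + Ethernet header 14 + FCS 4 + IFG 12 = 38 bytes
# Per-frame protocol overhead: IP 20 + TCP 20 = 40 bytes (no options)
link = 125.0e6                     # 1 Gb/s = 125 MB/s raw on the wire

for mtu in (1500, 9000):           # standard vs jumbo frames
    payload = mtu - 40             # TCP payload bytes per frame
    wire = mtu + 38                # bytes actually on the wire per frame
    print("MTU %4d: ~%.0f MB/s TCP payload ceiling" % (mtu, link * payload / wire / 1e6))

That gives roughly 118 MB/s at MTU 1500 and 124 MB/s with jumbo frames as the theoretical ceiling; ~110 MB/s is about what survives real stacks, real NICs, and real switches.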

> Note: I am free to use modern versions of the NFS protocol, jumbo frames, large rsize/wsize, etc.

We had some issues about a year ago (not revisited recently) with RHEL3, jumbo frames, and Broadcom gigabit adapters (tg3 was flaky; bcm5700 was much more stable and faster). We reported it to RH, whose response at the time was basically "go away". It wasn't an issue on the same hardware with other distros.

With NFS, you are moving through a protocol stack (NFS) as well as a transport stack (TCP/IP). This is not cheap. However, there can be a number of reasons why NFS appears slow for you or your vendor.

FWIW, we have customers with units we have built that happily sustain 200-400 MB/s aggregate over NFS, over gigabit, with multiple simultaneous clients hammering on the server. There are multiple problems to overcome to get this working correctly and efficiently.

<speculation>

From what I can see on a 4-way system, I think it could support at maximum about 2 GB/s of disk I/O (DMA access to RAM) per CPU connected to an I/O channel (most 4-ways have a single CPU connected to their I/O channels). The protocol is not cheap, and the processing overhead could easily pare this down to 600-900 MB/s over a fast enough network fabric.
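One way to get numbers in that range (this is my rough accounting, not a measurement): every byte served over NFS tends to cross memory more than once -- disk DMA into the page cache, a copy into the NFS/TCP buffers, DMA back out to the NIC -- so the usable serving rate is roughly the raw I/O rate divided by the number of memory passes:

# Crude estimate of effective NFS serving rate from raw DMA capability,
# assuming each served byte crosses memory 2-3 times (disk -> page cache,
# copy into NFS/TCP buffers, buffer -> NIC).  Illustrative numbers only.
raw_io = 2000.0                    # MB/s of DMA to RAM on the I/O-connected CPU

for passes in (2, 3):
    print("%d memory passes: ~%.0f MB/s effective" % (passes, raw_io / passes))

That lands at roughly 670-1000 MB/s, the same ballpark as the 600-900 MB/s above once you also pay for the NFS/TCP protocol processing itself.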

With some tweaking and tuning, you might be able to get this going a little faster. You would need to speak to the IB folks or the 10 GbE folks to see what they are really seeing. 1 GB/s per adapter (10 GbE) is doable over PCIe/HTX (if there were HTX cards for it). If they have RDMA and TCP offload capability, you will likely see a further win in performance.
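For reference, the host-bus ceilings I have in mind here (standard theoretical peaks, before any protocol overhead; 10 GbE itself is 1250 MB/s raw):

# Theoretical peak bandwidth of buses a 10 GbE adapter might sit behind.
# Anything below ~1250 MB/s becomes the bottleneck for a single port.
buses = [
    ("PCI 64-bit/66 MHz",    533),   # shared bus, well short of 10 GbE
    ("PCI-X 64-bit/133 MHz", 1066),  # marginal for a single 10 GbE port
    ("PCIe x8 (Gen 1)",      2000),  # per direction; real headroom
]
for name, peak in buses:
    print("%-22s ~%4d MB/s peak" % (name, peak))

Which is why PCIe (or HTX, if the cards existed) is the interesting path for a full-rate 10 GbE NFS server.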

</speculation>



--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615
