Milo,
I would second Kyle's suggestion about trying to get performance to
an expected level for 1 client and 1 server.
Does your RAID hardware tell you what I/O request size is best for
it? If so, set FlowBufferSizeBytes to that value, and also set
max_sectors_kb under /sys/block/sdX/queue (where X is the letter(s)
of your device); that directory has several parameters you can tweak
to tune the block device.
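For example, if the controller prefers 1 MB requests (the device
name and the value here are only placeholders for whatever your
hardware reports):

cat /sys/block/sdb/queue/max_hw_sectors_kb   # ceiling the device allows
echo 1024 > /sys/block/sdb/queue/max_sectors_kb

and then the matching setting in your pvfs2 server config:

FlowBufferSizeBytes 1048576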
Mounting XFS with noatime and nobarrier gives us the best
performance on our SAN.
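For instance, an /etc/fstab entry along these lines (device and
mount point are placeholders for your storage space):

/dev/sdb1  /var/pvfs2-storage  xfs  noatime,nobarrier  0 0

or on a running system:

mount -o remount,noatime,nobarrier /var/pvfs2-storage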
You could also try the directio Trove back-end, but if your RAID
hardware already has a high IOP count it may not help much.
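If you do want to test it, it should just be a one-line change in
the server config, something like:

TroveMethod directio

in place of alt-aio, followed by a restart of the pvfs2-server
daemons.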
You might try these settings for your networking, in /etc/sysctl.conf:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.optmem_max = 524287
net.core.netdev_max_backlog = 300000
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_sack = 1
net.ipv4.tcp_low_latency = 0
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.route.flush = 1
net.ipv4.tcp_rfc1337=1
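Once those are in /etc/sysctl.conf you can load them without a
reboot:

sysctl -p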
You definitely want jumbo frames if all of your hardware supports them.
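For example (assuming eth0 is your data interface and every NIC and
switch port in the path can handle 9000-byte frames):

ifconfig eth0 mtu 9000

and then make it persistent in your distro's network config so it
survives a reboot.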
I've found from experience that tuning is very hard. Lots of knobs
to turn...
kevin
On Jul 23, 2009, at 12:59 PM, Milo wrote:
Hi, guys. I'm getting surprisingly poor performance on my PVFS2
cluster. Here's the setup:
*) 13 nodes running PVFS2 2.8.1 with the Linux 2.6.28-13-server
kernel, each with a 15-drive RAID-5 array.
*) The RAID-5 array gets 60 MB/s local write speeds with XFS
according to iozone (writing in 4 MB records).
I'd like to get at least 50 MB/s/server from the cluster, and I've
been testing this with a single PVFS2 server and client, with the
client running either on the same node or on a node on the same
switch (it doesn't seem to make a lot of difference). The server is
configured with Trove syncing off, a 4 MB strip size simple_strip
distribution, and a 1 MB FlowBufferSizeBytes. Results have been as
follows:
With TroveMethod alt-aio or the default, I'm getting around 15 MB/s
when transferring a 3 GB file through pvfs2-cp:
r...@ss239:~# pvfs2-cp -t ./file.3g /mnt/pvfs2/out
Wrote 2867200000 bytes in 192.811478 seconds. 14.181599 MB/seconds
dd'ing a similar file through pvfs2fuse gets about a third of that
performance, 5 MB/s:
r...@ss239:~# dd if=/dev/zero of=/mnt/pvfs2fuse/out bs=1024K
count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 206.964 s, 5.2 MB/s
I get similar results using iozone writing through the fuse client.
If I switch the method to null-aio, things speed up a lot, but it's
still suspiciously slow:
r...@ss239:~# pvfs2-cp -t ./file.out /mnt/pvfs2fuse/out7-nullaio
Wrote 2867200000 bytes in 60.815127 seconds. 44.962086 MB/seconds
r...@ss239:~# dd if=/dev/zero of=/mnt/pvfs2fuse/out-nullaio
bs=1024K count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 21.0201 s, 51.1 MB/s
I suspect there's some network bottleneck. I'm going to try to
adjust the MTU as Jim just did. But are there any other
configuration options I should look into?
Thanks.
~Milo
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users