Jalal - If you want to change the block size of the data actually stored on disk, I believe you'd have to tweak the underlying filesystem that the bstreams are stored on. A standard XFS installation is very unlikely to be the bottleneck in a pvfs2 filesystem, though, and to be honest there are a lot more options to tune in pvfs itself that will probably make a larger difference to performance.

What I generally do instead is tweak the FlowBufferSize parameter in the filesystem config file. It defaults to 64KB, which is the maximum size of any single transfer (packet) over the wire in PVFS. You'll want to tune this for your workload and network: setting it too high can hurt smaller accesses, while setting it too low (imo: the default) tends to hurt large transfers. In my experience, 1MB is usually large enough that the physical network becomes the bottleneck.
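As a rough sketch, the change is a one-line edit in the server config file. The exact spelling of the option differs between releases (FlowBufferSizeBytes in the versions I've looked at), so check the config that pvfs2-genconfig generated for you rather than copying this verbatim:

    <Defaults>
        # per-flow buffer size: raise from the 64KB default to 1MB
        FlowBufferSizeBytes 1048576
    </Defaults>

You'll most likely need to restart the pvfs2-server daemons for the new size to take effect.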
Hope this helps,
~Kyle

Kyle Schochenmaier

On Wed, Jul 22, 2009 at 1:44 PM, Jalal <[email protected]> wrote:
> I played around with dd by varying the bs and I was able to get close to
> 60MB/s, which is pretty good. There are some tweaks I can make to the disks
> and the network that will improve it a bit further, so I am happy with the
> numbers that I got:
>
> dd if=/dev/zero of=file.out bs=4000K count=2800
> 2800+0 records in
> 2800+0 records out
> 11468800000 bytes (11 GB) copied, 204.367 seconds, 56.1 MB/s
>
> Which brings up a (possibly) silly question: how do you change the block
> size in PVFS2? Is it just a matter of formatting the filesystem that the
> StorageSpace is on, or is there a different/better way?
>
> PS. Thanks for answering my question about EAGAIN, I will just ignore those
> errors from now on.
>
>
> On Tue, Jul 21, 2009 at 5:13 PM, Phil Carns <[email protected]> wrote:
>>
>> I would think with gigE that you should be able to get in the neighborhood
>> of 80 MB/s (and with multiple processes maybe into the 90s).
>>
>> That dd test may be too short to show your real bandwidth, though (that
>> example only took 2 seconds). I would suggest increasing the count and/or
>> the block size until you have a test that runs for about a minute, to get
>> a more stable number.
>>
>> As a side note, the EAGAINs that you see when stracing pvfs2-client are
>> perfectly normal. It uses nonblocking sockets for all of its
>> communication, in which case that error code is a normal occurrence.
>>
>> If something like "cp" is running poorly it might be helpful to strace the
>> cp tool itself and see how big its access sizes are. I'm not familiar with
>> SUSE10, but some past versions of coreutils fail to honor the block size
>> reported by PVFS and end up using really small accesses by PVFS standards.
>>
>> -Phil
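A rough sketch of what that check might look like (the file names here are only placeholders - point it at whatever test file you have on the mount):

    strace -f -e trace=read,write -o /tmp/cp.strace cp /tmp/file.out /mnt/pvfs2/file.copy
    grep "write(" /tmp/cp.strace | head

The byte count in each read()/write() call is the buffer size cp chose; if it is only a few KB rather than the block size the PVFS mount advertises, you're hitting the small-access problem Phil describes.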
>> Jalal wrote:
>>>
>>> Hi Kevin,
>>>
>>> here is the output of dd:
>>>
>>> ****PVFS******
>>> # dd if=/dev/zero of=file.out bs=1048576 count=100
>>> 100+0 records in
>>> 100+0 records out
>>> 104857600 bytes (105 MB) copied, 2.15666 seconds, 48.6 MB/s
>>>
>>> *****local disk*****
>>> dd if=/dev/zero of=file.out bs=1048576 count=100
>>> 100+0 records in
>>> 100+0 records out
>>> 104857600 bytes (105 MB) copied, 0.266154 seconds, 394 MB/s
>>>
>>> Does that look reasonable for my setup, considering that I only have a
>>> 1GbE network on all nodes and am using 16 PVFS2 servers?
>>>
>>> On Thu, Jul 16, 2009 at 6:53 PM, Kevin Harms <[email protected]> wrote:
>>>
>>> have you tried using dd? what about: dd if=/dev/zero
>>> of=/mnt/pvfs2/file.out bs=1048576 count=100
>>>
>>> kevin
>>>
>>> On Jul 16, 2009, at 7:07 PM, Jalal wrote:
>>>
>>> hello there,
>>>
>>> I have been trying to set up pvfs2 on a small cluster (16 servers and
>>> 16 clients) running SUSE10SP2-64bit, and I am running into some major
>>> performance problems that are causing me to doubt my install. I am
>>> hoping to get some help from this great users group.
>>>
>>> The server side of things seems to be working great. I have 14 I/O
>>> servers and 2 metadata servers, and I don't see any errors at all. I
>>> can run the pvfs2 native tools (e.g. pvfs2-cp) and I am seeing some
>>> fantastic results (500+ Mb/s). The pvfs2-fs.conf is bone stock, as
>>> generated by pvfs2-genconfig.
>>>
>>> When I use the native Linux FS commands (e.g. cp, rsync, ...) I am
>>> seeing some dismal results that are 10-15 times slower than the pvfs2
>>> tools. The kernel driver build goes very smoothly and I am not seeing
>>> any errors. Here are the steps that I am taking:
>>>
>>> cd /tmp
>>> tar zxvf pvfs-2.8.1.tar.gz
>>> cd pvfs-2.8.1/
>>> ./configure --prefix=/opt/pvfs2 --with-kernel=/tmp/linux --disable-server --disable-karma
>>> make kmod
>>> make kmod_install
>>> depmod -a
>>> modprobe pvfs2
>>> /opt/pvfs2/sbin/pvfs2-client -p /opt/pvfs2/sbin/pvfs2-client-core
>>> mount -t pvfs2 tcp://lab1:3334/pvfs2-fs /mnt/pvfs2
>>>
>>> I did an strace on the pvfs2-client process and I am seeing lots and
>>> lots of retries:
>>>
>>> readv(26, [{"p\27\0\0\2\0\0\0\4\0\0\0\0\0\0\0C\362\0\0d\0\0\0\244\1"..., 128}], 1) = 128
>>> read(5, 0x7d4450, 8544) = -1 EAGAIN (Resource temporarily unavailable)
>>> getrusage(RUSAGE_SELF, {ru_utime={2, 960185}, ru_stime={7, 904494}, ...}) = 0
>>> writev(5, [{"AQ\0\0", 4}, {")\5\3 ", 4}, {"5\316\0\0\0\0\0\0", 8}, {"\4\0\0\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0C\362"..., 8224}], 4) = 8240
>>> poll([{fd=5, events=POLLIN, revents=POLLIN}], 1, 10) = 1
>>> read(5, "AQ\0\0)\5\3 6\316\0\0\0\0\0\0\5\0\0\377\0\0\0\0C\362\0"..., 8544) = 8544
>>> read(5, 0x7d1020, 8544) = -1 EAGAIN (Resource temporarily unavailable)
>>> getrusage(RUSAGE_SELF, {ru_utime={2, 960185}, ru_stime={7, 904494}, ...}) = 0
>>> epoll_ctl(6, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6084112, u64=6084112}}) = -1 EEXIST (File exists)
>>> epoll_wait(6, {}, 16, 0) = 0
>>> read(5, 0x7d2020, 8544) = -1 EAGAIN (Resource temporarily unavailable)
>>> writev(26, [{"\277\312\0\0\2\0\0\0\246\267\0\0\0\0\0\0L\0\0\0\0\0\0\0"..., 24}, {"p\27\0\0\2\0\0\0\10\0\0\0\0\0\0\0C\362\0\0d\0\0\0\1\0\0"..., 76}], 2) = 100
>>> epoll_wait(6, {{EPOLLIN, {u32=6084112, u64=6084112}}}, 16, 10) = 1
>>> fcntl(26, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
>>> recvfrom(26, "\277\312\0\0\4\0\0\0\246\267\0\0\0\0\0\0\30\0\0\0\0\0\0"..., 24, MSG_PEEK|MSG_NOSIGNAL, NULL, NULL) = 24
>>> fcntl(26, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
>>> recvfrom(26, "\277\312\0\0\4\0\0\0\246\267\0\0\0\0\0\0\30\0\0\0\0\0\0"..., 24, MSG_NOSIGNAL, NULL, NULL) = 24
>>> readv(26, [{"p\27\0\0\2\0\0\0\10\0\0\0\0\0\0\0\331\323\17\0\0\0\0\0"..., 24}], 1) = 24
>>> read(5, 0x7d2020, 8544) = -1 EAGAIN (Resource temporarily unavailable)
>>> epoll_ctl(6, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLERR|EPOLLHUP, {u32=6084112, u64=6084112}}) = -1 EEXIST (File exists)
>>> epoll_wait(6, {}, 16, 0) = 0
>>> read(5, <unfinished ...>
>>>
>>> I appreciate any and all feedback!
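A quick way to see the transfer size the kernel interface is advertising to tools like cp (just a sketch; file.out here is whatever test file is already on the mount):

    stat /mnt/pvfs2/file.out

The "IO Block:" field in the output is the st_blksize value PVFS reports; well-behaved coreutils size cp's buffer from it, so if the read/write calls in an strace of cp are much smaller than that, the coreutils build is probably ignoring it.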
>
> --
> Jalal Haddad
> [email protected]
> Portland & Corvallis, Oregon

_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
