I have been having the same issues since I originally set up pvfs2 (I
also have nodes spontaneously reboot if they've had a lot of pvfs2
activity via the kernel module, or more commonly, a process doing I/O
via the kernel module will just lock up, and any commands executed in
that directory will hang.  My system will show a loadavg equal to the
number of hung processes accessing pvfs via the kernel interface, and
there is absolutely no way to kill the processes; the only thing I can
do to restore functionality is to reboot the node in question.
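For what it's worth, those unkillable hangs show up as processes in uninterruptible sleep (state "D"), which is exactly what inflates loadavg the way you describe. A quick way to spot them (just a sketch; it reads Linux's /proc directly):

```python
import os

# List processes stuck in uninterruptible sleep ("D" state) -- the state
# that hung kernel-module I/O lands in, and what drives loadavg up.
for pid in filter(str.isdigit, os.listdir("/proc")):
    try:
        stat = open(f"/proc/{pid}/stat").read()
    except OSError:
        continue  # process exited while we were scanning
    # comm is parenthesized and may contain spaces; state follows the last ')'
    comm = stat[stat.index("(") + 1 : stat.rindex(")")]
    state = stat[stat.rindex(")") + 2]
    if state == "D":
        print(pid, comm)
```

Anything it prints is unkillable until whatever kernel operation it is blocked on returns, which matches the reboot-only recovery you are seeing.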

Even when things are working well, the speeds are abysmal if I'm using
the kernel interface.  As this is a multi-user cluster, and I am but
the architect/admin, using the pvfs2-fs tools is not really an option
(users won't do it, nor will they code with romio...most of the time,
they're running someone else's MPI code and don't even understand
enough to modify it).  So in the end, the only thing I get is
scalability...individual nodes' performance is still very poor.

I have not yet upgraded to the latest release (I'm running 2.7.1), but
I have done 2-3 upgrades, with very small improvements between
releases.  I have a maintenance window tomorrow (Wednesday), so I
intend to try out the latest version of pvfs.

--Jim

On Fri, Jul 17, 2009 at 2:00 PM, Jalal<[email protected]> wrote:
> Hi Kevin,
>
>
> here is the output of dd:
>
> ****PVFS******
> # dd if=/dev/zero of=file.out bs=1048576 count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes (105 MB) copied, 2.15666 seconds, 48.6 MB/s
> *****local disk*****
> dd if=/dev/zero of=file.out bs=1048576 count=100
> 100+0 records in
> 100+0 records out
> 104857600 bytes (105 MB) copied, 0.266154 seconds, 394 MB/s
>
>
> Does that look reasonable for my setup, considering that I only have a
> 1GbE network on all nodes and am using 16 PVFS2 servers?
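Those numbers are roughly what a single 1GbE client link allows, since every byte still has to cross the client's one NIC no matter how many servers it is striped over. A quick back-of-the-envelope check (the 48.6 MB/s figure is from the dd run above):

```python
# Rough sanity check, assuming the client's single 1GbE NIC is the ceiling.
link_mb_per_s = 1_000_000_000 / 8 / 1_000_000   # 125.0 MB/s raw; TCP/IP and
                                                 # Ethernet framing overhead
                                                 # leave roughly 110-118 MB/s
measured = 48.6                                  # MB/s from the PVFS dd run above
print(f"raw 1GbE ceiling: {link_mb_per_s:.0f} MB/s")
print(f"dd reached {measured / link_mb_per_s:.0%} of the raw link rate")
```

So about 39% of the raw link rate through the kernel interface: not great, but not wildly off for untuned 1GbE either.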
>
>
> On Thu, Jul 16, 2009 at 6:53 PM, Kevin Harms <[email protected]> wrote:
>>
>>  have you tried using dd? what about: dd if=/dev/zero
>> of=/mnt/pvfs2/file.out bs=1048576 count=100
>>
>> kevin
>>
>> On Jul 16, 2009, at 7:07 PM, Jalal wrote:
>>
>>> hello there,
>>>
>>> I have been trying to set up pvfs2 on a small cluster (16 servers, and
>>> 16 clients) running SUSE10SP2-64bit and I am running into some major
>>> performance problems that are causing me to doubt my install. I am
>>> hoping to get some help from this great users group.
>>>
>>> The server side of things seems to be working great. I have 14 I/O
>>> servers, and 2 metaDB servers. I don't see any errors at all. I can
>>> run the pvfs2 native tools (ex: pvfs2-cp) and I am seeing some
>>> fantastic results (500+ Mbs). The pvfs2-fs.conf is bone stock and is
>>> as generated by pvfs2-genconfig.
>>>
>>> When I use the native linux FS commands (ex: cp, rsync...) I am seeing
>>> some dismal results that are 10-15 times slower than the pvfs2 FS
>>> tools. The kernel driver build goes very smoothly, and I am not seeing
>>> any errors. Here are the steps that I am taking:
>>>
>>>
>>> cd /tmp
>>> tar zxvf pvfs-2.8.1.tar.gz
>>> cd pvfs-2.8.1/
>>> ./configure --prefix=/opt/pvfs2 --with-kernel=/tmp/linux
>>> --disable-server --disable-karma
>>> make kmod
>>> make kmod_install
>>> depmod -a
>>> modprobe pvfs2
>>> /opt/pvfs2/sbin/pvfs2-client -p /opt/pvfs2/sbin/pvfs2-client-core
>>> mount -t pvfs2 tcp://lab1:3334/pvfs2-fs /mnt/pvfs2
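One quick check after that mount step: confirm the kernel mount actually registered by looking at /proc/mounts (a sketch; the mount point is the one from your commands):

```python
# Confirm a filesystem of type pvfs2 shows up in the kernel's mount table.
# Fields per line in /proc/mounts: device, mountpoint, fstype, options, ...
with open("/proc/mounts") as f:
    mounts = [line.split() for line in f]

pvfs = [(m[0], m[1]) for m in mounts if m[2] == "pvfs2"]
print(pvfs or "no pvfs2 mounts found")
```

If that comes back empty, the slow path you are measuring may not even be pvfs2.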
>>>
>>> I did an strace on the pvfs2-client process and I am seeing lots and
>>> lots of retries:
>>>
>>> readv(26, [{"p\27\0\0\2\0\0\0\4\0\0\0\0\0\0\0C\362\0\0d\0\0\0\244\1"...,
>>> 128}], 1) = 128
>>> read(5, 0x7d4450, 8544)                 = -1 EAGAIN (Resource
>>> temporarily unavailable)
>>> getrusage(RUSAGE_SELF, {ru_utime={2, 960185}, ru_stime={7, 904494}, ...})
>>> = 0
>>> writev(5, [{"AQ\0\0", 4}, {")\5\3 ", 4}, {"5\316\0\0\0\0\0\0", 8},
>>> {"\4\0\0\377\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0C\362"..., 8224}],
>>> 4) = 8240
>>> poll([{fd=5, events=POLLIN, revents=POLLIN}], 1, 10) = 1
>>> read(5, "AQ\0\0)\5\3 6\316\0\0\0\0\0\0\5\0\0\377\0\0\0\0C\362\0"...,
>>> 8544) = 8544
>>> read(5, 0x7d1020, 8544)                 = -1 EAGAIN (Resource
>>> temporarily unavailable)
>>> getrusage(RUSAGE_SELF, {ru_utime={2, 960185}, ru_stime={7, 904494}, ...})
>>> = 0
>>> epoll_ctl(6, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLERR|EPOLLHUP,
>>> {u32=6084112, u64=6084112}}) = -1 EEXIST (File exists)
>>> epoll_wait(6, {}, 16, 0)                = 0
>>> read(5, 0x7d2020, 8544)                 = -1 EAGAIN (Resource
>>> temporarily unavailable)
>>> writev(26,
>>> [{"\277\312\0\0\2\0\0\0\246\267\0\0\0\0\0\0L\0\0\0\0\0\0\0"...,
>>> 24}, {"p\27\0\0\2\0\0\0\10\0\0\0\0\0\0\0C\362\0\0d\0\0\0\1\0\0"...,
>>> 76}], 2) = 100
>>> epoll_wait(6, {{EPOLLIN, {u32=6084112, u64=6084112}}}, 16, 10) = 1
>>> fcntl(26, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
>>> recvfrom(26,
>>> "\277\312\0\0\4\0\0\0\246\267\0\0\0\0\0\0\30\0\0\0\0\0\0"...,
>>> 24, MSG_PEEK|MSG_NOSIGNAL, NULL, NULL) = 24
>>> fcntl(26, F_GETFL)                      = 0x802 (flags O_RDWR|O_NONBLOCK)
>>> recvfrom(26,
>>> "\277\312\0\0\4\0\0\0\246\267\0\0\0\0\0\0\30\0\0\0\0\0\0"...,
>>> 24, MSG_NOSIGNAL, NULL, NULL) = 24
>>> readv(26, [{"p\27\0\0\2\0\0\0\10\0\0\0\0\0\0\0\331\323\17\0\0\0\0\0"...,
>>> 24}], 1) = 24
>>> read(5, 0x7d2020, 8544)                 = -1 EAGAIN (Resource
>>> temporarily unavailable)
>>> epoll_ctl(6, EPOLL_CTL_ADD, 26, {EPOLLIN|EPOLLERR|EPOLLHUP,
>>> {u32=6084112, u64=6084112}}) = -1 EEXIST (File exists)
>>> epoll_wait(6, {}, 16, 0)                = 0
>>> read(5,  <unfinished ...>
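Those EAGAIN returns are mostly noise rather than retries of failed I/O: read(2) on a non-blocking descriptor returns EAGAIN whenever no data is ready yet, and the client simply polls again. A minimal illustration of the same pattern on an ordinary pipe (not pvfs2-specific):

```python
import errno
import fcntl
import os

# Create a pipe and make the read end non-blocking, as pvfs2-client's
# descriptors are in the trace above.
r, w = os.pipe()
fcntl.fcntl(r, fcntl.F_SETFL, fcntl.fcntl(r, fcntl.F_GETFL) | os.O_NONBLOCK)

try:
    os.read(r, 8544)          # nothing written yet -> EAGAIN, not an error
except BlockingIOError as e:
    print("read:", errno.errorcode[e.errno])   # read: EAGAIN

os.write(w, b"ready")
print("read:", os.read(r, 8544))               # data arrives, read succeeds
```

So the EAGAIN lines in the strace do not by themselves explain the slowdown; the cost is in how many round trips each VFS operation takes, not in those non-blocking reads.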
>>>
>>>
>>> I appreciate any and all feedback!
>>> _______________________________________________
>>> Pvfs2-users mailing list
>>> [email protected]
>>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>>
>
>
