Rob Ross wrote:
Heh well that's a little different -- that's a read workload. The NFS client is reading ahead.

Have a look at this:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00074000000000000000
and this:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00077000000000000000

Also this email and specifically the immutable option; you could set this on your files after you are done ripping and encoding: http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-September/002688.html

You'd probably want to use the pvfs2-xattr utility to set the attribute so you don't have to sudo it.

Other networked file systems can hide some of this latency by caching data (either coherently or not) on the client. PVFS does not do this, so each little operation goes across the wire.
Can this be investigated with some network tool, and if so, how?
There's really no advantage to using a parallel file system for the workload you have described,
But should the disadvantage be of this order of magnitude?

Apparently :). You could strace the app to see how big/small the IOs are. Some apps have options for IO block sizes that can be used to improve performance. Also, there's no reason to bother with striping files in this case, since you're accessing serially. You should set the number of datafiles (objects holding data) to 1 on the directory you're storing into:
  setfattr -n "user.pvfs2.num_dfiles" -v "1" /mnt/pvfs2/directory

unless you're planning on having a lot of systems doing this process in parallel and want a single place to store the output. What sort of network do you have in this system? What sort of nodes are you using for the PVFS servers?
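A quick way to confirm the hint took effect is to read the attribute back with getfattr; a minimal sketch, assuming the attr tools are installed on the client (the directory path is just an example):

```shell
# Set the datafile count to 1, then read the hint back to verify.
# user.pvfs2.num_dfiles is the attribute from the setfattr line above.
setfattr -n "user.pvfs2.num_dfiles" -v "1" /mnt/pvfs2/directory
getfattr --only-values -n "user.pvfs2.num_dfiles" /mnt/pvfs2/directory
```

New files created under that directory should then each be stored as a single datafile on one server.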
All AMD 4000+ systems with 1Gb network cards and a 320GB disk in each of them. Copying from clients to these 3 servers runs at > 100MB/sec, pretty close to what Gb Ethernet can do.

In what context do you get that performance? How do tools like pvfs2-cp compare in performance?
cp and pvfs2-cp are not significantly different, although the load with pvfs2-cp is a lot higher, at least on the client side:
[EMAIL PROTECTED] pvfs]$ time cp "/pvfs2/videos/In the Flesh - Roger Waters.mpg" .
0.00user 23.17system 1:42.64elapsed 22%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+1227minor)pagefaults 0swaps
[EMAIL PROTECTED] pvfs]$ time pvfs2-cp "/pvfs2/videos/In the Flesh - Roger Waters.mpg" test.mpg
2.62user 71.50system 1:41.21elapsed 73%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (12major+3160minor)pagefaults 0swaps

I've found a workaround using cat and a pipe:
time strace -o strace.mplayer cat "/pvfs2/videos/In the Flesh - Roger Waters.mpg" | mplayer -dumpstream -dumpfile /pvfs2/videos/flesh.mpg -
This works very well; strace shows read and write block sizes as I'd expect:
write(1, "\0\0\1\272G\21Uk\315\257\1\211\303\370\0\0\1\275\7\354"..., 4194304) = 4194304
Without the cat and pipe I get:
read(3, "\0\0\1\272D\2%StQ\1\211\303\370\0\0\1\275\7\354\201\200"..., 2048) = 2048

What I have in mind is that > 1000 nodes can read the same file(s) simultaneously, though not at exactly the same time. So striping is crucial to what I hope to accomplish. pNFS should/would make this possible, I think/hope.
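The same batching effect can be had with an explicit block size instead of cat; a minimal sketch, assuming dd is available (the SRC/DST paths are illustrative, not from the thread):

```shell
# Same workaround with an explicit block size: dd issues 4 MB reads,
# so far fewer requests cross the wire than the player's 2 KB reads.
# SRC and DST defaults are example paths.
SRC="${SRC:-/pvfs2/videos/In the Flesh - Roger Waters.mpg}"
DST="${DST:-/dev/stdout}"
dd if="$SRC" of="$DST" bs=4M 2>/dev/null
```

Piped the same way as cat, e.g. `dd if="$SRC" bs=4M 2>/dev/null | mplayer -dumpstream -dumpfile out.mpg -`, each request to PVFS then moves 4 MB rather than 2 KB.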

Rob


_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
