Rob Ross wrote:
Heh well that's a little different -- that's a read workload. The NFS
client is reading ahead.
Have a look at this:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00074000000000000000
and this:
http://www.pvfs.org/cvs/pvfs-2-7-branch.build/doc/pvfs2-faq/pvfs2-faq.php#SECTION00077000000000000000
Also this email and specifically the immutable option; you could set
this on your files after you are done ripping and encoding:
http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-September/002688.html
You'd probably want to use the pvfs2-xattr utility to set the attribute
so you don't have to sudo it.
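As a sketch, setting the hint could look like the num_dfiles command later in this thread; note the exact attribute key for the immutable hint is an assumption here, so check the linked pvfs2-developers email for the name your PVFS2 version uses:

```shell
# Sketch only: mark a finished file immutable so clients may cache it.
# The key name "user.pvfs2.immutable" is assumed, not confirmed --
# verify it against the linked email for your PVFS2 build.
setfattr -n "user.pvfs2.immutable" -v "1" /mnt/pvfs2/videos/finished.mpg
```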
Other networked file systems can hide some of this latency by caching
data (either coherently or not) on the client. PVFS does not do this,
so each little operation goes across the wire.
Can this be investigated with some network tool, and if so, how?
There's really no advantage to using a parallel file system for the
workload you have described,
But should the disadvantage be of this order of magnitude?
Apparently :). You could strace the app to see how big/small the IOs
are. Some apps have options for block sizes for IO that can be used to
improve performance. Also, there's no reason to bother with striping
files in this case, since you're accessing serially. You should set
the number of datafiles (objects holding data) to 1 on the directory
you're storing into:
setfattr -n "user.pvfs2.num_dfiles" -v "1" /mnt/pvfs2/directory
unless you're planning on having a lot of systems doing this process
in parallel and want a single place to store the output.
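To confirm the hint took effect on the directory, you could read the attribute back (sketch, assuming the same mount point as above):

```shell
# Read back the hint that was just set with setfattr.
getfattr -n "user.pvfs2.num_dfiles" /mnt/pvfs2/directory
```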
What sort of network do you have in this system? What sort of nodes
are you using for the PVFS servers?
All AMD 4000+ systems with 1 Gb network cards and a 320 GB disk in each
of them.
Copying from the clients to these 3 servers runs at > 100 MB/sec, pretty
close to what Gb Ethernet can do.
In what context do you get that performance? How do tools like pvfs2-cp
compare in performance?
cp and pvfs2-cp are not significantly different, although the load with
pvfs2-cp is a lot higher, at least on the client side:
[EMAIL PROTECTED] pvfs]$ time cp "/pvfs2/videos/In the Flesh - Roger
Waters.mpg" .
0.00user 23.17system 1:42.64elapsed 22%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (1major+1227minor)pagefaults 0swaps
[EMAIL PROTECTED] pvfs]$ time pvfs2-cp "/pvfs2/videos/In the Flesh - Roger
Waters.mpg" test.mpg
2.62user 71.50system 1:41.21elapsed 73%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (12major+3160minor)pagefaults 0swaps
I've found a workaround, using cat and a pipe:
time strace -o strace.mplayer cat "/pvfs2/videos/In the Flesh - Roger
Waters.mpg" | mplayer -dumpstream -dumpfile /pvfs2/videos/flesh.mpg -
This works very well; strace shows read and write blocks of the size I'd expect:
write(1, "\0\0\1\272G\21Uk\315\257\1\211\303\370\0\0\1\275\7\354"...,
4194304) = 4194304
Doing it without the cat and | I get
read(3, "\0\0\1\272D\2%StQ\1\211\303\370\0\0\1\275\7\354\201\200"...,
2048) = 2048
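Since PVFS does no client-side caching, every read is a round trip to the servers, so the request size matters a lot. A rough back-of-the-envelope comparison for 4 MiB of data at the two request sizes seen in the strace output above:

```shell
# Requests needed to move 4 MiB at the two request sizes from strace.
total=$((4 * 1024 * 1024))
small=$((total / 2048))      # 2 KiB reads, as mplayer issues directly
large=$((total / 4194304))   # 4 MiB reads, as the cat | pipe produces
echo "$small requests vs $large request"
```

That is roughly 2000 network round trips versus one for the same data, which is consistent with the slowdown seen without the pipe.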
What I have in mind is that > 1000 nodes can read the same file(s)
simultaneously, though not at exactly the same time. So striping is
crucial to what I hope to accomplish.
pNFS should/would make this possible, I think/hope.
Rob
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users