On Sep 27, 2009, at 12:19 AM, Milo wrote:
Hi, guys. Milo from CMU here.
I'm looking into small I/O performance on PVFS2. It's actually part
of a larger project investigating possible improvements to the
performance of cloud computing software, and we're using PVFS2 as a
kind of upper bound for performance (e.g. writing a flat file on a
parallel filesystem as opposed to updating data in an HBase table).
One barrier I've encountered is the small-I/O nature of many of
these cloud workloads. For example, the one we're looking at
currently does 1 KB I/O requests even when performing sequential
writes to generate a file.
On large I/O requests, I've managed to tweak PVFS2 to get close to
the performance of the underlying filesystem (115 MB/s or so). But
on small I/O requests performance is much lower: it seems I can only
sustain approximately 5,000 I/O operations/second even when testing
sequential writes against a single server node (4.7 MB/s with 1 KB
sequential writes, 19.0 MB/s with 4 KB sequential writes). The
filesystem is mounted through the PVFS2 kernel module. This seems
similar to the Bonnie++ rates in ftp://info.mcs.anl.gov/pub/tech_reports/reports/P1010.pdf
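For concreteness, the 1 KB test boils down to a loop like the sketch
below. This is illustrative only, not the exact benchmark I ran; the
mount point /mnt/pvfs2, file name, request size, and count are
placeholders.

/* seqwrite.c: crude sequential-write throughput test.
 * Build: gcc -O2 -o seqwrite seqwrite.c   (add -lrt on older glibc)
 * Run:   ./seqwrite /mnt/pvfs2/testfile 1024 100000
 * Path, request size, and count are placeholders, not the exact
 * values from my runs. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    size_t reqsize;
    long count, i;
    char *buf;
    int fd;
    struct timespec t0, t1;
    double secs;

    if (argc != 4) {
        fprintf(stderr, "usage: %s <file> <reqsize> <count>\n", argv[0]);
        return 1;
    }
    reqsize = (size_t)atol(argv[2]);
    count = atol(argv[3]);
    buf = malloc(reqsize);
    if (!buf) { perror("malloc"); return 1; }
    memset(buf, 'x', reqsize);

    fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < count; i++) {
        if (write(fd, buf, reqsize) != (ssize_t)reqsize) {
            perror("write");
            return 1;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%ld writes of %lu bytes in %.2f s: %.0f ops/s, %.2f MB/s\n",
           count, (unsigned long)reqsize, secs, count / secs,
           count * reqsize / secs / 1e6);
    free(buf);
    return 0;
}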
None of this is unexpected to me, and I'm happy with PVFS2's large
I/O performance. But I'd like to get a better handle on where this
bottleneck is coming from in the code (and how I could fix it if I
can find coding time between research). Here's some experimentation
I've done so far:
1) A small pair of C client/server programs that open and close TCP
connections in a tight loop, pinging each other with a small piece
of data ('Hello World'); a simplified sketch of the client side
follows the list below. I see about 10,000 connections/second with
this approach, so if each small I/O is opening and closing two TCP
connections, this could be the bottleneck. I haven't yet dug into
the pvfs2-client code and the library to see if it reuses TCP
connections or makes new ones on each request (that's deeper into
the flow code than I remember. =;) )
Don't waste your time; it keeps the connections open.
2) I can write to the underlying filesystem with 1 KB sequential
writes almost as quickly as with 1 MB writes. So it's not the
underlying ext3.
3) The I/O ops/s bottleneck is there even with the null-aio
TroveMethod, so I doubt it's Trove.
4) atime is getting updated with null-aio, so a metadata bottleneck
is possible.
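The client side of the connection test from (1) is roughly the
following. It's a simplified, illustrative sketch rather than the
exact program; the host, the port, and the matching echo server are
placeholders/assumptions.

/* connping.c: open a TCP connection, send a tiny message, wait for
 * the echo, close, and repeat; reports connections/second.
 * Build: gcc -O2 -o connping connping.c   (add -lrt on older glibc)
 * Run:   ./connping 127.0.0.1 9000 10000
 * The host and port are placeholders, and the matching echo server
 * isn't shown here. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct sockaddr_in addr;
    long count, i;
    char msg[] = "Hello World";
    char reply[64];
    struct timespec t0, t1;
    double secs;

    if (argc != 4) {
        fprintf(stderr, "usage: %s <host> <port> <count>\n", argv[0]);
        return 1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)atoi(argv[2]));
    if (inet_pton(AF_INET, argv[1], &addr.sin_addr) != 1) {
        fprintf(stderr, "bad address: %s\n", argv[1]);
        return 1;
    }
    count = atol(argv[3]);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < count; i++) {
        int s = socket(AF_INET, SOCK_STREAM, 0);
        if (s < 0) { perror("socket"); return 1; }
        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect");
            return 1;
        }
        if (write(s, msg, sizeof(msg)) < 0 ||
            read(s, reply, sizeof(reply)) < 0) {
            perror("ping");
            return 1;
        }
        close(s);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%ld connections in %.2f s: %.0f conn/s\n",
           count, secs, count / secs);
    return 0;
}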
Some configuration information about the filesystem:
* version 2.8.1
* The strip_size is 4194304. Not that this should matter a great
deal with one server.
* FlowBufferSizeBytes 4194304
* TroveSyncMeta and TroveSyncData are set to no
* I've applied the patch from http://www.pvfs.org/fisheye/rdiff/PVFS?csid=MAIN:slang:20090421161045&u&N
to be sure metadata syncing really is off, though I'm not sure how
to check. =:)
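For context, those settings live in fs.conf fragments roughly like
the sketch below. The option names are the ones mentioned above, but
the section names and placement are approximate (written from
memory), so treat the pvfs2-genconfig output as authoritative.

# Rough sketch only: option names as used above; section names and
# placement are approximate.
<FileSystem>
	Name pvfs2-fs
	FlowBufferSizeBytes 4194304
	<StorageHints>
		TroveSyncMeta no
		TroveSyncData no
		# TroveMethod null-aio was set only for the null-aio experiment
		TroveMethod null-aio
	</StorageHints>
	<Distribution>
		Name simple_stripe
		Param strip_size
		Value 4194304
	</Distribution>
</FileSystem>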
It would be interesting to know how much time is spent on the client
(in the kernel module, in the daemon) vs. how much on the server.
This would probably help us rule out quite a few things too.
Thanks.
~Milo
PS: Should I send this to the pvfs2-developers list instead?
Apologies if I've used the wrong venue.