Matthieu:

Are you seeing 100% CPU utilization on the client? We have seen this with the client core (which you are not using) on a multicore system; however, both the client core and the PVFS interface use the same request structures, etc.
Becky

On Tue, Apr 2, 2013 at 11:11 AM, Becky Ligon <[email protected]> wrote:

> Matthieu:
>
> I have asked Phil Carns to help you since he is more familiar with the
> benchmark and MPI-IO. I think Rob Latham or Rob Ross may be helping too.
> I will continue to look at your data in the meantime.
>
> Becky
>
> Phil/Rob:
>
> Thanks so much for helping Matthieu. I am digging into the matter, but
> MPI is still new to me and I'm not familiar with the PVFS interface that
> accompanies ROMIO.
>
> Becky
>
> PS. Can we keep this on the pvfs2-users list so I can see how things
> progress?
>
>
> On Tue, Apr 2, 2013 at 10:47 AM, Matthieu Dorier
> <[email protected]> wrote:
>
>> Hi Rob and Phil,
>>
>> This thread moved to the ofs-support mailing list (probably because the
>> first person to answer was part of that team), but I didn't get much of
>> an answer to my problem, so I'll try to summarize here what I have done.
>>
>> First, to answer Phil: the PVFS config file is attached, and here is the
>> script file used for IOR:
>>
>> IOR START
>> testFile = pvfs2:/mnt/pvfs2/testfileA
>> filePerProc=0
>> api=MPIIO
>> repetitions=100
>> verbose=2
>> blockSize=4m
>> transferSize=4m
>> collective=1
>> writeFile=1
>> interTestDelay=60
>> readFile=0
>> RUN
>> IOR STOP
>>
>> Besides the tests described in my first mail, I also ran the same
>> experiments on another cluster, again with TCP over IB and then with
>> Ethernet, with 336 clients and 672 clients, and with 2, 4, and 8 storage
>> servers. In every case, this behavior appears.
>>
>> I benchmarked the local disk attached to the storage servers and got
>> 42 MB/s, so the high throughput of over 2 GB/s I get obviously benefits
>> from some caching mechanism, and the periodic behavior observed at high
>> output frequency could be explained by that. Yet this does not explain
>> why, overall, the performance decreases over time.
>>
>> I attach a set of graphs summarizing the experiments (the x axis is the
>> iteration number and the y axis is the aggregate throughput obtained for
>> that iteration; 100 consecutive iterations are performed). It seems that
>> the performance follows the law D = a*T + b, where D is the duration of
>> the write, T is the wallclock time since the beginning of the
>> experiment, and "a" and "b" are constants.
>>
>> When I stop IOR and immediately restart it, I get the good performance
>> back; it does not continue at the reduced performance where the previous
>> instance left off.
>>
>> I also thought it could come from the fact that the same file is
>> rewritten at every iteration, and tried the multiFile=1 option to write
>> a new file at every iteration instead, but this didn't help.
>>
>> Last thing I can mention: I'm using mpich 3.0.2, compiled with PVFS
>> support.
>>
>> Matthieu
>>
>> ----- Original Message -----
>> > From: "Rob Latham" <[email protected]>
>> > To: "Matthieu Dorier" <[email protected]>
>> > Cc: "pvfs2-users" <[email protected]>
>> > Sent: Tuesday, April 2, 2013 15:57:54
>> > Subject: Re: [Pvfs2-users] Strange performance behavior with IOR
>> >
>> > On Sat, Mar 23, 2013 at 03:31:22PM +0100, Matthieu Dorier wrote:
>> > > I've installed PVFS (OrangeFS 2.8.7) on a small cluster (2 PVFS
>> > > nodes, 28 compute nodes of 24 cores each, everything connected
>> > > through InfiniBand but using an IP stack on top of it, so the
>> > > protocol for PVFS is TCP), and I see some strange performance
>> > > behavior with IOR (using ROMIO compiled against PVFS, no kernel
>> > > support):
>> >
>> > > IOR is started on 336 processes (14 nodes), each writing 4 MB into
>> > > a single shared file using MPI-IO (4 MB transfer size as well). It
>> > > completes 100 iterations.
>> >
>> > OK, so you have one PVFS client per core. All of these are talking to
>> > two servers.
>> >
>> > > First, every time I start an instance of IOR, the first I/O
>> > > operation is extremely slow. I'm guessing this is because ROMIO has
>> > > to initialize everything, get the list of PVFS servers, etc. Is
>> > > there a way to speed this up?
>> >
>> > ROMIO isn't doing a whole lot here, but there is one thing different
>> > about ROMIO's first call vs. the Nth call: on the first call (the
>> > first time any PVFS2 file is opened or deleted), ROMIO will call the
>> > function PVFS_util_init_defaults().
>> >
>> > If you have 336 clients banging away on just two servers, I bet that
>> > could explain some slowness. In the old days, the PVFS server had to
>> > service these requests one at a time.
>> >
>> > I don't think this restriction has been relaxed. Since it is a
>> > read-only operation, though, it sure seems like one could just have
>> > the servers shovel out PVFS2 configuration information as fast as
>> > possible.
>> >
>> >
>> > > Then, I set some delay between each iteration, to better reflect
>> > > the behavior of an actual scientific application.
>> >
>> > Fun! This is kind of like what MADNESS does: it "computes" by
>> > sleeping for a bit. I think Phil's questions will help us understand
>> > the highly variable performance.
>> >
>> > Can you experiment with IOR's collective I/O? By default, collective
>> > I/O will select one client per node as an "I/O aggregator". The IOR
>> > workload will not benefit from ROMIO's two-phase optimization, but
>> > you've got 336 clients banging away on two servers. When I last
>> > studied PVFS scalability, 100x more clients than servers wasn't a
>> > big deal, but 5-6 years ago nodes did not have 24-way parallelism.
>> >
>> > ==rob
>> >
>> > --
>> > Rob Latham
>> > Mathematics and Computer Science Division
>> > Argonne National Lab, IL USA
>> >
>>

--
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
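A quick worked consequence of the numbers Matthieu reports above (336 processes, 4 MB block and transfer size from the IOR script): if the duration of each collective write really grows linearly with wallclock time, D(T) = a*T + b, then the aggregate throughput IOR reports for one iteration should fall off roughly hyperbolically:

    \mathrm{throughput}(T) \;=\; \frac{336 \times 4\ \mathrm{MB}}{D(T)} \;=\; \frac{1344\ \mathrm{MB}}{aT + b}

That shape is consistent with the steady per-iteration decline Matthieu describes in the attached graphs, and with the performance resetting whenever IOR is restarted.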
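As a concrete illustration of Rob's suggestion about limiting the number of I/O aggregators, here is a minimal sketch (not part of the thread) of a plain MPI-IO collective write that passes ROMIO's collective-buffering hints explicitly. The hint names romio_cb_write and cb_nodes are standard ROMIO hints; the file name is taken from the IOR script above, while the value of 14 aggregators (one per compute node) is an assumption based on the 14-node, 336-process run Matthieu describes. Error checking is omitted for brevity.

#include <mpi.h>
#include <stdlib.h>

#define XFER (4 * 1024 * 1024)   /* 4 MB per process, matching the IOR script */

int main(int argc, char **argv)
{
    int rank;
    MPI_File fh;
    MPI_Info info;
    char *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = calloc(XFER, 1);   /* 4 MB of zeros to write from each rank */

    /* Hints controlling ROMIO's two-phase collective buffering:
     * romio_cb_write=enable forces collective buffering on writes;
     * cb_nodes=14 caps the number of I/O aggregators (assumed here:
     * one aggregator per compute node for the 14-node run). */
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_cb_write", "enable");
    MPI_Info_set(info, "cb_nodes", "14");

    /* The "pvfs2:" prefix asks ROMIO to use its PVFS2 driver directly,
     * as in the IOR script's testFile setting. */
    MPI_File_open(MPI_COMM_WORLD, "pvfs2:/mnt/pvfs2/testfileA",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    /* One collective write: each rank contributes a contiguous 4 MB block. */
    MPI_File_write_at_all(fh, (MPI_Offset)rank * XFER, buf, XFER,
                          MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    free(buf);
    MPI_Finalize();
    return 0;
}

Comparing a run like this (or IOR configured with the same hints) against the default hint values would show whether funnelling the writes through one aggregator per node changes the gradual slowdown.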
