Matthieu:

Are you seeing 100% CPU utilization on the client?  We have seen this
with the client core (which you are not using) on a multicore system;
however, both the client core and the PVFS interface use the same
request structures, etc.

Becky

On Tue, Apr 2, 2013 at 11:11 AM, Becky Ligon <[email protected]> wrote:

> Matthieu:
>
> I have asked Phil Carns to help you since he is more familiar with the
> benchmark and MPIIO.  I think Rob Latham or Rob Ross may be helping too.  I
> continue to look at your data in the mean time.
>
> Becky
>
> Phil/Rob:
>
> Thanks so much for helping Matthieu.  I am digging into the matter but MPI
> is still new to me and I'm not familiar with the PVFS interface that
> accompanies ROMIO.
>
> Becky
>
> PS.  Can we keep this on the pvfs2-users list so I can see how things
> progress?
>
>
> On Tue, Apr 2, 2013 at 10:47 AM, Matthieu Dorier <[email protected]
> > wrote:
>
>> Hi Rob and Phil,
>>
>> This thread moved to the ofs-support mailing list (probably because the
>> first person to answer was part of that team), but I didn't get much of an
>> answer to my problem, so I'll summarize here what I have done.
>>
>> First to answer Phil, here is the PVFS config file attached, and here is
>> the script file used for IOR:
>>
>> IOR START
>>   testFile = pvfs2:/mnt/pvfs2/testfileA
>>   filePerProc=0
>>   api=MPIIO
>>   repetitions=100
>>   verbose=2
>>   blockSize=4m
>>   transferSize=4m
>>   collective=1
>>   writeFile=1
>>   interTestDelay=60
>>   readFile=0
>>   RUN
>> IOR STOP
>>
>> Besides the tests I described in my first mail, I also ran the same
>> experiments on another cluster, first with TCP over IB and then on Ethernet,
>> with 336 and 672 clients, and with 2, 4, and 8 storage servers. The same
>> behavior appears in every case.
>>
>> I benchmarked the local disk attached to the storage servers and got
>> 42MB/s, so the high throughput of over 2GB/s I get obviously benefits from
>> some caching mechanism, and the periodic behavior observed at high output
>> frequency could be explained by that. Yet this does not explain why,
>> overall, the performance decreases over time.
>>
>> I attach a set of graphs summarizing the experiments (the x axis is the
>> iteration number and the y axis is the aggregate throughput obtained for
>> that iteration; 100 consecutive iterations are performed).
>> The performance seems to follow the law D = a*T + b, where D is the
>> duration of the write, T is the wallclock time since the beginning of the
>> experiment, and "a" and "b" are constants.
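The linear trend described above can be sanity-checked with an ordinary least-squares fit of the per-iteration write durations against wallclock time. A minimal sketch in plain Python; the timings below are synthetic and chosen to mimic the reported slowdown, not the actual IOR measurements from this thread:

```python
# Fit d = a*t + b by ordinary least squares, implemented by hand
# so the sketch has no dependencies. The data below is synthetic.

def fit_linear(ts, ds):
    """Return (a, b) minimizing sum((d - (a*t + b))**2)."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_d = sum(ds) / n
    cov = sum((t - mean_t) * (d - mean_d) for t, d in zip(ts, ds))
    var = sum((t - mean_t) ** 2 for t in ts)
    a = cov / var
    b = mean_d - a * mean_t
    return a, b

# Synthetic example: iteration start times (s), assuming ~60 s of
# inter-test delay plus a few seconds of write per iteration, and
# write durations growing linearly with wallclock time.
ts = [i * 65.0 for i in range(100)]
ds = [0.002 * t + 1.5 for t in ts]   # D = a*T + b with a=0.002, b=1.5

a, b = fit_linear(ts, ds)
print(f"a = {a:.4f}, b = {b:.2f}")
```

A slope "a" consistently above zero across runs would confirm the claimed drift, and comparing fitted slopes between the restarted-IOR runs and a single long run would separate per-instance state from server-side state.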
>>
>> When I stop IOR and immediately restart it, I get the original performance
>> back; it does not continue at the reduced level where the previous instance
>> finished.
>>
>> I also thought it could come from the fact that the same file is
>> rewritten at every iteration, so I tried the multiFile=1 option to
>> write one new file per iteration instead, but this didn't help.
>>
>> Last thing I can mention: I'm using mpich 3.0.2, compiled with PVFS
>> support.
>>
>> Matthieu
>>
>> ----- Original Message -----
>> > From: "Rob Latham" <[email protected]>
>> > To: "Matthieu Dorier" <[email protected]>
>> > Cc: "pvfs2-users" <[email protected]>
>> > Sent: Tuesday, April 2, 2013 15:57:54
>> > Subject: Re: [Pvfs2-users] Strange performance behavior with IOR
>> >
>> > On Sat, Mar 23, 2013 at 03:31:22PM +0100, Matthieu Dorier wrote:
>> > > I've installed PVFS (orangeFS 2.8.7) on a small cluster (2 PVFS
>> > > nodes, 28 compute nodes of 24 cores each, everything connected
>> > > through infiniband but using an IP stack on top of it, so the
>> > > protocol for PVFS is TCP), and I witness some strange performance
>> > > behaviors with IOR (using ROMIO compiled against PVFS, no kernel
>> > > support):
>> >
>> > > IOR is started on 336 processes (14 nodes), writing 4MB/process in a
>> > > single shared file using MPI-I/O (4MB transfer size also). It
>> > > completes 100 iterations.
>> >
>> > OK, so you have one pvfs client per core.  All these are talking to
>> > two servers.
>> >
>> > > First, every time I start an instance of IOR, the first I/O operation
>> > > is extremely slow. I'm guessing this is because ROMIO has to
>> > > initialize everything, get the list of PVFS servers, etc. Is there a
>> > > way to speed this up?
>> >
>> > ROMIO isn't doing a whole lot here, but there is one thing different
>> > about ROMIO's 1st call vs the Nth call.  On the 1st call (the first
>> > time any pvfs2 file is opened or deleted), ROMIO calls
>> > PVFS_util_init_defaults().
>> >
>> > If you have 336 clients banging away on just two servers, I bet that
>> > could explain some slowness.  In the old days, the PVFS server had to
>> > service these requests one at a time.
>> >
>> > I don't think this restriction has been relaxed.  Since it is a
>> > read-only operation, though, it sure seems like one could just have
>> > servers shovel out pvfs2 configuration information as fast as
>> > possible.
>> >
>> >
>> > > Then, I set some delay between each iteration, to better reflect
>> > > the
>> > > behavior of an actual scientific application.
>> >
>> > Fun! This is kind of like what MADNESS does: it "computes" by sleeping
>> > for a bit.  I think Phil's questions will help us understand the
>> > highly variable performance.
>> >
>> > Can you experiment with IOR's collective I/O?  By default, collective
>> > I/O will select one client per node as an "i/o aggregator".  The IOR
>> > workload will not benefit from ROMIO's two-phase optimization, but
>> > you've got 336 clients banging away on two servers.  When I last
>> > studied pvfs scalability, 100x more clients than servers wasn't a big
>> > deal, but 5-6 years ago nodes did not have 24-way parallelism.
>> >
>> > ==rob
>> >
>> > --
>> > Rob Latham
>> > Mathematics and Computer Science Division
>> > Argonne National Lab, IL USA
>> >
>>
>> _______________________________________________
>> Pvfs2-users mailing list
>> [email protected]
>> http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
>>
>>
>
>
> --
> Becky Ligon
> OrangeFS Support and Development
> Omnibond Systems
> Anderson, South Carolina
>
>


-- 
Becky Ligon
OrangeFS Support and Development
Omnibond Systems
Anderson, South Carolina
