On 07/27/2014 05:58 PM, Angel de Vicente wrote:

If you're with me so far, I think you'll see much better parallel write
performance once the MPI-IO library is trying harder to align writes.
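
For concreteness, the kind of alignment tuning meant here looks roughly like the sketch below (an illustration only: the 1 MiB / 4 MiB values are placeholders to experiment with, not tuned recommendations, and the right numbers depend on your file system's stripe size):

#include <hdf5.h>
#include <mpi.h>

/* Create a file with alignment-related MPI-IO hints and aligned HDF5
 * allocations.  The sizes are guesses to start from, not a prescription. */
hid_t create_aligned_file(const char *fname, MPI_Comm comm)
{
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "cb_buffer_size", "4194304"); /* 4 MiB collective buffer */
    MPI_Info_set(info, "striping_unit",  "4194304"); /* stripe hint (honored at create time on Lustre) */

    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, comm, info);
    H5Pset_alignment(fapl, 1048576, 4194304); /* align allocations >= 1 MiB on 4 MiB boundaries */

    hid_t file = H5Fcreate(fname, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
    H5Pclose(fapl);
    MPI_Info_free(&info);
    return file;
}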

I'm going to try your suggestion and will report back, but do you think
this could explain the different performance for PMLX, PMLY, and PMLZ?
In the sample code in pastebin, the data written to file in each of
these cases differs only in who writes it (in the case of PMLX, the
processors that in a Cartesian decomposition would sit in the smallest X
plane; for PMLY, those in the smallest Y plane; and for PMLZ, those in
the smallest Z plane), but the amount of data, the distribution of that
data in memory on each processor, and the place where it is stored in
the actual file are all the same...

There are two things that might be happening when the layout and the file system interact.

For some decompositions, MPI-IO might not even use collective I/O. Except on Blue Gene, ROMIO checks for "interleave": if each process already accesses a contiguous region of the file, there's little benefit to two-phase -- or at least that's what we thought 15 years ago; it's a little more complicated today...
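
One experiment that takes the heuristic out of the picture is to force the two-phase path and request collective transfers explicitly. A minimal sketch, assuming an HDF5-over-MPI-IO code like yours; the hint name is a standard ROMIO one, but whether forcing it actually helps your decompositions is exactly what needs measuring:

#include <hdf5.h>
#include <mpi.h>

/* Force ROMIO's collective-buffering (two-phase) write path and make
 * H5Dwrite collective, so the interleave heuristic can't silently fall
 * back to independent I/O. */
void force_collective(MPI_Info *info_out, hid_t *dxpl_out)
{
    MPI_Info_create(info_out);
    MPI_Info_set(*info_out, "romio_cb_write", "enable");

    *dxpl_out = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(*dxpl_out, H5FD_MPIO_COLLECTIVE);
    /* pass *info_out to H5Pset_fapl_mpio at file-open time, and *dxpl_out
     * as the transfer property list to H5Dwrite */
}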

Some decompositions might introduce a "hole": in order to carry out a partial update, ROMIO will "data sieve" the request instead of updating piece by piece -- it reads a large region into a buffer, updates the requested pieces, then writes out one large contiguous request. Often this is a good optimization, but sometimes the holes are so small that the overhead of the read outweighs the benefit.
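
A quick experiment here, again just a sketch using standard ROMIO hints and a placeholder buffer size, is to switch write-side data sieving off (or enlarge its buffer) and compare the three PML cases:

#include <mpi.h>

/* Build hints that disable ROMIO's write-side data sieving, so small
 * strided pieces are written directly instead of via read-modify-write. */
MPI_Info no_data_sieving_hints(void)
{
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "romio_ds_write", "disable");
    /* alternative experiment: keep sieving but enlarge its buffer
       (16 MiB here is only a placeholder):
       MPI_Info_set(info, "ind_wr_buffer_size", "16777216"); */
    return info;
}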

Are you familiar with the Darshan statistics tool? You can use it to confirm
whether or not you are hitting unaligned writes.

Not really; I've only heard about it, but I will try it. This issue is
proving pretty hard to figure out, and it is a real bottleneck for our
code, so I will try anything...

Yeah, I'm getting off into the woods here, so a tool like Darshan can help you answer the low-level details I'm bugging you about.

==rob

--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
