If I understand your question correctly, this is *exactly* one of the reasons that the 
MPI Forum has been arguing about the use of a new type, "MPI_Count", for 
certain parameters that can get very, very large.

Yes, that would help, but unfortunately only in the future.

Sidenote: I believe that a workaround for you is to create new MPI 
datatypes (e.g., contiguous types) that act as multipliers to reach 
the offsets that you want.  I.e., if you make a contiguous datatype of 4 
doubles, you can still only specify up to 2B of them, but that will now get you 
up to an offset of (2B * 4 * sizeof(double)) rather than (2B * sizeof(double)). 
 Make sense?

In principle yes, but the problem is that we have an unequal number of particles on each node, so the length of each array is not guaranteed to be divisible by 2, 4, or any other number. If I have understood the definition of MPI_TYPE_CREATE_SUBARRAY correctly, the offset can be 64-bit, but the global array size cannot. So, optimally, what I am looking for is something like a simple vector type that allows an unequal size on each thread, with 64-bit offsets and a 64-bit global array size.

Another possible workaround would be to identify subsections that do not exceed 2B elements, make sub-communicators, and then let each of them dump its elements at the proper offsets. It may work. The problematic architecture is a BG/P. On other clusters, doing simple I/O (letting all threads open the file, seek to their position, and then write their chunk) works fine, but somehow on BG/P performance drops dramatically. My guess is that there is some file locking, or that we are overwhelming the I/O nodes.

This ticket for the MPI-3 standard is a first step in the right direction, but 
won't do everything you need (this is more FYI):

     https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/265

See the PDF attached to the ticket; it's going up for a "first reading" in a 
month.  It'll hopefully be part of the MPI-3 standard by the end of the year (Fab 
Tillier, CC'ed, has been the chief proponent of this ticket for the past several months).

Quincey Koziol from the HDF Group is going to propose a follow-on to this 
ticket, specifically about the case you're referring to: large counts for 
file functions and datatype constructors.  Quincey, can you expand on what 
you'll be proposing, perchance?

Interesting. I think something along those lines would be very useful, and needed, for large applications.

Thanks a lot for the pointers and your suggestions,

cheers,

Troels
