Re: [OMPI users] Best way to reduce 3D array

Gus Correa Tue, 30 Mar 2010 18:39:33 -0400

Hi Derek

Great to read that you parallelized the code.
Sorry to hear about the OO problems,
although I enjoyed reading your characterization of it.  :)
We also have plenty of that,
mostly with some Fortran90 codes that go OOverboard.


I think I suggested "YZ-books", i.e., decompose the domain across X,
which I guess would take advantage of the C array "row major order",
and obviate the need for creating MPI vector types.
However, I guess your choice really depends on how your data
is laid out in memory.

I am not sure if I understood the I/O (output) problem you described.
However, here is a suggestion.
I think I sent it in a previous email.
It assumes the global array fits in the memory of the rank 0/master process:

A) To input data (at the beginning),
rank 0 can read all the data from a file into a big buffer/global array,
and then all processes call MPI_Scatter[v],
which distributes the subarrays
to all ranks/slave processes;

B) To output data (at the end),
all processes call MPI_Gather[v],
which allows rank 0/master to collect the final results in a big
buffer/global array,
and then rank 0 does the output to a file (and in your case,
also converts to "Tecplot", I suppose).

If your domain decomposition took advantage of the array layout
in memory, each process can do a single call to MPI_Scatter[v]
and/or to MPI_Gather[v] to do the job.  All you need to know is
the pointer to the first element of the (sub)array and its size
(and the same for the global array on rank 0/master).
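
As a rough illustration of the contiguous case, here is a minimal sketch of the bookkeeping that MPI_Gatherv (or MPI_Scatterv) would need. It assumes a global array a[nx][ny][nz] of doubles in C row-major order, split across X into slabs of whole YZ planes. The helper name slab_counts_displs and the variable names are hypothetical, not taken from Derek's code.

```c
#include <assert.h>

/* Hypothetical helper (not from the original code): fill counts[] and
 * displs[] for a global array a[nx][ny][nz] stored in C row-major order
 * and split across X into contiguous slabs of whole YZ planes.
 * Both arrays are in units of array elements. */
void slab_counts_displs(int nx, int ny, int nz, int nranks,
                        int *counts, int *displs)
{
    assert(nranks > 0 && nx >= nranks);
    int base = nx / nranks;   /* planes that every rank gets */
    int extra = nx % nranks;  /* the first 'extra' ranks get one more */
    int offset = 0;
    for (int r = 0; r < nranks; r++) {
        int planes = base + (r < extra ? 1 : 0);
        counts[r] = planes * ny * nz;  /* elements owned by rank r */
        displs[r] = offset;            /* where rank r's slab starts */
        offset += counts[r];
    }
}

/* Assumed usage, with 'local' and 'global' as buffers of doubles:
 *   MPI_Gatherv(local, counts[rank], MPI_DOUBLE,
 *               global, counts, displs, MPI_DOUBLE,
 *               0, MPI_COMM_WORLD);
 */
```

Because each slab is one contiguous run of memory, no derived datatype is involved; the counts and displacements are the only bookkeeping.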

If the domain decomposition cuts across the array memory layout,
you may need to define an MPI vector type, with strides, etc,
and use it in the MPI functions above, which again can be called
only once.
With an MPI vector type it is a bit more work and bookkeeping,
but not too hard.
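
To make the strided case concrete, here is a small sketch of the three numbers one would pass to MPI_Type_vector (count, blocklength, stride). It assumes, purely for illustration, a row-major array a[nx][ny][nz] of doubles and a sub-domain covering all X, a range of Y, and all Z; the actual layout in Derek's code may differ. The names vec_params, y_slab_vector and pack_strided are hypothetical, and the pack loop only mimics in plain C which elements such a type would select.

```c
#include <assert.h>

/* Hypothetical record mirroring the arguments of
 * MPI_Type_vector(count, blocklength, stride, oldtype, &newtype),
 * plus the element offset of the first block in the global array. */
typedef struct {
    int count;        /* number of blocks */
    int blocklength;  /* elements per block */
    int stride;       /* elements between starts of consecutive blocks */
    int offset;       /* index of the first selected element */
} vec_params;

/* Sub-domain [all X][y0 .. y0+sub_ny)[all Z] of a[nx][ny][nz]. */
vec_params y_slab_vector(int nx, int ny, int nz, int y0, int sub_ny)
{
    assert(y0 >= 0 && sub_ny > 0 && y0 + sub_ny <= ny);
    vec_params p;
    p.count = nx;                 /* one block per X plane */
    p.blocklength = sub_ny * nz;  /* contiguous run inside a plane */
    p.stride = ny * nz;           /* size of one full YZ plane */
    p.offset = y0 * nz;           /* skip the first y0 rows of plane 0 */
    return p;
}

/* Copy the selected elements into a contiguous buffer 'out', which
 * must hold count * blocklength doubles. */
void pack_strided(const double *global, vec_params p, double *out)
{
    int k = 0;
    for (int b = 0; b < p.count; b++)
        for (int i = 0; i < p.blocklength; i++)
            out[k++] = global[p.offset + b * p.stride + i];
}
```

With MPI, the same selection would be described once with MPI_Type_vector and MPI_Type_commit, then used with a count of 1 starting at &global[p.offset]. When such a type is used in a collective like MPI_Gatherv, its extent usually has to be adjusted as well (for example with MPI_Type_create_resized), which is part of the extra bookkeeping.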

This master/slave I/O pattern is quite common,
and admittedly old fashioned, since it doesn't take advantage of MPI-IO.
However, it is a reliable workhorse,
particularly if you have a plain NFS
mounted file system (as opposed to a parallel file system).

I hope this helps.

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


Cole, Derek E wrote:

Hi all,
I posted before about doing a domain decomposition on a 3D array in C, and this is sort of a follow up to that. I was able to get the calculations working correctly by performing the calculations on XZ sub-domains for all Y dimensions of the space. I think someone referred to this as a “book” in the space. Being that I now have an X starting and ending point, a Z starting and ending point, and a total number of X and Z points to visit in each direction during the computation, I am now at another hanging point. First, some background.
I am working on modifying a code that was originally written to be run serially. That being said, there is a massive amount of object oriented crap that is making this a total nightmare to work on. All of the properties that are computed for each point in the 3D mesh are stored in structures, and those structures are stored in structures, blah blah, it looks very gross. In order to speed this code up, I was able to pull out the most computationally sensitive property (potential) and get it set up in this 3D array that is allocated nicely, etc. The problem is, this code eventually outputs after all the iterations to a Tecplot format. The code to do this is very, very contrived.
My idea was to, for the sake of wanting to move on, stuff back all of these XZ subdomains that I have calculated into a single array on the first processor, so it can go about its way and do the file output on the WHOLE domain. I seem to be having problems though, extracting out these SubX * SubZ * Y sized portions of the original that can be sent to the first processor. Does anyone have any examples anywhere of code that does something like that? It appears that my 3D mesh is in X major format in memory, so I tried to create some loops to extract Y, SubZ sized columns of X to send back to the zero’th processor but I haven’t had much luck yet.
Any tips are appreciated…thanks!


------------------------------------------------------------------------

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
