No. It fixes an issue with correctly rebuilding the datatype on the remote side (i.e., with the real displacements), but it didn't fix the wrong-values problem.

  george.

On Dec 13, 2008, at 07:59 , Jeff Squyres wrote:

George -- you had a commit after this (r20123) -- did that fix the problem?


On Dec 12, 2008, at 8:14 PM, George Bosilca wrote:

Dorian,

I looked into this again. So far I can confirm that the datatype is correctly created and always contains the correct values (internally). If you use send/recv instead of one-sided communication, the output is exactly what you expect. With one-sided there are several strange things. What I can say so far is that everything works fine, except when the block-indexed datatype is used as the remote datatype in the MPI_Put operation. In this case the remote memory is not modified.
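To make that concrete, here is a minimal sketch of the failing pattern; the buffer sizes, the double3 element type, and the displacements are my assumptions based on Dorian's description, not the exact test code:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) { MPI_Finalize(); return 1; }   /* needs 2 ranks */

    /* element type: 3 doubles (assumed, matching the double3 rows
       printed in the dumps below) */
    MPI_Datatype double3, blockt;
    MPI_Type_contiguous(3, MPI_DOUBLE, &double3);
    MPI_Type_commit(&double3);

    int displs[4] = {3, 2, 1, 0};   /* assumed displacements */
    MPI_Type_create_indexed_block(4, 1, displs, double3, &blockt);
    MPI_Type_commit(&blockt);

    double mem[12] = {0};           /* window buffer: 4 x double3 */
    double src[12] = {0};           /* contiguous origin buffer */

    MPI_Win win;
    MPI_Win_create(mem, sizeof(mem), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    if (rank == 0)
        /* blockt as the TARGET datatype: the reported failure mode;
           the same transfer via send/recv works as expected */
        MPI_Put(src, 12, MPI_DOUBLE, 1, 0, 1, blockt, win);
    MPI_Win_fence(0, win);

    MPI_Win_free(&win);
    MPI_Type_free(&blockt);
    MPI_Type_free(&double3);
    MPI_Finalize();
    return 0;
}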

george.


On Dec 12, 2008, at 08:20 , Dorian Krause wrote:

Hi again.

I adapted my test program by overwriting the window buffer completely with 1. This allows me to see at which places Open MPI writes.
The result is:

*** -DO1=1 -DV1=1 *** (displ 3,2,1,0 , MPI_Type_create_indexed_block)
mem[0] = {  0.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000,  0.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {     nan,     nan,     nan}
mem[4] = {     nan,     nan,     nan}
mem[5] = {     nan,     nan,     nan}
mem[6] = {     nan,     nan,     nan}
mem[7] = {     nan,     nan,     nan}
mem[8] = {     nan,     nan,     nan}
mem[9] = {     nan,     nan,     nan}
*** -DO1=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
mem[0] = {  0.0000,  1.0000,  2.0000}
mem[1] = {  3.0000,  4.0000,  5.0000}
mem[2] = {  6.0000,  7.0000,  8.0000}
mem[3] = {  9.0000, 10.0000, 11.0000}
mem[4] = {     nan,     nan,     nan}
mem[5] = {     nan,     nan,     nan}
mem[6] = {     nan,     nan,     nan}
mem[7] = {     nan,     nan,     nan}
mem[8] = {     nan,     nan,     nan}
mem[9] = {     nan,     nan,     nan}
*** -DO2=1 -DV1=1 *** (displ 0,1,2,3 , MPI_Type_create_indexed_block)
mem[0] = {  0.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000,  0.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {  0.0000,  0.0000,  0.0000}
mem[4] = {     nan,     nan,     nan}
mem[5] = {     nan,     nan,     nan}
mem[6] = {     nan,     nan,     nan}
mem[7] = {     nan,     nan,     nan}
mem[8] = {     nan,     nan,     nan}
mem[9] = {     nan,     nan,     nan}
*** -DO2=1 -DV2=1 *** MPI_Type_contiguous(4, mpi_double3, &mpit)
mem[0] = {  0.0000,  1.0000,  2.0000}
mem[1] = {  3.0000,  4.0000,  5.0000}
mem[2] = {  6.0000,  7.0000,  8.0000}
mem[3] = {  9.0000, 10.0000, 11.0000}
mem[4] = {     nan,     nan,     nan}
mem[5] = {     nan,     nan,     nan}
mem[6] = {     nan,     nan,     nan}
mem[7] = {     nan,     nan,     nan}
mem[8] = {     nan,     nan,     nan}
mem[9] = {     nan,     nan,     nan}

Note that for the reversed ordering (3,2,1,0) only 3 lines are written. If I use displacements 3,2,1,8, I get

*** -DO1=1 -DV1=1 ***
mem[0] = {  0.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000,  0.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {     nan,     nan,     nan}
mem[4] = {     nan,     nan,     nan}
mem[5] = {     nan,     nan,     nan}
mem[6] = {     nan,     nan,     nan}
mem[7] = {     nan,     nan,     nan}
mem[8] = {  0.0000,  0.0000,  0.0000}
mem[9] = {     nan,     nan,     nan}

but 3,2,8,1 yields

*** -DO1=1 -DV1=1 ***
mem[0] = {  0.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000,  0.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {     nan,     nan,     nan}
mem[4] = {     nan,     nan,     nan}
mem[5] = {     nan,     nan,     nan}
mem[6] = {     nan,     nan,     nan}
mem[7] = {     nan,     nan,     nan}
mem[8] = {     nan,     nan,     nan}
mem[9] = {     nan,     nan,     nan}

Dorian


-----Original Message-----
From: "Dorian Krause" <doriankra...@web.de>
Sent: 12.12.08 13:49:25
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] Onesided + derived datatypes


Thanks George (and Brian :)).

The MPI_Put error is gone. Did you take a look at the problem that the put doesn't work with the block-indexed type? I'm still getting the following output (V1 corresponds to the datatype created with MPI_Type_create_indexed_block, while the V2 type is created with MPI_Type_contiguous; the ordering doesn't matter anymore after your fix). This confuses me, because I remember that (on one machine) MPI_Put with MPI_Type_create_indexed worked until the invalid-datatype error showed up (after a couple of timesteps).

*** -DO1=1 -DV1=1 ***
mem[0] = {  0.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000,  0.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {  0.0000,  0.0000,  0.0000}
mem[4] = {  0.0000,  0.0000,  0.0000}
mem[5] = {  0.0000,  0.0000,  0.0000}
mem[6] = {  0.0000,  0.0000,  0.0000}
mem[7] = {  0.0000,  0.0000,  0.0000}
mem[8] = {  0.0000,  0.0000,  0.0000}
mem[9] = {  0.0000,  0.0000,  0.0000}
*** -DO1=1 -DV2=1 ***
mem[0] = {  5.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000, -1.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {  0.0000,  0.0000,  0.0000}
mem[4] = {  0.0000,  0.0000,  0.0000}
mem[5] = {  0.0000,  0.0000,  0.0000}
mem[6] = {  0.0000,  0.0000,  0.0000}
mem[7] = {  0.0000,  0.0000,  0.0000}
mem[8] = {  0.0000,  0.0000,  0.0000}
mem[9] = {  0.0000,  0.0000,  0.0000}
*** -DO2=1 -DV1=1 ***
mem[0] = {  0.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000,  0.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {  0.0000,  0.0000,  0.0000}
mem[4] = {  0.0000,  0.0000,  0.0000}
mem[5] = {  0.0000,  0.0000,  0.0000}
mem[6] = {  0.0000,  0.0000,  0.0000}
mem[7] = {  0.0000,  0.0000,  0.0000}
mem[8] = {  0.0000,  0.0000,  0.0000}
mem[9] = {  0.0000,  0.0000,  0.0000}
*** -DO2=1 -DV2=1 ***
mem[0] = {  5.0000,  0.0000,  0.0000}
mem[1] = {  0.0000,  0.0000, -1.0000}
mem[2] = {  0.0000,  0.0000,  0.0000}
mem[3] = {  0.0000,  0.0000,  0.0000}
mem[4] = {  0.0000,  0.0000,  0.0000}
mem[5] = {  0.0000,  0.0000,  0.0000}
mem[6] = {  0.0000,  0.0000,  0.0000}
mem[7] = {  0.0000,  0.0000,  0.0000}
mem[8] = {  0.0000,  0.0000,  0.0000}
mem[9] = {  0.0000,  0.0000,  0.0000}
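For concreteness, a sketch of the two type constructions compared above; the names (double3, make_type) and the unit blocklength are assumptions based on the -DV1/-DV2 labels:

#include <mpi.h>

/* Build either the V1 (block-indexed) or the V2 (contiguous) variant.
   With displacements 0,1,2,3 both variants should describe the same
   contiguous layout of 4 x double3. */
static MPI_Datatype make_type(int use_v1)
{
    MPI_Datatype double3, t;
    MPI_Type_contiguous(3, MPI_DOUBLE, &double3);
    MPI_Type_commit(&double3);

    if (use_v1) {
        int displs[4] = {0, 1, 2, 3};
        MPI_Type_create_indexed_block(4, 1, displs, double3, &t);
    } else {
        MPI_Type_contiguous(4, double3, &t);
    }
    MPI_Type_commit(&t);
    MPI_Type_free(&double3);   /* t keeps its own reference */
    return t;
}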


Thanks for your help.

Dorian


-----Original Message-----
From: "George Bosilca" <bosi...@eecs.utk.edu>
Sent: 12.12.08 01:35:57
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] Onesided + derived datatypes


Dorian,

You are right, the datatype generated using the block-indexed function is a legal datatype. We wrongly detected some overlapping regions in its description (which would be illegal according to the MPI standard). Since detecting such overlapping regions is a very expensive process if we want to avoid any false positives (such as your datatype), I prefer to remove the check completely.
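To see why the type is legal: with blocklength 1 and distinct displacements, the blocks cover pairwise disjoint byte ranges even when they are unordered. A quick check, assuming a 3-double element type:

#include <stdio.h>

int main(void)
{
    const int displs[4] = {3, 2, 1, 0};
    const long ext = 3 * sizeof(double);   /* extent of one element: 24 bytes */

    /* print the byte range each block of the indexed_block type covers */
    for (int i = 0; i < 4; ++i)
        printf("block %d covers bytes [%ld, %ld)\n",
               i, displs[i] * ext, (displs[i] + 1) * ext);
    /* four disjoint ranges: [72,96), [48,72), [24,48), [0,24) -> no overlap */
    return 0;
}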

To keep it short: I just committed a patch (r20120) to the trunk, and I'll take care of moving it into 1.3 and 1.2.9.

Thanks for your help,
  george.

On Dec 10, 2008, at 18:07 , doriankrause wrote:

Hi List,

I have an MPI program which uses one-sided communication with derived datatypes (MPI_Type_create_indexed_block). I developed the code with MPICH2 and unfortunately didn't think to try it with Open MPI. Now that I'm "porting" the application to Open MPI, I'm facing some problems. On most machines I get a SIGSEGV in MPI_Win_fence; sometimes an invalid-datatype error shows up. I ran the program in Valgrind and didn't get anything valuable. Since I can't see a reason for this problem (at least if I understand the standard correctly), I wrote the attached test program.

Here are my experiences:

* If I compile without ONESIDED defined, everything works and V1 and V2 give the same results.
* If I compile with ONESIDED and V2 defined (MPI_Type_contiguous), it works.
* ONESIDED + V1 + O2: No errors, but obviously nothing is sent? (Am I right in assuming that V1+O2 and V2 should be equivalent?)
* ONESIDED + V1 + O1:
[m02:03115] *** An error occurred in MPI_Put
[m02:03115] *** on win
[m02:03115] *** MPI_ERR_TYPE: invalid datatype
[m02:03115] *** MPI_ERRORS_ARE_FATAL (goodbye)
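
For readers without the attachment, here is a hypothetical skeleton of what ompitest.cc does, reconstructed from the compile flags above (O1/O2 select the displacement ordering, V1/V2 select the datatype); names and buffer sizes are assumptions and the real test may differ:

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double mem[30];                 /* 10 x double3, as in the dumps */
    double src[12];                 /* 4 x double3, values 0..11 */
    for (int i = 0; i < 30; ++i) mem[i] = 1;
    for (int i = 0; i < 12; ++i) src[i] = i;

    MPI_Datatype double3, mpit;
    MPI_Type_contiguous(3, MPI_DOUBLE, &double3);
    MPI_Type_commit(&double3);

#ifdef O1
    int displs[4] = {3, 2, 1, 0};   /* reversed ordering */
#else                               /* O2 */
    int displs[4] = {0, 1, 2, 3};   /* ascending ordering */
#endif

#ifdef V1
    MPI_Type_create_indexed_block(4, 1, displs, double3, &mpit);
#else                               /* V2: same layout as V1 with O2 */
    (void)displs;                   /* displacements unused here */
    MPI_Type_contiguous(4, double3, &mpit);
#endif
    MPI_Type_commit(&mpit);

#ifdef ONESIDED
    MPI_Win win;
    MPI_Win_create(mem, sizeof(mem), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_fence(0, win);
    if (rank == 0)
        MPI_Put(src, 12, MPI_DOUBLE, 1, 0, 1, mpit, win);
    MPI_Win_fence(0, win);
    MPI_Win_free(&win);
#else
    if (rank == 0)
        MPI_Send(src, 12, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(mem, 1, mpit, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
#endif

    MPI_Type_free(&mpit);
    MPI_Type_free(&double3);
    MPI_Finalize();
    return 0;
}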

I didn't get a segfault as in the "real life" example, but if ompitest.cc is correct, it means that Open MPI is buggy when it comes to one-sided communication and (some) derived datatypes, so the problem is probably not in my code.

I'm using Open MPI 1.2.8 with the newest gcc, 4.3.2, but the same behaviour can be seen with gcc 3.3.1 and Intel 10.1.

Please correct me if ompitest.cc contains errors. Otherwise I would be glad to hear how I should report these problems to the developers (if they don't read this).

Thanks + best regards

Dorian




<ompitest.tar.gz>

--
Jeff Squyres
Cisco Systems

