Re: [OMPI devel] New Romio for OpenMPI available in bitbucket

2010-09-22 Thread Pascal Deveze

Jeff Squyres wrote:

On Sep 17, 2010, at 6:36 AM, Pascal Deveze wrote:

  

In charge of ticket 1888 (see at 
https://svn.open-mpi.org/trac/ompi/ticket/1888) ,
I have put the resulting code in bitbucket at:
http://bitbucket.org/devezep/new-romio-for-openmpi/



Sweet!

  

The work in this repo consisted of refreshing ROMIO to a newer
version: the one from the latest MPICH2 release (mpich2-1.3b1).



Great!  I saw there was another MPICH2 release, and I saw a ROMIO patch or 
three go by on the MPICH list recently.  Do you expect there to be major 
differences between what you have and those changes?

  
I also see this new release (mpich2-1.3rc1). I am going to apply those
modifications and inform the list.

I don't have any parallel filesystems to test with, but if someone else in the 
community could confirm/verify at least one or two of the parallel filesystems 
supported in ROMIO, I think we should bring this stuff into the trunk soon.

  

Testing:
1. runs fine except one minor error (see the explanation below) on various FS.
2. runs fine with Lustre, but:
   . had to add a small patch in romio/adio/ad_lustre_open.c



Did this patch get pushed upstream?

  
This patch was integrated yesterday into mpich2-1.3rc1, together with another
patch in romio/adio/common/lock.c. Both will be available very soon in
bitbucket.

=== The minor error ===
The test error.c fails because OpenMPI does not correctly handle the
"two level" error functions of ROMIO:
  error_code = MPIO_Err_create_code(MPI_SUCCESS, MPIR_ERR_RECOVERABLE,
  myname, __LINE__, MPI_ERR_ARG,
  "**iobaddisp", 0);
OpenMPI limits its view to MPI_ERR_ARG, but the real error is "**iobaddisp".



Do you mean that we should be returning an error string "**iobaddisp" instead of 
"MPI_ERR_ARG"?

  
In MPICH2, there is a file mpi/errhan/errnames.txt from which
mpi/errhan/errnames.h is generated; it maps codes
like "**iobaddisp" to the corresponding error string "Invalid
displacement argument".
The error.c test program checks for the presence of "displacement" in the error
string.


With OpenMPI, the error message is:
" MPI_ERR_ARG: invalid argument of some other kind"

With MPICH2, the error message is:
"Invalid argument, error stack:
MPI_FILE_SET_VIEW(60): Invalid displacement argument"

It would be better if OpenMPI displayed at least the "Invalid
displacement argument" message.
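
For illustration, here is a minimal sketch of the kind of check the ROMIO error.c test performs (this is not the actual test source, and the file name /tmp/testfile is just an assumption): trigger the bad-displacement error, then look for "displacement" in the string returned by MPI_Error_string:

===
#include <stdio.h>
#include <string.h>
#include "mpi.h"

int main(int argc, char **argv) {
    MPI_File fh;
    char msg[MPI_MAX_ERROR_STRING];
    int err, len;

    MPI_Init(&argc, &argv);
    MPI_File_open(MPI_COMM_WORLD, "/tmp/testfile",
                  MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
    MPI_File_set_errhandler(fh, MPI_ERRORS_RETURN);

    /* A negative displacement makes ROMIO raise the "**iobaddisp" error */
    err = MPI_File_set_view(fh, -1, MPI_BYTE, MPI_BYTE, "native", MPI_INFO_NULL);

    MPI_Error_string(err, msg, &len);
    if (strstr(msg, "displacement") == NULL)
        printf("unexpected error string: %s\n", msg);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
===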

This is not a new problem in OpenMPI; it was already the case in the trunk.



Re: [OMPI devel] New Romio for OpenMPI available in bitbucket

2010-09-22 Thread Pascal Deveze
I just committed the latest ROMIO modifications (mpich2-1.3rc1)
into bitbucket.


Pascal

Jeff Squyres wrote:

On Sep 17, 2010, at 6:36 AM, Pascal Deveze wrote:

  

In charge of ticket 1888 (see at 
https://svn.open-mpi.org/trac/ompi/ticket/1888) ,
I have put the resulting code in bitbucket at:
http://bitbucket.org/devezep/new-romio-for-openmpi/



Sweet!

  

The work in this repo consisted of refreshing ROMIO to a newer
version: the one from the latest MPICH2 release (mpich2-1.3b1).



Great!  I saw there was another MPICH2 release, and I saw a ROMIO patch or 
three go by on the MPICH list recently.  Do you expect there to be major 
differences between what you have and those changes?

I don't have any parallel filesystems to test with, but if someone else in the 
community could confirm/verify at least one or two of the parallel filesystems 
supported in ROMIO, I think we should bring this stuff into the trunk soon.

  

Testing:
1. runs fine except one minor error (see the explanation below) on various FS.
2. runs fine with Lustre, but:
   . had to add a small patch in romio/adio/ad_lustre_open.c



Did this patch get pushed upstream?

  

=== The minor error ===
The test error.c fails because OpenMPI does not correctly handle the
"two level" error functions of ROMIO:
  error_code = MPIO_Err_create_code(MPI_SUCCESS, MPIR_ERR_RECOVERABLE,
  myname, __LINE__, MPI_ERR_ARG,
  "**iobaddisp", 0);
OpenMPI limits its view to MPI_ERR_ARG, but the real error is "**iobaddisp".



Do you mean that we should be returning an error string "**iobaddisp" instead of 
"MPI_ERR_ARG"?

  




[OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-11-10 Thread Pascal Deveze

WHAT: Port the latest ROMIO version from MPICH2-1.3 into the trunk.

WHY: There is considerable interest in updating the ROMIO branch that
was ported from mpich2-1.0.7.


WHERE: ompi/mca/io/romio/

WHEN: Before 1.5.2, so asap

TIMEOUT: Next Tuesday teleconf, 23 Nov 2010

-

I am in charge of ticket 1888 (see
https://svn.open-mpi.org/trac/ompi/ticket/1888).
I have made the ROMIO port available in bitbucket since September
17th, 2010 (http://bitbucket.org/devezep/new-romio-for-openmpi/).
So far I have not received any feedback on this port, and it is now time to
bring it into the trunk.

All modified files are located under the romio subtree.

Pascal Devèze



Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-11-24 Thread Pascal Deveze

Hi Jeff,

Here is the unified diff.
As only the romio subtree is modified, I ran the following commands:
 diff -u -r -x .svn ompi-trunk/ompi/mca/io/romio/romio/ 
NEW-ROMIO-FOR-OPENMPI/ompi/mca/io/romio/romio/ > DIFF_UPDATE

 tar cvzf DIFF_UPDATE.TGZ DIFF_UPDATE

Compilation is OK. I ran the ROMIO tests.

There are a few new modifications that are not in bitbucket. I think it 
is not necessary to update bitbucket 
(http://bitbucket.org/devezep/new-romio-for-openmpi/ ).


Pascal

Jeff Squyres wrote:

Thanks Pascal!

Is there any chance you could send a unified diff of the tip of your hg vs. the 
SVN trunk HEAD?

E.g., if you have an hg+ssh combo tree, could you "hg up" in there to get all your work, and 
then "svn diff > diff.out" and then compress and send the diff.out?

Thanks!



On Nov 10, 2010, at 8:43 AM, Pascal Deveze wrote:

  

WHAT: Port the latest ROMIO version from MPICH2-1.3 into the trunk.

WHY: There is a considerable interest in updating the ROMIO branch that was 
ported from mpich2-1.0.7

WHERE: ompi/mca/io/romio/

WHEN: Before 1.5.2, so asap

TIMEOUT: Next Tuesday teleconf, 23 Nov 2010

-

I am in charge of ticket 1888 (see at 
https://svn.open-mpi.org/trac/ompi/ticket/1888).
I have made the porting of ROMIO available in bitbucket since September 17th 
2010. (http://bitbucket.org/devezep/new-romio-for-openmpi/ )
Until now, I do not have any report on this porting and it's now time to bring 
it into the trunk.
All modified files are located under the romio subtree.

Pascal Devèze





  




[Attachment: DIFF_UPDATE.TGZ (application/compressed)]


Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-11-29 Thread Pascal Deveze

Jeff,

The last changes are not committed back to bitbucket; I thought that was
not necessary. Would you like me to update bitbucket as well? If so, I
will do it.


Applying the diff to a local copy of the trunk, you should be able to
generate a library with the new ROMIO.


Pascal

Jeff Squyres wrote:
Great!  


Are those final changes committed back to the bitbucket?  If so, I'll give it a 
whirl.


On Nov 24, 2010, at 10:48 AM, Pascal Deveze wrote:

  

Hi Jeff,

Here is the unified diff.
As only the romio subtree is modified, I made the following command:
  diff -u -r -x .svn ompi-trunk/ompi/mca/io/romio/romio/ 
NEW-ROMIO-FOR-OPENMPI/ompi/mca/io/romio/romio/ > DIFF_UPDATE
  tar cvzf DIFF_UPDATE.TGZ DIFF_UPDATE

Compilation is OK. I run the ROMIO tests.

There are a few new modifications that are not in bitbucket. I think it is not 
necessary to update bitbucket 
(http://bitbucket.org/devezep/new-romio-for-openmpi/ ).

Pascal
 
Jeff Squyres wrote:


Thanks Pascal!

Is there any chance you could send a unified diff of the tip of your hg vs. the 
SVN trunk HEAD?

E.g., if you have an hg+ssh combo tree, could you "hg up" in there to get all your work, and 
then "svn diff > diff.out" and then compress and send the diff.out?

Thanks!



On Nov 10, 2010, at 8:43 AM, Pascal Deveze wrote:

  

  

WHAT: Port the latest ROMIO version from MPICH2-1.3 into the trunk.

WHY: There is a considerable interest in updating the ROMIO branch that was 
ported from mpich2-1.0.7

WHERE: ompi/mca/io/romio/

WHEN: Before 1.5.2, so asap

TIMEOUT: Next Tuesday teleconf, 23 Nov 2010

-

I am in charge of ticket 1888 (see at 
https://svn.open-mpi.org/trac/ompi/ticket/1888

).
I have made the porting of ROMIO available in bitbucket since September 17th 
2010. (
http://bitbucket.org/devezep/new-romio-for-openmpi/
 )
Until now, I do not have any report on this porting and it's now time to bring 
it into the trunk.
All modified files are located under the romio subtree.

Pascal Devèze





  

  





  




Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-11-30 Thread Pascal Deveze

Hi Jeff,

Thanks for having a look at my unified diff (see comments in the text).

I have committed all my latest changes to bitbucket, including those that
follow.



Pascal

Jeff Squyres wrote:

Some questions about the patch:

configure.in:

@@ -2002,9 +1987,8 @@
# Turn off the building of the Fortran interface and the Info routines
EXTRA_DIRS=""
AC_DEFINE(HAVE_STATUS_SET_BYTES,1,[Define if status_set_bytes available])
-   DEFINE_HAVE_MPI_GREQUEST="#define HAVE_MPI_GREQUEST"
-   # Add the MPICH2_INCLUDE_FLAGS to CPPFLAGS
-   CPPFLAGS="$CPPFLAGS $MPICH2_INCLUDE_FLAGS"
+   DEFINE_HAVE_MPI_GREQUEST="#define HAVE_MPI_GREQUEST 1"
+   AC_DEFINE(HAVE_MPIU_FUNCS,1,[Define if MPICH2 memory tracing macros 
defined])
  
 fi

 #
 #

Do we have the MPIU functions?  Or is that an MPICH2-specific thing?
  


I have commented out this last "AC_DEFINE":
# Open MPI does not have the MPIU functions
# AC_DEFINE(HAVE_MPIU_FUNCS,1,[Define if MPICH2 memory tracing macros 
defined])



I see that you moved confdb/aclocal_cc.m4 to acinclude.m4.  Shouldn't we just -I 
confdb instead to get all of their .m4 files?

  

This was done during the previous port (years ago).
I have now changed this: all confdb/*.m4 files are now copied from
MPICH2. Only the definition of PAC_FUNC_NEEDS_DECL is still kept in
acinclude.m4.

If I do not do so, configure still blocks on this macro.
Everything seems to work this way. If you have any clue about this, I will take it!


In mpipr.h, why remove the #if 0?

-/* Open MPI: these functions are not supposed to be profiled */
-#if 0
 #undef MPI_Wtick
 #define MPI_Wtick PMPI_Wtick
 #undef MPI_Wtime
 #define MPI_Wtime PMPI_Wtime
-#endif

  


OK, I put the #if 0 back.


In configure.in, please update the version number in AM_INIT_AUTOMAKE.
  


AM_INIT_AUTOMAKE(io-romio, 1.0.0, 'no')
is changed to
AM_INIT_AUTOMAKE(io-romio, 1.0.1, 'no')


I thought there was one other thing that I saw, but I can't recall it right 
now...

This is just from looking at your diff; I didn't try to run it yet because you 
said there were some things that weren't pushed back up to bitbucket yet.





On Nov 24, 2010, at 10:48 AM, Pascal Deveze wrote:

  

Hi Jeff,

Here is the unified diff.
As only the romio subtree is modified, I made the following command:
  diff -u -r -x .svn ompi-trunk/ompi/mca/io/romio/romio/ 
NEW-ROMIO-FOR-OPENMPI/ompi/mca/io/romio/romio/ > DIFF_UPDATE
  tar cvzf DIFF_UPDATE.TGZ DIFF_UPDATE

Compilation is OK. I run the ROMIO tests.

There are a few new modifications that are not in bitbucket. I think it is not 
necessary to update bitbucket 
(http://bitbucket.org/devezep/new-romio-for-openmpi/ ).

Pascal
 
Jeff Squyres wrote:


Thanks Pascal!

Is there any chance you could send a unified diff of the tip of your hg vs. the 
SVN trunk HEAD?

E.g., if you have an hg+ssh combo tree, could you "hg up" in there to get all your work, and 
then "svn diff > diff.out" and then compress and send the diff.out?

Thanks!



On Nov 10, 2010, at 8:43 AM, Pascal Deveze wrote:

  

  

WHAT: Port the latest ROMIO version from MPICH2-1.3 into the trunk.

WHY: There is a considerable interest in updating the ROMIO branch that was 
ported from mpich2-1.0.7

WHERE: ompi/mca/io/romio/

WHEN: Before 1.5.2, so asap

TIMEOUT: Next Tuesday teleconf, 23 Nov 2010

-

I am in charge of ticket 1888 (see at 
https://svn.open-mpi.org/trac/ompi/ticket/1888

).
I have made the porting of ROMIO available in bitbucket since September 17th 
2010. (
http://bitbucket.org/devezep/new-romio-for-openmpi/
 )
Until now, I do not have any report on this porting and it's now time to bring 
it into the trunk.
All modified files are located under the romio subtree.

Pascal Devèze





  

  





  




Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-12-01 Thread Pascal Deveze

Hi Jeff,

Comments are in the text

Jeff Squyres wrote:

On Nov 30, 2010, at 6:44 AM, Pascal Deveze wrote:

  

I have commited all my last changes in bitbucket, including those that follows.



I got a checkout, and still have some problems/questions.  More below.

If you do the IM thing, ping me on IM (I sent you my IDs in an off-list email).

  

I am not on AIM or Google Talk, sorry. If you think it is
necessary, I could ask for an ID.
We could also continue by email.


Do we have the MPIU functions?  Or is that an MPICH2-specific thing?
  

I have put in comments this last "AC_DEFINE":
# Open MPI does not have the MPIU functions
# AC_DEFINE(HAVE_MPIU_FUNCS,1,[Define if MPICH2 memory tracing macros defined]) 



Good.

  

I see that you moved confdb/aclocal_cc.m4 to acinclude.m4.  Shouldn't we just -I 
confdb instead to get all of their .m4 files?

  

This has been done during the last porting (years ago).
I have now changed this: All confdb/.m4 files are now copied from MPICH2. Only 
the definition of PAC_FUNC_NEEDS_DECL is still kept in acinclude.m4.
If I do not so, configure is still blocking on this macro.
All seems working well so. If you have any clue about this, I will take it !



I see that we have the whole romio/confdb directory, so it seems like we should 
use that tree rather than copy to acinclude.m4.
  
I agree with you. But, as I said, I have a problem with the macro 
PAC_FUNC_NEEDS_DECL and the only way to solve it is to put it in 
acinclude.m4.

But I note that when I get an hg clone of your repo:

- there's no .hgignore file -- making "hg status" difficult.  In your SVN+HG 
tree, can you run ./contrib/hg/build-hgignore.pl and commit/push the resulting .hgignore? 
 That would be most helpful.
  

I have done it, and pushed.

- ompi/mca/io/romio/romio/adio/include/romioconf.h.in is in the hg repo, but 
should not be (it's generated).
  

I removed it and pushed the modification.
- I don't see a romio/acinclude.m4 file in the repo, so whatever you did there doesn't show up for me.  
  

I see the file romio/romio/acinclude.m4 in bitbucket:
http://bitbucket.org/devezep/new-romio-for-openmpi/src/f06f1a24c75b/ompi/mca/io/romio/romio/acinclude.m4


- I tried to add an ompi/mca/io/romio/romio/autogen.sh executable file that 
contained:

:
autoreconf -ivf -I confdb

and that seems to make everything work.  Can you confirm/double check?

  
Yes, I tried what you suggested (without acinclude.m4); it seems that
everything works:

autoreconf -ivf -I confdb
autoreconf: Entering directory `.'
autoreconf: configure.in: not using Gettext
autoreconf: running: aclocal -I confdb --force
autoreconf: configure.in: tracing
autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `confdb'.
libtoolize: copying file `confdb/ltmain.sh'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.in and
libtoolize: rerunning libtoolize, to keep the correct libtool macros 
in-tree.

libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
libtoolize: `AC_PROG_RANLIB' is rendered obsolete by `LT_INIT'
autoreconf: running: /homes/openmpi/tools/2010-10-12/bin/autoconf 
--include=confdb --force
autoreconf: running: /homes/openmpi/tools/2010-10-12/bin/autoheader 
--include=confdb --force

autoreconf: running: automake --add-missing --copy --force-missing
autoreconf: Leaving directory `.'

If I try to generate the whole MPI, autogen.sh works but configure fails 
in the romio directory.

If I try your autoreconf, then it works for ROMIO.
=== This does not work without acinclude.m4 ===
./autogen.sh
./configure --prefix=$HOME/bitbucket/new-romio-for-openmpi/install 
--disable-ipv6 --with-openib=${OFED_BUILDROOT}/usr 
--enable-openib-connectx-xrc --enable-contrib-no-build=libnbc,vt 
--with-io-romio-flags="CFLAGS=-I$LUSTRE_PATH/usr/include/ 
--with-file-system=ufs+nfs+lustre"



=== This works without acinclude.m4 ===
./autogen.sh
cd ompi/mca/io/romio/romio
autoreconf -ivf -I confdb
cd -
./configure --prefix=$HOME/bitbucket/new-romio-for-openmpi/install 
--disable-ipv6 --with-openib=${OFED_BUILDROOT}/usr 
--enable-openib-connectx-xrc --enable-contrib-no-build=libnbc,vt 
--with-io-romio-flags="CFLAGS=-I$LUSTRE_PATH/usr/include/ 
--with-file-system=ufs+nfs+lustre"


My conclusion is: there is something to change in autogen.sh to deal
with ROMIO (call autoreconf -ivf -I confdb). In that case, the file
acinclude.m4 is no longer useful.

In configure.in, please update the version number in AM_INIT_AUTOMAKE.
  

AM_INIT_AUTOMAKE(io-romio, 1.0.0, 'no')
is changed to
AM_INIT_AUTOMAKE(io-romio, 1.0.1, 'no')



Can we use whatever the real ROMIO version number is?  

  
It seems that the real version is 1.2.6 (see README). So I changed it,
committed and pushed.




Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-12-06 Thread Pascal Deveze

Jeff,

I removed ompi/mca/io/romio/romio/acinclude.m4, put "autoreconf -ivf
-I confdb" in autogen.sh, and ran "chmod +x autogen.sh" (my
stupid error was that this file wasn't executable).
All is now OK.
These modifications have been pushed in bitbucket.

I tried to run the ROMIO tests and got an error in 
ompi/mpi/c/profile/MPI_File_set_errhandler.c:

OBJ_RELEASE(tmp) is triggering an assertion:

pfile_set_errhandler.c:75: PMPI_File_set_errhandler: Assertion 
`((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) 
(tmp))->obj_magic_id' failed.

[cuzco10:10336] *** Process received signal ***
[cuzco10:10336] Signal: Aborted (6)
[cuzco10:10336] Signal code:  (-6)
[cuzco10:10336] [ 0] /lib64/libpthread.so.0() [0x3e8560f440]
[cuzco10:10336] [ 1] /lib64/libc.so.6(gsignal+0x35) [0x3e852329c5]
[cuzco10:10336] [ 2] /lib64/libc.so.6(abort+0x175) [0x3e852341a5]
[cuzco10:10336] [ 3] /lib64/libc.so.6(__assert_fail+0xf5) [0x3e8522b945]
[cuzco10:10336] [ 4] 
/home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0(MPI_File_set_errhandler+0x1e4) 
[0x7fcbee89d1d4]
[cuzco10:10336] [ 5] 
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_close+0x12a) 
[0x7fcbe7dbc4ea]
[cuzco10:10336] [ 6] 
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so(+0x9764) 
[0x7fcbe7d8e764]
[cuzco10:10336] [ 7] 
/home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0(+0x50309) 
[0x7fcbee853309]
[cuzco10:10336] [ 8] 
/home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0(+0x4faa0) 
[0x7fcbee852aa0]
[cuzco10:10336] [ 9] 
/home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0(PMPI_File_close+0xa2) 
[0x7fcbee896832]

[cuzco10:10336] [10] ./a.out(main+0x3a4) [0x402434]
[cuzco10:10336] [11] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3e8521ec5d]
[cuzco10:10336] [12] ./a.out() [0x401fc9]
[cuzco10:10336] *** End of error message ***

I am currently analysing the problem (MPI_File_close() now calls 
MPI_File_set_errhandler()).


Pascal

Jeff Squyres wrote:

On Dec 1, 2010, at 7:35 AM, Pascal Deveze wrote:

  

I am not on AIM nor on google talk. Sorry. In the case you think it is 
necessary, I could ask for an ID.



FWIW.  Many of us find it convenient for quickie/informal discussions.  We can 
keep going here in email and switch to phone if it becomes necessary.
  

I see that we have the whole romio/confdb directory, so it seems like we should 
use that tree rather than copy to acinclude.m4.
  
  

I agree with you. But, as I said, I have a problem with the macro 
PAC_FUNC_NEEDS_DECL and the only way to solve it is to put it in acinclude.m4.



Per below, I think this is now moot -- the romio/autogen.sh script should fix 
this.

  
- there's no .hgignore file -- making "hg status" difficult.  In your SVN+HG tree, can you run ./contrib/hg/build-hgignore.pl and commit/push the resulting .hgignore?  That would be most helpful.  

  

I have done it, and pushed.



Awesome; thanks.

  

- ompi/mca/io/romio/romio/adio/include/romioconf.h.in is in the hg repo, but 
should not be (it's generated).

  

I removed it and pushed the modification.

- I don't see a romio/acinclude.m4 file in the repo, so whatever you did there doesn't show up for me.  

  

I see the file romio/romio/acinclude.m4 in bitbucket:
http://bitbucket.org/devezep/new-romio-for-openmpi/src/f06f1a24c75b/ompi/mca/io/romio/romio/acinclude.m4



Weird.  Ok.  But I think this is now moot.

  

- I tried to add an ompi/mca/io/romio/romio/autogen.sh executable file that 
contained:

:
autoreconf -ivf -I confdb

and that seems to make everything work.  Can you confirm/double check?  

  

Yes I tried what you suggest (without acinclude.m4), it seems that everything 
work:
autoreconf -ivf -I confdb
autoreconf: Entering directory `.'
autoreconf: configure.in: not using Gettext
autoreconf: running: aclocal -I confdb --force 
autoreconf: configure.in: tracing

autoreconf: running: libtoolize --copy --force
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `confdb'.
libtoolize: copying file `confdb/ltmain.sh'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.in and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
libtoolize: `AC_PROG_RANLIB' is rendered obsolete by `LT_INIT'
autoreconf: running: /homes/openmpi/tools/2010-10-12/bin/autoconf 
--include=confdb --force
autoreconf: running: /homes/openmpi/tools/2010-10-12/bin/autoheader 
--include=confdb --force
autoreconf: running: automake --add-missing --copy --force-missing
autoreconf: Leaving directory `.'

If I try to generate the whole MPI, autogen.sh works but configure fails in the 
romio directory.



I'm confused by this statement.  Did you run the top-level autogen.sh first?  
That would sho

Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2010-12-16 Thread Pascal Deveze

Jeff Squyres wrote:

On Dec 16, 2010, at 3:31 AM, Pascal Deveze wrote:

  

I got the assert every time with the following "trivial" code:
#include "mpi.h"



Good; let's add this trivial test to ompi-tests.  Do you guys have a set of 
ROMIO / IO test cases that you run?  I don't think we have many in ompi-tests.
  

I use the tests under romio/test. Does anyone know of other tests?

  

int main(int argc, char **argv) {
  MPI_File fh;
  MPI_Info info, info_used;

  MPI_Init(&argc,&argv);

  MPI_File_open(MPI_COMM_WORLD, "/tmp/A", MPI_MODE_CREATE | MPI_MODE_RDWR, 
MPI_INFO_NULL, &fh);
  MPI_File_close(&fh);

  MPI_File_open(MPI_COMM_WORLD, "/tmp/A", MPI_MODE_CREATE | MPI_MODE_RDWR, 
MPI_INFO_NULL, &fh);
  MPI_File_close(&fh);

  MPI_Finalize();
}

I run this program on one process: salloc -p debug -n1 mpirun -np 1 ./a.out
And I get the assertion error:

a.out: attribute/attribute.c:763: ompi_attr_delete: Assertion `((0xdeafbeedULL << 
32) + 0xdeafbeedULL) == ((opal_object_t *) (keyval))->obj_magic_id' failed.
[cuzco10:24785] *** Process received signal ***
[cuzco10:24785] Signal: Aborted (6)



Ok.

  

I saw that there is a problem with an MPI_COMM_SELF communicator.

The problem disappears (and all ROMIO tests are OK) when I comment line 89 in 
the file ompi/mca/io/romio/romio/adio/common/ad_close.c :
 // MPI_Comm_free(&(fd->comm));

The problem disappears (and all ROMIO tests are OK) when I comment line 425 in 
the file ompi/mca/io/romio/romio/adio/common/cb_config_list.c :
   //  MPI_Keyval_free(&keyval);

The problem also disappears (but only 50% of the ROMIO tests are OK) when I 
comment line 133 in the file ompi/runtime/ompi_mpi_finalize.c:
  // ompi_attr_delete_all(COMM_ATTR, &ompi_mpi_comm_self,
 // ompi_mpi_comm_self.comm.c_keyhash);



It sounds like there's a problem with the ordering of shutdown of things in 
MPI_FINALIZE w.r.t. ROMIO.

FWIW: ROMIO violates some of our abstractions, but it's the price we pay for using a 
3rd party package.  One very, very important abstraction that we have is that 
top-level MPI API functions are not allowed to call any other MPI API functions.  
E.g., MPI_Send (i.e., ompi/mpi/c/send.c) cannot call MPI_Isend (i.e., 
ompi/mpi/c/isend.c).  MPI_Send *can* call the same back-end implementation functions 
that isend does -- it's just not allowed to call MPI_* functions.

The reason is that the top-level MPI API functions do things like check for 
whether MPI_INIT / MPI_FINALIZE have been called, etc.  The back-end functions 
do not do this.  Additionally, top-level MPI API functions may be overridden 
via PMPI kinds of things.  We wouldn't want our internal library calls to get 
intercepted by user code.
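
To make the rule concrete, here is a purely illustrative sketch; my_mpi_send and backend_send are hypothetical names and do not exist in the Open MPI tree:

===
#include "mpi.h"

/* Hypothetical back-end helper, shared by the blocking and nonblocking
 * send paths; in the real tree this would live below the MPI API layer. */
static int backend_send(void *buf, int count, MPI_Datatype datatype,
                        int dest, int tag, MPI_Comm comm)
{
    (void)buf; (void)count; (void)datatype; (void)dest; (void)tag; (void)comm;
    return MPI_SUCCESS;
}

/* The top-level entry point performs the API-level checks (MPI_INIT /
 * MPI_FINALIZE, argument validation) and then calls the back-end helper.
 * It never calls another MPI_* entry point, which would redo those checks
 * and could be intercepted by a user's PMPI wrapper. */
int my_mpi_send(void *buf, int count, MPI_Datatype datatype,
                int dest, int tag, MPI_Comm comm)
{
    return backend_send(buf, count, datatype, dest, tag, comm);
}
===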

  

I am not very familiar with the OBJ_RELEASE/OBJ_RETAIN mechanism, and so far I
do not understand the real origin of the problem.



RETAIN/RELEASE is part of OMPI's "poor man's C++" design.  Way back in the beginning of the project, we debated whether to use C or C++ for developing the code.  There was a desire to use some of the basic object functionality of C++ (e.g., derived classes, constructors, destructors, etc.), but we wanted to stay as portable as possible.  So we ended up going with C, but with a few macros that emulate some C++-like functionality.  This led to OMPI's OBJ system that is used all over the place.


The OBJ system does several things:

- allows you to have "constructor"- and "destructor"-like behavior for structs
- works for both stack and heap memory
- reference counting
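
For reference, here is a hedged sketch (not taken from the Open MPI tree) of how a struct is typically hooked into this system; the macro and header names are the ones from opal/class/opal_object.h:

===
#include "opal/class/opal_object.h"

struct my_object_t {
    opal_object_t super;   /* the "parent class" must be the first member */
    int payload;
};
typedef struct my_object_t my_object_t;

static void my_object_construct(my_object_t *obj) { obj->payload = 0; }
static void my_object_destruct(my_object_t *obj)  { /* release resources here */ }

/* Ties the type to its constructor/destructor so that OBJ_NEW /
 * OBJ_CONSTRUCT and the final OBJ_RELEASE know what to call. */
OBJ_CLASS_INSTANCE(my_object_t, opal_object_t,
                   my_object_construct, my_object_destruct);
===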

The reference counting is perhaps the most-used function of OBJ.  Here's a 
sample scenario:

/* allocate some memory, call the some_object_type "constructor",
   and set the reference count of "foo" to 1 */
foo = OBJ_NEW(some_object_type);

/* increment the reference count of foo (to 2) */
OBJ_RETAIN(foo);

/* increment the reference count of foo (to 3) */
OBJ_RETAIN(foo);

/* decrement the reference count of foo (to 1) */
OBJ_RELEASE(foo);
OBJ_RELEASE(foo);

/* decrement the reference count of foo to 0 -- which will
   call foo's "destructor" and then free the memory */
OBJ_RELEASE(foo);

The same principle works for structs on the stack -- we do the same constructor 
/ destructor behavior, but just don't free the memory.  For example:

/* Instantiate the memory and call its "constructor" and set the
   ref count to 1 */
some_object_type foo;
OBJ_CONSTRUCT(&foo, some_object_type);

/* Increment and decrement the ref count */
OBJ_RETAIN(&foo);
OBJ_RETAIN(&foo);
OBJ_RELEASE(&foo);
OBJ_RELEASE(&foo);

/* The last RELEASE will call the destructor, but won't actually
   free the memory, because the memory was not allocated with 
   OBJ_NEW */

OBJ_RELEASE(&foo);

[OMPI devel] Problem with attributes attached to communicators

2011-01-06 Thread Pascal Deveze
I have a problem finishing the port of ROMIO into Open MPI. It is
related to the routines MPI_Comm_dup together with MPI_Keyval_create,
MPI_Keyval_free, MPI_Attr_get and MPI_Attr_put.


Here is a simple program that reproduces my problem:

===
#include 
#include "mpi.h"

int copy_fct(MPI_Comm comm, int keyval, void *extra, void *attr_in, void 
**attr_out, int *flag) {

   return MPI_SUCCESS;
}

int delete_fct(MPI_Comm comm, int keyval, void *attr_val, void *extra) {
   MPI_Keyval_free(&keyval);
   return MPI_SUCCESS;
}

int main(int argc, char **argv) {
   int i, found, attribute_val=100, keyval = MPI_KEYVAL_INVALID;
   MPI_Comm dupcomm;

   MPI_Init(&argc,&argv);

   for (i=0; i<100;i++) {
   /* This simulates the MPI_File_open() */
   if (keyval == MPI_KEYVAL_INVALID) {
   MPI_Keyval_create((MPI_Copy_function *) copy_fct, 
(MPI_Delete_function *) delete_fct, &keyval, NULL);

   MPI_Attr_put(MPI_COMM_WORLD, keyval, &attribute_val);
   MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
   }
   else {
   MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
   MPI_Attr_get(MPI_COMM_WORLD, keyval, (void *) 
&attribute_val, &found);

   }
   /* This simulates the MPI_File_close() */
   MPI_Comm_free(&dupcomm);
   }
   MPI_Finalize();
}
===
I run it on only one process and get the error:
*** An error occurred in MPI_Attr_get
*** on communicator MPI_COMM_WORLD
*** MPI_ERR_OTHER: known error not in list
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)

I think this error is displayed because the keyval does not exist any more.

This program runs fine with MPICH2 (ROMIO comes with MPICH2).
This program runs fine when delete_fct() does not call MPI_Keyval_free.
This program runs fine when I call MPI_Keyval_create with
"MPI_NULL_COPY_FN" instead of "(MPI_Copy_function *) copy_fct" (this is
quite strange: copy_fct does nothing!).


I suspect that there could be a bug in OpenMPI: in
ompi/attribute/attribute.c, two functions call OBJ_RELEASE
(ompi_attr_delete and ompi_attr_free_keyval), so the
reference count is decremented twice.

Pascal




Re: [OMPI devel] Problem with attributes attached to communicators

2011-01-10 Thread Pascal Deveze

Dave,

Your proposed patch does not work when the call to MPI_File_open() is 
done on MPI_COMM_SELF.

For example, with the romio test program "simple.c", I got the fatal error:

mpirun -np 1 ./simple -fname /tmp//TEST
Fatal error in MPI_Attr_put: Invalid keyval, error stack:
MPI_Attr_put(131): MPI_Attr_put(comm=0x8400, keyval=603979776, 
attr_value=0x2279fa0) failed

MPI_Attr_put(89).: Attribute key was MPI_KEYVAL_INVALID
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)

Pascal

Dave Goodell wrote:

Try this (untested) patch instead:

  





-Dave

On Jan 7, 2011, at 3:50 AM CST, Rob Latham wrote:

  

Hi Pascal.  I'm really happy that you have been working with the
OpenMPI folks to re-sync romio.  I meant to ask you how that work was
progressing, so thanks for the email!

I need to copy Dave Goodell on this conversation because he helped me
understand the keyval issues when we last worked on this two years
ago.  


Dave, some background.  We added some code in ROMIO to address ticket
222:
http://trac.mcs.anl.gov/projects/mpich2/ticket/222

But that code apparently makes OpenMPI unhappy.  I think when we
talked about this I remember it came down to a, shall we say,
different interpretation of the standard between MPICH2 and OpenMPI.

In case it's not clear from the nesting of messages, here's Pascal's
extraction of the ROMIO keyval code:

http://www.open-mpi.org/community/lists/devel/2011/01/8837.php

and here's the OpenMPI developer's response:
http://www.open-mpi.org/community/lists/devel/2011/01/8838.php

I think this is related to a discussion I had a couple years ago:
http://www.open-mpi.org/community/lists/users/2009/03/8409.php

So, to eventually answer your question yes I do have some remarks, but
I have no answers.  It's been a couple of years since I added those
frees...

==rob

On Fri, Jan 07, 2011 at 09:47:17AM +0100, Pascal Deveze wrote:


Hi Rob,

As you perhaps remember, I have been porting ROMIO to OpenMPI.
The job is nearly finished; I only have a problem with the
allocation/deallocation of the keyval (cb_config_list_keyval in
adio/common/cb_config_list.c).
As the algorithm works with MPICH2, I asked for help on the
de...@open-mpi.org mailing list.
I just received the following answer from George Bosilca.

The solution I found to run ROMIO with OpenMPI is to delete the line:
   MPI_Keyval_free(&keyval);
in the function ADIOI_cb_delete_name_array
(romio/adio/common/cb_config_list.c).

Do you have any remarks about that ?

Regards,

Pascal

-------- Original Message --------
Subject:  Re: [OMPI devel] Problem with attributes attached to communicators
Date:   Thu, 6 Jan 2011 13:15:14 -0500
From: George Bosilca
Reply-To: Open MPI Developers
To:   Open MPI Developers
References: <4d25daf9.3070...@bull.net>



MPI_Comm_create_keyval and MPI_Comm_free_keyval are the functions you should 
use in order to be MPI 2.2 compliant.

Based on my understanding of the MPI standard, your application is incorrect, 
and therefore the MPICH behavior is incorrect. The delete function is not there 
for you to delete the keyval (!) but to delete the attribute. Here is what the 
MPI standard states about this:

  

Note that it is not erroneous to free an attribute key that is in use, because 
the actual free does not transpire until after all references (in other 
communicators on the process) to the key have been freed. These references need 
to be explicitly freed by the program, either via calls to MPI_COMM_DELETE_ATTR 
that free one attribute instance, or by calls to MPI_COMM_FREE that free all 
attribute instances associated with the freed communicator.


george.
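
A minimal sketch of the pattern George describes, using the MPI 2.2 keyval functions (illustrative code, not taken from ROMIO): the delete callback touches only the attribute, and the keyval is freed once with MPI_Comm_free_keyval, relying on the deferred free described above:

===
#include "mpi.h"

/* Delete callback: clean up the attribute value if needed, but never
 * free the keyval here. */
static int delete_fct(MPI_Comm comm, int keyval, void *attr_val, void *extra)
{
    return MPI_SUCCESS;
}

int main(int argc, char **argv)
{
    int keyval, saved, attribute_val = 100;

    MPI_Init(&argc, &argv);

    MPI_Comm_create_keyval(MPI_COMM_NULL_COPY_FN, delete_fct, &keyval, NULL);
    MPI_Comm_set_attr(MPI_COMM_WORLD, keyval, &attribute_val);

    /* Freeing a keyval that is still in use is legal: the actual free is
     * deferred until every attribute reference has been deleted. */
    saved = keyval;
    MPI_Comm_free_keyval(&keyval);   /* keyval becomes MPI_KEYVAL_INVALID */

    /* Explicitly drop the last reference; delete_fct runs here. */
    MPI_Comm_delete_attr(MPI_COMM_WORLD, saved);

    MPI_Finalize();
    return 0;
}
===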

On Jan 6, 2011, at 10:08 , Pascal Deveze wrote:

  

I have a problem to finish the porting of ROMIO into Open MPI. It is related to 
the routines MPI_Comm_dup together with MPI_Keyval_create, MPI_Keyval_free, 
MPI_Attr_get and MPI_Attr_put.

Here is a simple program that reproduces my problem:

===
#include 
#include "mpi.h"

int copy_fct(MPI_Comm comm, int keyval, void *extra, void *attr_in, void 
**attr_out, int *flag) {
return MPI_SUCCESS;
}

int delete_fct(MPI_Comm comm, int keyval, void *attr_val, void *extra) {
MPI_Keyval_free(&keyval);
return MPI_SUCCESS;
}

int main(int argc, char **argv) {
int i, found, attribute_val=100, keyval = MPI_KEYVAL_INVALID;
MPI_Comm dupcomm;

MPI_Init(&argc,&argv);

for (i=0; i<100;i++) {
/* This simulates the MPI_File_open() */
if (keyval == MPI_KEYVAL_INVALID) {
MPI_Keyval_create((MPI_Copy_function *) copy_fct, (MPI_Delete_function 
*) delete_fct, &keyval, NULL);
MPI_Attr_put(MPI_COMM_WORLD, keyval, &attribute_val);
MPI_Comm_dup(MPI_COMM_WORLD, &dupcomm);
}
else {
MPI_Comm_dup(MPI_COMM_WORLD, 

Re: [OMPI devel] Problem with attributes attached to communicators

2011-01-13 Thread Pascal Deveze

A new patch in ROMIO solves this problem.
Thanks to Dave.

Pascal

Dave Goodell wrote:

Hmm... Apparently I was too optimistic about my untested patch.  I'll work with 
Rob this afternoon to straighten this out.

-Dave

On Jan 10, 2011, at 5:53 AM CST, Pascal Deveze wrote:

  

Dave,

Your proposed patch does not work when the call to MPI_File_open() is done on 
MPI_COMM_SELF.
For example, with the romio test program "simple.c", I got the fatal error:

mpirun -np 1 ./simple -fname /tmp//TEST
Fatal error in MPI_Attr_put: Invalid keyval, error stack:
MPI_Attr_put(131): MPI_Attr_put(comm=0x8400, keyval=603979776, 
attr_value=0x2279fa0) failed
MPI_Attr_put(89).: Attribute key was MPI_KEYVAL_INVALID
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)

Pascal

Dave Goodell wrote:


Try this (untested) patch instead:

  





-Dave

On Jan 7, 2011, at 3:50 AM CST, Rob Latham wrote:

  

  

Hi Pascal.  I'm really happy that you have been working with the
OpenMPI folks to re-sync romio.  I meant to ask you how that work was
progressing, so thanks for the email!

I need copy Dave Goodell on this conversation because he helped me
understand the keyval issues when we last worked on this two years
ago.  


Dave, some background.  We added some code in ROMIO to address ticket
222:

http://trac.mcs.anl.gov/projects/mpich2/ticket/222


But that code apparently makes OpenMPI unhappy.  I think when we
talked about this I remember it came down to a, shall we say,
different interpretation of the standard between MPICH2 and OpenMPI.

In case it's not clear from the nesting of messages, here's Pascal's
extraction of the ROMIO keyval code:


http://www.open-mpi.org/community/lists/devel/2011/01/8837.php


and here's the OpenMPI developer's response:

http://www.open-mpi.org/community/lists/devel/2011/01/8838.php


I think this is related to a discussion I had a couple years ago:

http://www.open-mpi.org/community/lists/users/2009/03/8409.php


So, to eventually answer your question yes I do have some remarks, but
I have no answers.  It's been a couple of years since I added those
frees...

==rob

On Fri, Jan 07, 2011 at 09:47:17AM +0100, Pascal Deveze wrote:




Hi Rob,

As you perhaps remember, I was porting ROMIO on OpenMPI.
The job is quite finished, I only have a problem with the
allocation/dealocation of Keyval (cb_config_list_keyval in
adio/common/cb_config_list.c).
As the alogorithm runs on MPICH2, I asked for help on the

de...@open-mpi.org
 mailing list.
I just received the following answer from George Bosilca.

The solution I found to run ROMIO with OpenMPI is to delete the line:
   MPI_Keyval_free(&keyval);
in the function ADIOI_cb_delete_name_array
(romio/adio/common/cb_config_list.c).

Do you have any remarks about that ?

Regards,

Pascal

-------- Original Message --------
Subject:  Re: [OMPI devel] Problem with attributes attached to communicators
Date:   Thu, 6 Jan 2011 13:15:14 -0500
From: George Bosilca
Reply-To: Open MPI Developers
To: Open MPI Developers
References: <4d25daf9.3070...@bull.net>




MPI_Comm_create_keyval and MPI_Comm_free_keyval are the functions you should 
use in order to be MPI 2.2 compliant.

Based on my understanding of the MPI standard, your application is incorrect, 
and therefore the MPICH behavior is incorrect. The delete function is not there 
for you to delete the keyval (!) but to delete the attribute. Here is what the 
MPI standard states about this:

  

  

Note that it is not erroneous to free an attribute key that is in use, because 
the actual free does not transpire until after all references (in other 
communicators on the process) to the key have been freed. These references need 
to be explicitly freed by the program, either via calls to MPI_COMM_DELETE_ATTR 
that free one attribute instance, or by calls to MPI_COMM_FREE that free all 
attribute instances associated with the freed communicator.
    

    

george.

On Jan 6, 2011, at 10:08 , Pascal Deveze wrote:

  

  

I have a problem to finish the porting of ROMIO into Open MPI. It is related to 
the routines MPI_Comm_dup together with MPI_Keyval_create, MPI_Keyval_free, 
MPI_Attr_get and MPI_Attr_put.

Here is a simple program that reproduces my problem:

===
#include 
#include "mpi.h"

int copy_fct(MPI_Comm comm, int keyval, void *extra, void *attr_in, void 
**attr_out, int *flag) {
return MPI_SUCCESS;
}

int delete_fct(MPI_Comm comm, int keyval, void *attr_val, void *extra) {
MPI_Keyval_free(&keyval);
return MPI_SUCCESS;
}

int main(int argc, char **argv) {
int i, found, attribute_val=100, keyval = MPI_KEYVAL_INVALID;
MPI_Comm dupcomm;

MPI_Init(&argc,&argv);

for (i=0; i<100;i++) {
/* This simulates the MPI_File_open() */
if (keyval == MPI_KEYVAL_INVALID) {
MPI_Keyval_crea

Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2011-01-13 Thread Pascal Deveze
This assertion problem is now solved by a patch in ROMIO just
committed to http://bitbucket.org/devezep/new-romio-for-openmpi


I don't know of any other problem in this port of ROMIO.

Pascal

Pascal Deveze wrote:

Jeff Squyres wrote:

On Dec 16, 2010, at 3:31 AM, Pascal Deveze wrote:

 

int main(int argc, char **argv) {
  MPI_File fh;
  MPI_Info info, info_used;

  MPI_Init(&argc,&argv);

  MPI_File_open(MPI_COMM_WORLD, "/tmp/A", MPI_MODE_CREATE | MPI_MODE_RDWR, 
MPI_INFO_NULL, &fh);
  MPI_File_close(&fh);

  MPI_File_open(MPI_COMM_WORLD, "/tmp/A", MPI_MODE_CREATE | MPI_MODE_RDWR, 
MPI_INFO_NULL, &fh);
  MPI_File_close(&fh);

  MPI_Finalize();
}

I run this programon one process : salloc -p debug  -n1 mpirun -np 1 ./a.out
And I get teh assertion error:

a.out: attribute/attribute.c:763: ompi_attr_delete: Assertion `((0xdeafbeedULL << 
32) + 0xdeafbeedULL) == ((opal_object_t *) (keyval))->obj_magic_id' failed.
[cuzco10:24785] *** Process received signal ***
[cuzco10:24785] Signal: Aborted (6)



Ok.

  

I saw that there is a problem with an MPI_COMM_SELF communicator.

The problem disappears (and all ROMIO tests are OK) when I comment line 89 in 
the file ompi/mca/io/romio/romio/adio/common/ad_close.c :
 // MPI_Comm_free(&(fd->comm));

The problem disappears (and all ROMIO tests are OK) when I comment line 425 in 
the file ompi/mca/io/romio/romio/adio/common/cb_config_list.c :
   //  MPI_Keyval_free(&keyval);

The problem also disappears (but only 50% of the ROMIO tests are OK) when I 
comment line 133 in the file ompi/runtime/ompi_mpi_finalize.c:
  // ompi_attr_delete_all(COMM_ATTR, &ompi_mpi_comm_self,
 // ompi_mpi_comm_self.comm.c_keyhash);



It sounds like there's a problem with the ordering of shutdown of things in 
MPI_FINALIZE w.r.t. ROMIO.

FWIW: ROMIO violates some of our abstractions, but it's the price we pay for using a 
3rd party package.  One very, very important abstraction that we have is that 
top-level MPI API functions are not allowed to call any other MPI API functions.  
E.g., MPI_Send (i.e., ompi/mpi/c/send.c) cannot call MPI_Isend (i.e., 
ompi/mpi/c/isend.c).  MPI_Send *can* call the same back-end implementation functions 
that isend does -- it's just not allowed to call MPI_* functions.

The reason is that the top-level MPI API functions do things like check for 
whether MPI_INIT / MPI_FINALIZE have been called, etc.  The back-end functions 
do not do this.  Additionally, top-level MPI API functions may be overridden 
via PMPI kinds of things.  We wouldn't want our internal library calls to get 
intercepted by user code.

  

I am not very familiar with the OBJ_RELEASE/OBJ_RETAIN mechanism and till now I 
do not understand what is the real origin of that problem.



RETAIN/RELEASE is part of OMPI's "poor man's C++" design.  Way back in the beginning of the project, we debated whether to use C or C++ for developing the code.  There was a desire to use some of the basic object functionality of C++ (e.g., derived classes, constructors, destructors, etc.), but we wanted to stay as portable as possible.  So we ended up going with C, but with a few macros that emulate some C++-like functionality.  This led to OMPI's OBJ system that is used all over the place.


The OBJ system does several things:

- allows you to have "constructor"- and "destructor"-like behavior for structs
- works for both stack and heap memory
- reference counting

The reference counting is perhaps the most-used function of OBJ.  Here's a 
sample scenario:

/* allocate some memory, call the some_object_type "constructor",
   and set the reference count of "foo" to 1 */
foo = OBJ_NEW(some_object_type);

/* increment the reference count of foo (to 2) */
OBJ_RETAIN(foo);

/* increment the reference count of foo (to 3) */
OBJ_RETAIN(foo);

/* decrement the reference count of foo (to 1) */
OBJ_RELEASE(foo);
OBJ_RELEASE(foo);

/* decrement the reference count of foo to 0 -- which will
   call foo's "destructor" and then free the memory */
OBJ_RELEASE(foo);

The same principle works for structs on the stack -- we do the same constructor 
/ destructor behavior, but just don't free the memory.  For example:

/* Instantiate the memory and call its "constructor" and set the
   ref count to 1 */
some_object_type foo;
OBJ_CONSTRUCT(&foo, some_object_type);

/* Increment and decrement the ref count */
OBJ_RETAIN(&foo);
OBJ_RETAIN(&foo);
OBJ_RELEASE(&foo);
OBJ_RELEASE(&foo);

/* The last RELEASE will call the destructor, but won't actually
   free the memory, because the memory was not allocated with 
   OBJ_NEW */

OBJ_RELEASE(&foo);

When the destructor is called, the OBJ system sets the magic number in the 
obj's memory to a sentinel value s

Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2011-01-14 Thread Pascal Deveze

Jeff Squyres wrote:

Great!

I see in your other mail that you pulled something from MPICH2 to make this 
work.

Does that mean that there's an even newer version of ROMIO that we should pull in its entirety?  It's a little risky to pull most stuff from one released version of ROMIO and then more stuff from another released version.  Meaning: it's a little nicer/safer to say that we have ROMIO from a single released version of MPICH2.


If possible.  :-)

Is it possible?

Don't get me wrong -- I want the new ROMIO, and I'm sorry you've had to go 
through so many hoops to get it ready.  :-(  But we should do it the best way 
we can; we have history/precedent for taking ROMIO from a single 
source/released version of MPICH[2], and I'd like to maintain that precedent if 
at all possible.


  
I've just made a comparison with the very latest MPICH2 version
(mpich2-1.3.1) and found very few differences.


I've applied them in bitbucket, tested with the ROMIO tests,
and committed them.


So, we now have on bitbucket the version from mpich2-1.3.1 plus the 
patch for the attribute issue.



Pascal




Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2011-01-17 Thread Pascal Deveze

Jeff Squyres wrote:

I'm actually confused by the changelog on the repo:

- r1 (https://bitbucket.org/devezep/new-romio-for-openmpi) says "Initial import from 
branch v1.5"
- r15 (https://bitbucket.org/devezep/new-romio-for-openmpi/changeset/a535d7cdbe79) then 
says "Update with openmpi-1.4.3"

...?
  
I thought it was necessary to be in line with a "stable version", not 
with a snapshot of the trunk.



Did you not use the SVN+HG procedure outlined below, perchance?

https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial
  

No, I did not use this procedure:
I cloned new-romio-for-openmpi, then I deleted all files and
subdirectories (except .hg and .hgignore).

I got openmpi-1.4.3.tar and untarred it in the new-romio-for-openmpi directory.
Then I replaced the romio subtree with the romio subtree from
new-romio-for-openmpi.


After testing, I committed and pushed it in bitbucket.

I realize now that I have to go back to v1.5.
What is the best way to do it?
   - Sync with the trunk?
   - Use openmpi-1.5.tar.gz with the same procedure I used to update
with openmpi-1.4.3?


. 


On Jan 14, 2011, at 10:01 AM, Jeff Squyres wrote:

  

I just (re?)noticed that your mercurial tree is based on the 1.4 branch:

   https://bitbucket.org/devezep/new-romio-for-openmpi

Are we targeting the v1.4 series for this?  


I thought we were targeting trunk/v1.5 for the new ROMIO, but perhaps I'm 
forgetting something...?




On Jan 14, 2011, at 8:20 AM, Pascal Deveze wrote:



Jeff Squyres wrote:
  

Great!

I see in your other mail that you pulled something from MPICH2 to make this 
work.

Does that mean that there's a even-newer version of ROMIO that we should pull in its entirety?  It's a little risky to pull most stuff from one released version of ROMIO and then more stuff from another released version.  Meaning: it's little nicer/safer to say that we have ROMIO from a single released version of MPICH2.  
If possible.  :-)


Is it possible?

Don't get me wrong -- I want the new ROMIO, and I'm sorry you've had to go 
through so many hoops to get it ready.  :-(  But we should do it the best way 
we can; we have history/precedent for taking ROMIO from a single 
source/released version of MPICH[2], and I'd like to maintain that precedent if 
at all possible.





I've just made a comparison with the very last MPICH2 version (mpich2-1.3.1) 
and found very little differencies.

I've  reported them into bitbucket. I 've tested with the ROMIO tests and I 've 
commited them.

So, we now have on bitbucket the version from mpich2-1.3.1 plus the patch for 
the attribute issue.

Pascal


  

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/





  




Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2011-01-17 Thread Pascal Deveze

Pascal Deveze wrote:

Jeff Squyres wrote:

I'm actually confused by the changelog on the repo:

- r1 (https://bitbucket.org/devezep/new-romio-for-openmpi) says "Initial import from 
branch v1.5"
- r15 (https://bitbucket.org/devezep/new-romio-for-openmpi/changeset/a535d7cdbe79) then 
says "Update with openmpi-1.4.3"

...?
  
I thought it was necessary to be in line with a "stable version", not 
with a snapshot of the trunk.



Did you not use the SVN+HG procedure outlined below, perchance?

https://svn.open-mpi.org/trac/ompi/wiki/UsingMercurial
  

No, I did not use this procedure:
I cloned new-romio-for-openmpi, then I deleted all files and 
subdirectories (excepted .hg and .hgignore)


I got openmpi-1.4.3.tar and untar it in the new-romio-for-openmpi 
directory.
Then I replace the romio branch by the romio branch from 
new-romio-for-openmpi


After testing, I committed and pushed it in bitbucket.

I realize now that I have to go back to v1.5.
What is the best way to do it ?
   - Synchro with the trunk?
- Using openmpi-1.5.tar.gz with the same procedure I used to 
update with openmpi-1.4.3 ?
 
.

Sorry, I was a bit confused about v1.5. I will have to synchronize with the
trunk. When do you think is the best time to do it?




On Jan 14, 2011, at 10:01 AM, Jeff Squyres wrote:

  

I just (re?)noticed that your mercurial tree is based on the 1.4 branch:

   https://bitbucket.org/devezep/new-romio-for-openmpi

Are we targeting the v1.4 series for this?  


I thought we were targeting trunk/v1.5 for the new ROMIO, but perhaps I'm 
forgetting something...?




On Jan 14, 2011, at 8:20 AM, Pascal Deveze wrote:



Jeff Squyres wrote:
  

Great!

I see in your other mail that you pulled something from MPICH2 to make this 
work.

Does that mean that there's a even-newer version of ROMIO that we should pull in its entirety?  It's a little risky to pull most stuff from one released version of ROMIO and then more stuff from another released version.  Meaning: it's little nicer/safer to say that we have ROMIO from a single released version of MPICH2.  
If possible.  :-)


Is it possible?

Don't get me wrong -- I want the new ROMIO, and I'm sorry you've had to go 
through so many hoops to get it ready.  :-(  But we should do it the best way 
we can; we have history/precedent for taking ROMIO from a single 
source/released version of MPICH[2], and I'd like to maintain that precedent if 
at all possible.





I've just made a comparison with the very last MPICH2 version (mpich2-1.3.1) 
and found very little differencies.

I've  reported them into bitbucket. I 've tested with the ROMIO tests and I 've 
commited them.

So, we now have on bitbucket the version from mpich2-1.3.1 plus the patch for 
the attribute issue.

Pascal


  

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/





  








Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2011-01-17 Thread Pascal Deveze

The bitbucket tree (https://bitbucket.org/devezep/new-romio-for-openmpi) has 
just been updated with the open-mpi trunk.


I have made three patches:

hg out
comparing with ssh://h...@bitbucket.org/devezep/new-romio-for-openmpi
searching for changes
changeset:   25:3e677102a125
user:Pascal Deveze 
date:Mon Jan 17 13:40:10 2011 +0100
summary: Remove all files

changeset:   26:e3989f46f83a
user:Pascal Deveze 
date:Mon Jan 17 14:46:48 2011 +0100
summary: Import from http://svn.open-mpi.org/svn/ompi/trunki (r24256)

changeset:   27:97f54ec8a575
tag: tip
user:Pascal Deveze 
date:Mon Jan 17 16:14:06 2011 +0100
summary: New Romio

I have tested the result and the ROMIO tests are OK.




Re: [OMPI devel] RFC: Bring the latest ROMIO version from MPICH2-1.3 into the trunk

2011-01-17 Thread Pascal Deveze

Jeff,

You removed the following files
(https://bitbucket.org/devezep/new-romio-for-openmpi/changeset/9b8f70de722d).
I see that they are in the trunk. Shall I remove them again?


HACKING
config/Makefile.options
config/libltdl-preopen-error.diff
config/lt224-icc.diff
config/mca_acinclude.m4
config/mca_configure.ac
config/mca_make_configure.pl
config/ompi_check_libfca.m4
config/ompi_ensure_contains_optflags.m4
config/ompi_ext.m4
config/ompi_microsoft.m4
config/ompi_setup_component_package.m4
config/ompi_strip_optflags.m4
contrib/check_unnecessary_headers.sh
contrib/code_counter.pl
contrib/copyright.pl
contrib/dist/find-copyrights.pl
contrib/dist/gkcommit.pl
contrib/dist/linux/README
contrib/dist/linux/README.ompi-spec-generator
contrib/dist/linux/buildrpm.sh
contrib/dist/linux/buildswitcherrpm.sh
contrib/dist/linux/ompi-spec-generator.py
contrib/dist/linux/openmpi-switcher-modulefile.spec
contrib/dist/linux/openmpi-switcher-modulefile.tcl
contrib/dist/make-authors.pl
contrib/dist/make_tarball
contrib/find_occurence.pl
contrib/find_offenders.pl
contrib/fix_headers.pl
contrib/fix_indent.pl
contrib/gen_stats.pl
contrib/generate_file_list.pl
contrib/header_replacement.sh
contrib/headers.txt
contrib/hg/build-hgignore.pl
contrib/hg/set-hg-share-perms.csh
contrib/nightly/build_sample_config.txt
contrib/nightly/build_tarball.pl
contrib/nightly/build_tests.pl
contrib/nightly/check_devel_headers.pl
contrib/nightly/create_tarball.sh
contrib/nightly/illegal_symbols_report.pl
contrib/nightly/ompi_cronjob.sh
contrib/nightly/unimplemented_report.sh
contrib/ompi_cplusplus.sed
contrib/ompi_cplusplus.sh
contrib/ompi_cplusplus.txt
contrib/platform/cisco/ebuild/hlfr
contrib/platform/cisco/ebuild/hlfr.conf


Pascal Deveze wrote:
The bitbucket tree 
(https://bitbucket.org/devezep/new-romio-for-openmpi) has just been 
updated with the open-mpi trunk.



I have made three patches:

hg out
comparing with ssh://h...@bitbucket.org/devezep/new-romio-for-openmpi
searching for changes
changeset:   25:3e677102a125
user:Pascal Deveze 
date:Mon Jan 17 13:40:10 2011 +0100
summary: Remove all files

changeset:   26:e3989f46f83a
user:Pascal Deveze 
date:Mon Jan 17 14:46:48 2011 +0100
summary: Import from http://svn.open-mpi.org/svn/ompi/trunki (r24256)

changeset:   27:97f54ec8a575
tag: tip
user:Pascal Deveze 
date:Mon Jan 17 16:14:06 2011 +0100
summary: New Romio

I have tested the result and the ROMIO tests are OK.










[OMPI devel] Memory leak in MPI_Type_create_hindexed() with count = 1 (patch proposed)

2011-04-14 Thread Pascal Deveze

Calling MPI_Type_create_hindexed(int count, int array_of_blocklengths[],
   MPI_Aint array_of_displacements[], MPI_Datatype oldtype,
   MPI_Datatype *newtype)
with a count parameter of 1 causes a memory leak detected by valgrind:

==2053== 576 (448 direct, 128 indirect) bytes in 1 blocks are definitely 
lost in loss record 157 of 182

==2053==at 0x4C2415D: malloc (vg_replace_malloc.c:195)
==2053==by 0x4E7CEC7: opal_obj_new (opal_object.h:469)
==2053==by 0x4E7D134: ompi_datatype_create (ompi_datatype_create.c:71)
==2053==by 0x4E7D58E: ompi_datatype_create_hindexed 
(ompi_datatype_create_indexed.c:89)
==2053==by 0x4EA74D0: PMPI_Type_create_hindexed 
(ptype_create_hindexed.c:75)

==2053==by 0x401A5C: main (in /home_nfs/xxx/type_create_hindexed)

This can be reproduced with the following trivial code:
=
#include "mpi.h"

MPI_Datatype newtype;
int lg[3];
MPI_Aint disp[3];

int main(int argc, char **argv) {
   MPI_Init(&argc,&argv);

   disp[0] = (MPI_Aint)disp;
   disp[1] = (MPI_Aint)disp+1;
   lg[0] = 5;
   lg[1] = 5;

   MPI_Type_create_hindexed(1, lg, disp, MPI_BYTE, &newtype);
   MPI_Type_free(&newtype);

   MPI_Finalize();
}
==
If MPI_Type_create_hindexed() is called with a count parameter greater
than 1, valgrind does not detect any lost record.


Patch proposed:

hg diff ompi/datatype/ompi_datatype_create_indexed.c
diff -r a2d94a70f474 ompi/datatype/ompi_datatype_create_indexed.c
--- a/ompi/datatype/ompi_datatype_create_indexed.c  Wed Mar 30 
18:47:31 2011 +0200
+++ b/ompi/datatype/ompi_datatype_create_indexed.c  Thu Apr 14 
16:16:08 2011 +0200

@@ -91,11 +91,6 @@
dLength = pBlockLength[0];
endat = disp + dLength * extent;

-if( 1 >= count ) {
-pdt = ompi_datatype_create( oldType->super.desc.used + 2 );
-/* multiply by count to make it zero if count is zero */
-ompi_datatype_add( pdt, oldType, count * dLength, disp, extent );
-} else {
for( i = 1; i < count; i++ ) {
if( endat == pDisp[i] ) {
/* contiguous with the previsious */
@@ -109,7 +104,6 @@
}
}
ompi_datatype_add( pdt, oldType, dLength, disp, extent );
-}
*newType = pdt;
return OMPI_SUCCESS;
}

Explanation:
   The case (0 == count) was already handled earlier by returning.
   The problem is that, in the case (1 >= count), 
ompi_datatype_create() is called a second time (it has already been called just before).
   In fact, the case (1 == count) is no different from the case (1 < 
count), so the if-else statement can simply be removed.


We need a patch for the OpenMPI 1.5 branch.



[OMPI devel] Problem with the openmpi-default-hostfile (on the trunk)

2012-02-27 Thread pascal . deveze
Hi all,

I have problems with the openmpi-default-hostfile since the following 
patch on the trunk

changeset:   19874:088fc6c84a9f
user:rhc
date:Wed Feb 01 17:40:44 2012 +
summary: In accordance with prior releases, we are supposed to default 
to looking at the openmpi-default-hostfile as a default hostfile. Restore 
that behavior, but ignore the file if it is empty. Allow the user to 
ignore any MCA param setting pointing to a default hostfile by setting the 
param to "none" (via cmd line or whatever) - this allows them to override 
a setting in the system default MCA param file.

According to the summary of this patch, the openmpi-default-hostfile is 
ignored if it is empty.
But, when I run my jobs with slurm + mpirun, I get the following message:
--
No nodes are available for this job, either due to a failure to
allocate nodes to the job, or allocated nodes being marked
as unavailable (e.g., down, rebooting, or a process attempting
to be relocated to another node when none are available).
--

I am able to run my job if:
 - either I put my node(s) in the file etc/openmpi-default-hostfile
 - or use "-mca orte_default_hostfile=none" in the mpirun command line
 - or "export OMPI_MCA_orte_default_hostfile none" in my environment

It appears that an empty openmpi-default-hostfile is not ignored. This 
patch seems not to be complete.

Or do I misunderstand something?

Pascal Devèze

Re: [OMPI devel] Problem with the openmpi-default-hostfile (on the trunk)

2012-02-28 Thread pascal . deveze
devel-boun...@open-mpi.org a écrit sur 27/02/2012 15:53:06 :

> De : Ralph Castain 
> A : Open MPI Developers 
> Date : 27/02/2012 16:17
> Objet : Re: [OMPI devel] Problem with the openmpi-default-hostfile 
> (on the trunk)
> Envoyé par : devel-boun...@open-mpi.org
> 
> That's strange - I run on slurm frequently and never have this 
> problem, and my default hostfile is present and empty. Do you have 
> anything in your default mca param file that might be telling us to 
> use the hostfile?
> 
> The only way I can find to get that behavior is if your default mca 
> param file includes the orte_default_hostfile value. In that case, 
> you are telling us to use the default hostfile, and so we will enforce 
it.

Hi Ralph,

On my side, the default value of orte_default_hostfile is a pointer to 
etc/openmpi-default-hostfile.
The command ompi_info -a gives :

MCA orte: parameter "orte_default_hostfile" (current value: 
, data source: default value)
Name of the default hostfile (relative or absolute path, "none" to ignore 
environmental or default MCA param setting)

The following files are empty:
 - .../etc/openmpi-mca-params.conf
 - $HOME/.openmpi/mca-params.conf
Another solution for me is to put "orte_default_hostfile=none" in one of 
these files.

Pascal

> 
> On Feb 27, 2012, at 5:57 AM, pascal.dev...@bull.net wrote:
> 
> Hi all, 
> 
> I have problems with the openmpi-default-hostfile since the 
> following patch on the trunk 
> 
> changeset:   19874:088fc6c84a9f 
> user:rhc 
> date:Wed Feb 01 17:40:44 2012 + 
> summary: In accordance with prior releases, we are supposed to 
> default to looking at the openmpi-default-hostfile as a default 
> hostfile. Restore that behavior, but ignore the file if it is empty.
> Allow the user to ignore any MCA param setting pointing to a default
> hostfile by setting the param to "none" (via cmd line or whatever) -
> this allows them to override a setting in the system default MCA param 
file. 
> 
> According to the summary of this patch, the openmpi-default-hostfile
> is ignored if it is empty. 
> But, when I run my jobs with slurm + mpirun, I get the following 
message: 
> 
-- 

> No nodes are available for this job, either due to a failure to 
> allocate nodes to the job, or allocated nodes being marked 
> as unavailable (e.g., down, rebooting, or a process attempting 
> to be relocated to another node when none are available). 
> 
-- 

> 
> I am able to run my job if: 
>  - either I put my node(s) in the file etc/openmpi-default-hostfile 
>  - or use "-mca orte_default_hostfile=none" in the mpirun command line 
>  - or "export OMPI_MCA_orte_default_hostfile none" in my environment 
> 
> It appears that an empty openmpi-default-hostfile is not ignored. 
> This patch seems not be complete 
> 
>  Or do I misunderstand something ? 
> 
> Pascal Devèze

Re: [OMPI devel] Problem with the openmpi-default-hostfile (on the trunk)

2012-02-28 Thread pascal . deveze
devel-boun...@open-mpi.org a écrit sur 28/02/2012 10:54:15 :

> De : Ralph Castain 
> A : Open MPI Developers 
> Date : 28/02/2012 10:54
> Objet : Re: [OMPI devel] Problem with the openmpi-default-hostfile 
> (on the trunk)
> Envoyé par : devel-boun...@open-mpi.org
> 
> I'll see what I can do when next I have access to a slurm machine - 
> hopefully in a day or two.
> 
> Are you sure you are at the top of the trunk? I reviewed the code, 
> and it clearly detects that the default hostfile is empty and ignores
> it if so. Like I said, I'm not seeing this behavior, and neither are
> the slurm machines on MTT.

I ran with a version from Feb 12th (I had a synchronization problem).
Now, with the latest patches (Feb 27th), I no longer have the problem by default.

But ... it is no longer possible to change the MCA parameter 
"orte_default_hostfile".
For example in $HOME/.openmpi/mca-params.conf I put:
   orte_default_hostfile=none
Then, even with ompi_info, I get a segfault:

[:17426] *** Process received signal ***
[:17426] Signal: Segmentation fault (11)
[:17426] Signal code: Address not mapped (1)
[:17426] Failing at address: (nil)
[:17426] [ 0] /lib64/libpthread.so.0() [0x327220f490]
[:17426] [ 1] /lib64/libc.so.6() [0x3271f24676]
[:17426] [ 2] //lib/libopen-rte.so.0(orte_register_params+0xaac) 
[0x7fa46989677a]
[:17426] [ 3] mpirun(orterun+0xeb) [0x4039ed]
[:17426] [ 4] mpirun(main+0x20) [0x4034b4]
[:17426] [ 5] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3271e1ec9d]
[:17426] [ 6] mpirun() [0x4033d9]
[:17426] *** End of error message ***

After a look at orte/runtime/orte_mca_params.c, I propose the following 
patch :
--- a/orte/runtime/orte_mca_params.cMon Feb 27 15:53:14 2012 +
+++ b/orte/runtime/orte_mca_params.cTue Feb 28 14:44:11 2012 +0100
@@ -301,7 +301,7 @@
 asprintf(&orte_default_hostfile, 
"%s/etc/openmpi-default-hostfile", opal_install_dirs.prefix);
 /* flag that nothing was given */
 orte_default_hostfile_given = false;
-} else if (0 == strcmp(orte_default_hostfile, "none")) {
+} else if (0 == strcmp(strval, "none")) {
 orte_default_hostfile = NULL;
 /* flag that it was given */
 orte_default_hostfile_given = true;


> 
> On Feb 28, 2012, at 1:25 AM, pascal.dev...@bull.net wrote:
> 
> 
> devel-boun...@open-mpi.org a écrit sur 27/02/2012 15:53:06 :
> 
> > De : Ralph Castain  
> > A : Open MPI Developers  
> > Date : 27/02/2012 16:17 
> > Objet : Re: [OMPI devel] Problem with the openmpi-default-hostfile 
> > (on the trunk) 
> > Envoyé par : devel-boun...@open-mpi.org 
> > 
> > That's strange - I run on slurm frequently and never have this 
> > problem, and my default hostfile is present and empty. Do you have 
> > anything in your default mca param file that might be telling us to 
> > use the hostfile? 
> > 
> > The only way I can find to get that behavior is if your default mca 
> > param file includes the orte_default_hostfile value. In that case, 
> > you are telling us to use the default hostfile, and so we will enforce 
it. 
> 
> Hi Ralph, 
> 
> On my side, the default value of orte_default_hostfile is a pointer 
> to etc/openmpi-default-hostfile. 
> The command ompi_info -a gives : 
> 
> MCA orte: parameter "orte_default_hostfile" (current value:  etc/openmpi-default-hostfile>, data source: default value) 
> Name of the default hostfile (relative or absolute path, "none" to 
> ignore environmental or default MCA param setting) 
> 
> The following files are empty: 
>  - .../etc/openmpi-mca-params.conf 
>  - $HOME/.openmpi/mca-params.conf 
> Another solution for me is to put "orte_default_hostfile=none" in 
> one of these files. 
> 
> Pascal 
> 
> > 
> > On Feb 27, 2012, at 5:57 AM, pascal.dev...@bull.net wrote: 
> > 
> > Hi all, 
> > 
> > I have problems with the openmpi-default-hostfile since the 
> > following patch on the trunk 
> > 
> > changeset:   19874:088fc6c84a9f 
> > user:rhc 
> > date:Wed Feb 01 17:40:44 2012 + 
> > summary: In accordance with prior releases, we are supposed to 
> > default to looking at the openmpi-default-hostfile as a default 
> > hostfile. Restore that behavior, but ignore the file if it is empty.
> > Allow the user to ignore any MCA param setting pointing to a default
> > hostfile by setting the param to "none" (via cmd line or whatever) -
> > this allows them to override a setting in the system default MCA 
> param file. 
> > 
> > According to the summary of this patch, the openmpi-default-hostfile
> > is ignored if it is empty. 
> > But, when I run my jobs with slurm + mpirun, I get the following 
message: 
> > 
-- 

> > No nodes are available for this job, either due to a failure to 
> > allocate nodes to the job, or allocated nodes being marked 
> > as unavailable (e.g., down, rebooting, or a process attempting 
> > to be relocated to another n

[OMPI devel] fix 2014: Problems in romio

2009-09-09 Thread pascal . deveze
I have seen that ROMIO goes wrong with fix 2014: a lot of the ROMIO tests in
ompi/mca/io/romio/romio/test/ are failing.
For example, with noncontig_coll2:

[inti15:28259] *** Process received signal ***
[inti15:28259] Signal: Segmentation fault (11)
[inti15:28259] Signal code: Address not mapped (1)
[inti15:28259] Failing at address: (nil)
[inti15:28259] [ 0] /lib64/libpthread.so.0 [0x3f19c0e4c0]
[inti15:28259] [ 1]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_btl_openib.so
[0x2b6640c74d79]
[inti15:28259] [ 2]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_rml_oob.so
[0x2b663e2e6e92]
[inti15:28259] [ 3]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_oob_tcp.so
[0x2b663e4f8e63]
[inti15:28259] [ 4]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_oob_tcp.so
[0x2b663e4ff485]
[inti15:28259] [ 5]
/home_nfs/devezep/ATLAS/openmpi-default/lib/libopen-pal.so.0(opal_event_loop+0x5df)
 [0x2b663d3d92ff]
[inti15:28259] [ 6]
/home_nfs/devezep/ATLAS/openmpi-default/lib/libopen-pal.so.0(opal_progress+0x5e)
 [0x2b663d3ba33e]
[inti15:28259] [ 7] /home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0
[0x2b663ce26624]
[inti15:28259] [ 8]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_coll_tuned.so
[0x2b664217fda2]
[inti15:28259] [ 9]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_coll_tuned.so
[0x2b6642179966]
[inti15:28259] [10]
/home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0(MPI_Alltoall+0x6f)
[0x2b663ce352ef]
[inti15:28259] [11]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so(ADIOI_Calc_others_req+0x65)
 [0x2aaab1cfc525]
[inti15:28259] [12]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so(ADIOI_GEN_WriteStridedColl+0x433)
 [0x2aaab1cf0ac3]
[inti15:28259] [13]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so(MPIOI_File_write_all+0xc0)
 [0x2aaab1d0a8f0]
[inti15:28259] [14]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so(mca_io_romio_dist_MPI_File_write_all+0x23)
 [0x2aaab1d0a823]
[inti15:28259] [15]
/home_nfs/devezep/ATLAS/openmpi-default/lib/openmpi/mca_io_romio.so
[0x2aaab1cedce9]
[inti15:28259] [16]
/home_nfs/devezep/ATLAS/openmpi-default/lib/libmpi.so.0(MPI_File_write_all+0x4e)
 [0x2b663ce64f9e]
[inti15:28259] [17] ./noncontig_coll2(test_file+0x32b) [0x4034bb]
[inti15:28259] [18] ./noncontig_coll2(main+0x58b) [0x402d03]
[inti15:28259] [19] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3f1901d974]
[inti15:28259] [20] ./noncontig_coll2 [0x4026c9]
[inti15:28259] *** End of error message ***

All the ROMIO tests pass without this fix.

Is there a problem in ROMIO with the datatype interface?

Pascal

Here is the export of the corresponding patch:

hg export 16301
# HG changeset patch
# User rusraink
# Date 1251912841 0
# Node ID eefd4bd4551969dc7454e63c2f42871cc9376a8f
# Parent  8aab76743e58474f1341be6f9d0ac9ae338507f1
 - This fixes #2014:
   As noted in
http://www.open-mpi.org/community/lists/devel/2009/08/6741.php,
   we do not correctly free a dupped predefined datatype.
   The fix is a bit more involving. See ticket for details.
   Tested with ibm tests and mpi_test_suite (though there's two "old"
failures
   zero5.c and zero6.c)

   Thanks to Lisandro Dalcin for bringing this up.

diff -r 8aab76743e58 -r eefd4bd45519 ompi/datatype/ompi_datatype.h
--- a/ompi/datatype/ompi_datatype.h Wed Sep 02 11:23:54 2009 +
+++ b/ompi/datatype/ompi_datatype.h Wed Sep 02 17:34:01 2009 +
@@ -202,11 +202,14 @@
 }
 opal_datatype_clone ( &oldType->super, &new_ompi_datatype->super);

+new_ompi_datatype->super.flags &= (~OMPI_DATATYPE_FLAG_PREDEFINED);
+
 /* Set the keyhash to NULL -- copying attributes is *only* done at
the top level (specifically, MPI_TYPE_DUP). */
 new_ompi_datatype->d_keyhash = NULL;
 new_ompi_datatype->args = NULL;
-strncpy (new_ompi_datatype->name, oldType->name, MPI_MAX_OBJECT_NAME);
+snprintf (new_ompi_datatype->name, MPI_MAX_OBJECT_NAME, "Dup %s",
+  oldType->name);

 return OMPI_SUCCESS;
 }
diff -r 8aab76743e58 -r eefd4bd45519 opal/datatype/opal_datatype_clone.c
--- a/opal/datatype/opal_datatype_clone.c   Wed Sep 02 11:23:54 2009 +
+++ b/opal/datatype/opal_datatype_clone.c   Wed Sep 02 17:34:01 2009 +
@@ -33,9 +33,13 @@
 int32_t opal_datatype_clone( const opal_datatype_t * src_type,
opal_datatype_t * dest_type )
 {
 int32_t desc_length = src_type->desc.used + 1;  /* +1 because of the
fake OPAL_DATATYPE_END_LOOP entry */
-dt_elem_desc_t* temp = dest_type->desc.desc; /* temporary copy of the
desc pointer */
+dt_elem_desc_t* temp = dest_type->desc.desc;/* temporary copy of
the desc pointer */

-memcpy( dest_type, src_type, sizeof(opal_datatype_t) );
+/* copy _excluding_ the super object, we want to keep the
cls_destruct_array */
+memcpy( dest_type+sizeof(opal_object_t),
+src_type+sizeof(opal_object_t),
+sizeof(opal_datatype_t)-sizeof(opal_object_t) )
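
For context, here is a minimal sketch (not part of the patch above) of the
usage pattern the commit message refers to: duplicating a predefined
datatype and then freeing the duplicate. The patch clears the PREDEFINED
flag on the clone, presumably so that MPI_Type_free() can actually release it.

#include "mpi.h"

int main(int argc, char **argv)
{
    MPI_Datatype dup;

    MPI_Init(&argc, &argv);

    /* Duplicate a predefined datatype ... */
    MPI_Type_dup(MPI_INT, &dup);
    /* ... and free the duplicate: the case ticket 2014 is about. */
    MPI_Type_free(&dup);

    MPI_Finalize();
    return 0;
}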

[OMPI devel] Problem with MPI_Type_indexed and hole (defined with MPI_Type_create_resized )

2010-03-17 Thread Pascal Deveze

Hi all,

I use a very simple datatype defined as follows:
lng[0]= 1;
   dsp[0]= 1;
   err=MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
   err=MPI_Type_create_resized(offtype, 0, 2, &filetype);
   MPI_Type_commit(&filetype);

This datatype consists of a hole (of length 1 char) followed by a char.

A datatype with a hole at the beginning is not correctly handled by the 
ROMIO integrated in OpenMPI (I tried with MPICH2 and it worked fine).

You will see below a program to reproduce the problem.

After investigation, I see that the difference between OpenMPI and 
MPICH appears at line 542 in the file romio/adio/common/flatten.c:


   case MPI_COMBINER_RESIZED:
   /* This is done similar to a type_struct with an lb, datatype, ub */

   /* handle the Lb */
   j = *curr_index;
   flat->indices[j] = st_offset + adds[0];
   flat->blocklens[j] = 0;

   (*curr_index)++;

   /* handle the datatype */

   MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
 &old_ntypes, &old_combiner);
   ADIOI_Datatype_iscontig(types[0], &old_is_contig); <== 
line 542


For MPICH2, the datatype is not contiguous, but it is for OpenMPI. The 
routine ADIOI_Datatype_iscontig is
quite different in OpenMPI because the datatypes are handled very 
differently. If I reset old_is_contig just after

line 542, the problem disappears (Of course, this is not a solution).

I am not able to propose a proper solution. Can somebody help?

Pascal

 Program to reproduce the problem 
#include <stdio.h>
#include "mpi.h"

char filename[256]="VIEW_TEST";
char buffer[100];
int err, i, myid, dsp[3], lng[3];
MPI_Status status;
MPI_File fh;
MPI_Datatype filetype, offtype;
MPI_Aint lb, extent;

int main(int argc, char **argv) {

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
for (i=0; i<sizeof(buffer); i++) buffer[i] = i;   /* fill the file content */
if (myid == 0) {
   MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_CREATE | 
MPI_MODE_RDWR , MPI_INFO_NULL, &fh);

   MPI_File_write(fh, buffer, sizeof(buffer), MPI_CHAR, &status);
   MPI_File_close(&fh);

   lng[0]= 1;
   dsp[0]= 1;
   MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
   MPI_Type_create_resized(offtype, 0, 2, &filetype);
   MPI_Type_commit(&filetype);

   MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_RDONLY , 
MPI_INFO_NULL, &fh);

   MPI_File_set_view(fh, 0, MPI_CHAR, filetype,"native", MPI_INFO_NULL);
   MPI_File_read(fh, buffer, 5, MPI_CHAR, &status);

   printf("Data: ");
   for (i=0 ; i<5 ; i++) printf(" %x ", buffer[i]);
   if (buffer[1] != 3) printf("\n ===>  test KO : buffer[1]=%d 
instead of %d \n", buffer[1], 4);

   else printf("\n ===> test OK\n");
   MPI_Type_free(&filetype);
   MPI_File_close(&fh);
}
MPI_Barrier(MPI_COMM_WORLD);
MPI_Finalize();
}
 The result of the program with MPICH2 
Data:  1  3  5  7  9
===> test OK

 The result of the program with OpenMPI 
Data:  0  2  4  6  8
===>  test KO : buffer[1]=2 instead of 4

Comment: Only the first hole is omitted.
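
For reference, a minimal sketch (not from the original report) that queries
the bounds of the same filetype. Assuming the construction above, it should
report lb=0, extent=2, true_lb=1 and true_extent=1: the data itself is one
contiguous byte, but it does not start at displacement zero, which is the
property the contiguity check misses here.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv) {
    MPI_Datatype offtype, filetype;
    int lng[1] = { 1 };
    int dsp[1] = { 1 };
    MPI_Aint lb, extent, true_lb, true_extent;

    MPI_Init(&argc, &argv);
    MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
    MPI_Type_create_resized(offtype, 0, 2, &filetype);
    MPI_Type_commit(&filetype);

    MPI_Type_get_extent(filetype, &lb, &extent);
    MPI_Type_get_true_extent(filetype, &true_lb, &true_extent);
    printf("lb=%ld extent=%ld true_lb=%ld true_extent=%ld\n",
           (long)lb, (long)extent, (long)true_lb, (long)true_extent);

    MPI_Type_free(&offtype);
    MPI_Type_free(&filetype);
    MPI_Finalize();
    return 0;
}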





Re: [OMPI devel] Problem with MPI_Type_indexed and hole (defined with MPI_Type_create_resized )

2010-03-18 Thread Pascal Deveze

Hi all,

Sorry, I missed something in my porting from MPICH2 to OpenMPI concerning the file 
romio/adio/common/flatten.c

(flatten.c in OpenMPI does not support MPI_COMBINER_RESIZED).

Here is the diff:

diff -u flatten.c flatten.c.old
--- flatten.c   2010-03-18 17:07:43.0 +0100
+++ flatten.c.old   2010-03-18 17:14:04.0 +0100
@@ -525,44 +525,6 @@
   }
   break;

-case MPI_COMBINER_RESIZED:
-/* This is done similar to a type_struct with an lb, datatype, ub */
-
-/* handle the Lb */
-   j = *curr_index;
-   flat->indices[j] = st_offset + adds[0];
-   flat->blocklens[j] = 0;
-
-   (*curr_index)++;
-
-   /* handle the datatype */
-
-   MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
- &old_ntypes, &old_combiner);
-   ADIOI_Datatype_iscontig(types[0], &old_is_contig);
-
-   if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
-   ADIOI_Flatten(types[0], flat, st_offset+adds[0], curr_index);
-   }
-   else {
-/* current type is basic or contiguous */
-   j = *curr_index;
-   flat->indices[j] = st_offset;
-   MPI_Type_size(types[0], (int*)&old_size);
-   flat->blocklens[j] = old_size;
-
-   (*curr_index)++;
-   }
-
-   /* take care of the extent as a UB */
-   j = *curr_index;
-   flat->indices[j] = st_offset + adds[0] + adds[1];
-   flat->blocklens[j] = 0;
-
-   (*curr_index)++;
-
-   break;
-
default:
   /* TODO: FIXME (requires changing prototypes to return errors...) */
   FPRINTF(stderr, "Error: Unsupported datatype passed to 
ADIOI_Flatten\n");

@@ -827,29 +789,6 @@
   }
   }
   break;
-
-case MPI_COMBINER_RESIZED:
-   /* treat it as a struct with lb, type, ub */
-
-   /* add 2 for lb and ub */
-   (*curr_index) += 2;
-   count += 2;
-
-   /* add for datatype */
-   MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
-  &old_ntypes, &old_combiner);
-   ADIOI_Datatype_iscontig(types[0], &old_is_contig);
-
-   if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
-   count += ADIOI_Count_contiguous_blocks(types[0], curr_index);
-   }
-   else {
-/* basic or contiguous type */
-   count++;
-   (*curr_index)++;
-   }
-   break;
-
default:
   /* TODO: FIXME */
   FPRINTF(stderr, "Error: Unsupported datatype passed to 
ADIOI_Count_contiguous_blocks, combiner = %d\n", combiner);



Regards,

Pascal

Pascal Deveze a écrit :

Hi all,

I use a very simple datatype defined as follow:
lng[0]= 1;
   dsp[0]= 1;
   err=MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
   err=MPI_Type_create_resized(offtype, 0, 2, &filetype);
   MPI_Type_commit(&filetype);

This datatype consists of a hole (of length 1 char) followed by a char.

The datatype with hole at the beginning is not correctly handled by 
ROMIO integrated in OpenMPI (I tried with MPICH2 and it worked fine).

You will see bellow a program to reproduce the problem.

After investigations, I see that the difference between OpenMPI and 
MPICH appears at line 542 in the file romio/adio/comm/flatten.c:


   case MPI_COMBINER_RESIZED:
   /* This is done similar to a type_struct with an lb, datatype, ub */

   /* handle the Lb */
   j = *curr_index;
   flat->indices[j] = st_offset + adds[0];
   flat->blocklens[j] = 0;

   (*curr_index)++;

   /* handle the datatype */

   MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
 &old_ntypes, &old_combiner);
   ADIOI_Datatype_iscontig(types[0], &old_is_contig); <== 
ligne 542


For MPICH2, the datatype is not contiguous, but it is for OpenMPI. The 
routine ADIOI_Datatype_iscontig is
quite different in OpenMPI because the datatypes are handled very 
differently. If I reset old_is_contig just after

line 542, the problem disappears (Of course, this is not a solution).

I am not able to propose a right solution. Can somebody help ?

Pascal

 Program to reproduce the problem 
#include 
#include "mpi.h"

char filename[256]="VIEW_TEST";
char buffer[100];
int err, i, myid, dsp[3], lng[3];
MPI_Status status;
MPI_File fh;
MPI_Datatype filetype, offtype;
MPI_Aint lb, extent;

int main(int argc, char **argv) {

MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myid);
for (i=0; i   MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_CREATE | 
MPI_MODE_RDWR , MPI_INFO_NULL, &fh);

   MPI_File_write(fh, buffer, sizeof(buffer), MPI_CHAR, &status);
   MPI_File_close(&fh);

   lng[0]= 1;
   dsp[0]= 1;
   MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
   MPI_Type_create_resized(offtype, 0, 2, &filet

Re: [OMPI devel] Problem with MPI_Type_indexed and hole (defined with MPI_Type_create_resized )

2010-03-19 Thread Pascal Deveze

Hi George,

I went further in my investigations, and I found a solution.

ADIOI_Datatype_iscontig is defined in the file 
ompi/mca/io/romio/src/io_romio_module.c as:


void ADIOI_Datatype_iscontig(MPI_Datatype datatype, int *flag)
{
   /*
* Open MPI contiguous check return true for datatype with
* gaps in the beginning and at the end. We have to provide
* a count of 2 in order to get these gaps taken into acount.
*/
   *flag = ompi_datatype_is_contiguous_memory_layout(datatype, 2);
}

It is clearly written here that the gaps should be taken into account 
with a count of 2. But that is not always the case.


Your suggestion is to modify the ROMIO code.
So, I propose to fix ADIOI_Datatype_iscontig and add the following code 
after the call

to ompi_datatype_is_contiguous_memory_layout():

   if (*flag) {
   MPI_Aint true_extent, true_lb;

   ompi_datatype_get_true_extent(datatype, &true_lb, &true_extent);

   if (true_lb > 0)
   *flag = 0;
   }

Regards,

Pascal

On Mar 18, 2010, at 13:24, George Bosilca wrote:
We will disagree on that, but your datatype is contiguous. It doesn't 
matter that there are gaps in the beginning and at the end, as long as 
you only send one such datatype the real data that has to go over the 
network _is_ contiguous. And this is what the Open MPI datatype engine 
is reporting back.


Apparently, ROMIO expect a contiguous datatype to start from the 
position 0 relative to the beginning of the user buffer. I don't see 
why they have such a restrictive view, but I guess the original MPICH 
datatype engine was not able to distinguish between gaps in the middle 
and gaps at the beginning and the end of the datatype.


I don't see how to fix that in ROMIO code. But in case you plan to fix 
it, the correct solution is to retrieve the true lower bound of the 
datatype in the contiguous case and add it to st_offset.


 george.

On Mar 18, 2010, at 12:27 , Pascal Deveze wrote:


 Hi all,

 Sorry, I missed my porting from MPICH2 to OpenMPI concerning the file 

romio/adio/comm/flatten.c

 (flatten.c in OpenMPI does not support MPI_COMBINER_RESIZED).

 Here is the diff:

 diff -u flatten.c flatten.c.old
 --- flatten.c 2010-03-18 17:07:43.0 +0100
 +++ flatten.c.old 2010-03-18 17:14:04.0 +0100
 @@ -525,44 +525,6 @@
 }
 break;
 - case MPI_COMBINER_RESIZED:
 - /* This is done similar to a type_struct with an lb, datatype, ub */
 -
 - /* handle the Lb */
 - j = *curr_index;
 - flat->indices[j] = st_offset + adds[0];
 - flat->blocklens[j] = 0;
 -
 - (*curr_index)++;
 -
 - /* handle the datatype */
 -
 - MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
 - &old_ntypes, &old_combiner);
 - ADIOI_Datatype_iscontig(types[0], &old_is_contig);
 -
 - if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
 - ADIOI_Flatten(types[0], flat, st_offset+adds[0], curr_index);
 - }
 - else {
 - /* current type is basic or contiguous */
 - j = *curr_index;
 - flat->indices[j] = st_offset;
 - MPI_Type_size(types[0], (int*)&old_size);
 - flat->blocklens[j] = old_size;
 -
 - (*curr_index)++;
 - }
 -
 - /* take care of the extent as a UB */
 - j = *curr_index;
 - flat->indices[j] = st_offset + adds[0] + adds[1];
 - flat->blocklens[j] = 0;
 -
 - (*curr_index)++;
 -
 - break;
 -
 default:
 /* TODO: FIXME (requires changing prototypes to return errors...) */
 FPRINTF(stderr, "Error: Unsupported datatype passed to 
ADIOI_Flatten\n");

 @@ -827,29 +789,6 @@
 }
 }
 break;
 -
 - case MPI_COMBINER_RESIZED:
 - /* treat it as a struct with lb, type, ub */
 -
 - /* add 2 for lb and ub */
 - (*curr_index) += 2;
 - count += 2;
 -
 - /* add for datatype */
 - MPI_Type_get_envelope(types[0], &old_nints, &old_nadds,
 - &old_ntypes, &old_combiner);
 - ADIOI_Datatype_iscontig(types[0], &old_is_contig);
 -
 - if ((old_combiner != MPI_COMBINER_NAMED) && (!old_is_contig)) {
 - count += ADIOI_Count_contiguous_blocks(types[0], curr_index);
 - }
 - else {
 - /* basic or contiguous type */
 - count++;
 - (*curr_index)++;
 - }
 - break;
 -
 default:
 /* TODO: FIXME */
 FPRINTF(stderr, "Error: Unsupported datatype passed to 

ADIOI_Count_contiguous_blocks, combiner = %d\n", combiner);



 Regards,

 Pascal

 Pascal Deveze a écrit :
> Hi all,
>
> I use a very simple datatype defined as follow:
> lng[0]= 1;
> dsp[0]= 1;
> err=MPI_Type_indexed(1, lng, dsp, MPI_CHAR, &offtype);
> err=MPI_Type_create_resized(offtype, 0, 2, &filetype);
> MPI_Type_commit(&filetype);
>
> This datatype consists of a hole (of length 1 char) followed by a 
char.

>
> The datatype with hole at the beginning is not correctly handled by 

ROMIO integrated in OpenMPI (I tried with MPICH2 and it worked fine).

> You will see bellow a program to reproduce the problem.
>
> After investigations, I see that the difference 


Re: [OMPI devel] Problem with MPI_Type_indexed and hole (defined with MPI_Type_create_resized )

2010-03-22 Thread Pascal Deveze

George,

You are right.
- I agree with you: The Open MPI ompi_datatype_is_contigous_memory runs 
correctly.
- The problem comes with ROMIO: They need a function that returns true 
if the content is contiguous AND the content starts at the pointer 
position (displacement zero).

- MPI Datatypes are a fancy world  ;-)

If you take a look at source 
ompi/mca/io/romio/romio/adio/common/iscontig.c you will see:


#if (defined(MPICH) || defined(MPICH2))
/* MPICH2 also provides this routine */
void MPIR_Datatype_iscontig(MPI_Datatype datatype, int *flag);

void ADIOI_Datatype_iscontig(MPI_Datatype datatype, int *flag)
{
  MPIR_Datatype_iscontig(datatype, flag);

  /* if it is MPICH2 and the datatype is reported as contigous,
 check if the true_lb is non-zero, and if so, mark the
 datatype as noncontiguous */
#ifdef MPICH2
  if (*flag) {
  MPI_Aint true_extent, true_lb;

  MPI_Type_get_true_extent(datatype, &true_lb, &true_extent);

  if (true_lb > 0)
  *flag = 0;
  }
#endif
}
#elif 

My proposal is just to take these last 12 lines and put them into 
ompi/mca/io/romio/src/io_romio_module.c to
conform to what ROMIO expects.

If my proposal is accepted, just apply my patch:

diff -u ompi/mca/io/romio/src/io_romio_module.c 
ompi/mca/io/romio/src/io_romio_module.c.OLD
--- ompi/mca/io/romio/src/io_romio_module.c 2010-03-19 
11:19:57.0 +0100
+++ ompi/mca/io/romio/src/io_romio_module.c.OLD 2010-03-22 
11:05:57.0 +0100

@@ -133,12 +133,4 @@
* a count of 2 in order to get these gaps taken into acount.
*/
   *flag = ompi_datatype_is_contiguous_memory_layout(datatype, 2);
-if (*flag) {
-MPI_Aint true_extent, true_lb;
-
-ompi_datatype_get_true_extent(datatype, &true_lb, &true_extent);
-
-if (true_lb > 0)
-*flag = 0;
-}
}

Pascal



On Mar 19, 2010, at 11:52, George Bosilca wrote:


Pascal,

I went inside the code, and I have to say it's a long tricky story. 
Let me try to sort it out:


- you create two types:
  - the indexed one containing just one element. This type is 
contiguous as there are no holes around the data, i.e. the size and 
the extent of this datatype are equal.
  - the resized one. This type resize the previous one by adding a 
hole in the beginning, thus it is not a contiguous type, even if the 
memory layout is in a single piece.


Now, let's go one step up in the ROMIO code attached to your previous 
email. You get the content of the main type, in this example RESIZED, 
and them the content of the internal type which is TYPE indexed. When 
you look if the internal type is contiguous, Open MPI answer yes as 
the indexed type has its extent equal to its size. While this is true, 
the fact that this type is resized make it non-contiguous, as by 
resizing it you explicitly alter the lower bound.


The fix you proposed in your last email (i.e. modify the ADIO is 
contig function) is a workaround. Let me think a little bit more about 
this. I'll be in right here, please read below...


If I read the MPI 2-2 standard in the Chapter about the Datatypes 
(page 87), at the section about the MPI_Type_indexed. I have the 
original typemap, i.e. the one for the MPI_CHAR type (char,0). When I 
create the indexed type I get the typemap (char, 1). Based on the 
definition of lower and upper bounds on the page 100, lb is equal to 1 
and ub is equal to 2, which make the extent of the indexed type equal 
to 1. So far so good. Now let's look what the MPI standard says about 
having multiple of such datatype in an array, aka MPI_Type_contiguous 
based on your MPI_Type_indexed. As a reminder you indexed type has the 
typemap (char, 1) and the extent 1. Based on the definition of 
MPI_Type_contiguous on page 84, the typemap of the 
MPI_Type_contiguous( 4, your_indexed_type) is: (char,1), (char, 2), 
(char, 3), (char, 4) which as far as I can say it is __contiguous__. 
So the Open MPI ompi_datatype_is_contigous_memory correctly returns 
the fact that the resulting datatype, even with a count greater than 1, is 
contiguous. Welcome to the fancy world of MPI datatypes.


Therefore, I think the Open MPI functions __really__ do the correct 
thing, and the problem is in the COMBINER_RESIZED code. As the 
datatype is explicitly resized by the user, you should not look if the 
previous type (types[0]) is contiguous or not, it doesn't matter as it 
was clearly resized. I wonder what the ROMIO developers had in mind 
for the ADIOI_Datatype_iscontig function, but it doesn't look like 
they just want to know if the content is contiguous. I guess this 
function return true if the content is contiguous AND the content 
start at the pointer position (displacement zero).


  george.


On Mar 19, 2010, at 06:14 , Pascal Deveze wrote:

 

Hi George,

I went further on my investigations, and I found a solution.

ADIOI_Datatype_iscontig is defined in the file 
ompi/mca/io/romio/src/

[OMPI devel] sendrecv_replace: long time to allocate/free memory

2010-04-22 Thread Pascal Deveze

Hi all,

The sendrecv_replace implementation in Open MPI seems to allocate/free its 
internal buffer with MPI_Alloc_mem()/MPI_Free_mem().


I measured the time to allocate/free a buffer of 1MB.
MPI_Alloc_mem/MPI_Free_mem take 350us while malloc/free only take 8us.
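
A minimal timing sketch (not the original benchmark) of how such numbers can
be obtained with MPI_Wtime, for one allocate/free cycle of a 1 MB buffer:

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char **argv) {
    const size_t sz = 1024 * 1024;   /* 1 MB, as in the measurement above */
    void *buf;
    double t0, t_mpi, t_libc;

    MPI_Init(&argc, &argv);

    /* One MPI_Alloc_mem/MPI_Free_mem cycle */
    t0 = MPI_Wtime();
    MPI_Alloc_mem((MPI_Aint)sz, MPI_INFO_NULL, &buf);
    MPI_Free_mem(buf);
    t_mpi = MPI_Wtime() - t0;

    /* One malloc/free cycle */
    t0 = MPI_Wtime();
    buf = malloc(sz);
    free(buf);
    t_libc = MPI_Wtime() - t0;

    printf("MPI_Alloc_mem/MPI_Free_mem: %.1f us   malloc/free: %.1f us\n",
           t_mpi * 1e6, t_libc * 1e6);

    MPI_Finalize();
    return 0;
}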

malloc/free in ompi/mpi/c/sendrecv_replace.c was replaced by 
MPI_Alloc_mem/MPI_Free_mem with this commit :


user:twoodall
date:Thu Sep 22 16:43:17 2005 +
summary: use MPI_Alloc_mem/MPI_Free_mem for internally allocated 
buffers


Is there a real reason to use these functions or can we move back to 
malloc/free ?
Is there a problem on my configuration explaining such slow performance 
with MPI_Alloc_mem ?


Pascal


Re: [OMPI devel] sendrecv_replace: long time to allocate/free memory

2010-04-30 Thread Pascal Deveze

On Fri, 23 Apr 2010 at 11:29:53, George Bosilca wrote:

If you use any kind of high performance network that require memory 
registration for communications, then this high cost for the 
MPI_Alloc_mem will be hidden by the communications. However, the 
MPI_Alloc_mem function seems horribly complicated to me, as we do the 
whole "find-the-right-allocator" step every time instead of caching 
it. While this might be improved, I'm pretty sure the major part of 
the overhead comes from the registration itself.
The MPI_Alloc_mem function allocate the memory and then it register it 
with the high speed interconnect (Infiniband as an example). If you 
don't have IB, then this should not happens. You can try to force the 
mpool to nothing, or disable the pinning 
(mpi_leave_pinned=0,mpi_leave_pinned_pipeline=0) to see if this affect 
the performances.


  
I have an IB cluster with 32-core nodes. A big part of my 
communications goes through sm, so systematically registering buffers 
with IB is killing performance for nothing.
Following your tip, I disabled the pinning (using "mpirun -mca 
mpi_leave_pinned 0 -mca mpi_leave_pinned_pipeline 0").
The cycle (MPI_Alloc_mem/MPI_Free_mem) now takes 120 us, while 
(malloc/free) takes 1 us.


In all cases, a program calling MPI_Sendrecv_replace() is heavily 
penalized by these calls to MPI_Alloc_mem/MPI_Free_mem.
That's why I propose going back to the malloc/free scheme in this 
routine.


Pascal

  george.

On Apr 22, 2010, at 08:50 , Pascal Deveze wrote:

 

Hi all,

The sendrecv_replace in Open MPI seems to allocate/free memory with 
MPI_Alloc_mem()/MPI_Free_mem()


I measured the time to allocate/free a buffer of 1MB.
MPI_Alloc_mem/MPI_Free_mem take 350us while malloc/free only take 8us.

malloc/free in ompi/mpi/c/sendrecv_replace.c was replaced by 
MPI_Alloc_mem/MPI_Free_mem with this commit :


user:twoodall
date:Thu Sep 22 16:43:17 2005 
summary: use MPI_Alloc_mem/MPI_Free_mem for internally allocated 
buffers


Is there a real reason to use these functions or can we move back to 
malloc/free ?
Is there a problem on my configuration explaining such slow 
performance with MPI_Alloc_mem ?


Pascal





[OMPI devel] New Romio for OpenMPI available in bitbucket

2010-09-17 Thread Pascal Deveze

Hi all,

In charge of ticket 1888 (see at 
https://svn.open-mpi.org/trac/ompi/ticket/1888) ,

I have put the resulting code in bitbucket at:
http://bitbucket.org/devezep/new-romio-for-openmpi/

The work in this repo consisted in refreshing ROMIO to a newer
version: the one from the very last MPICH2 release (mpich2-1.3b1).

Testing:
 1. runs fine except one minor error (see the explanation below) on 
various FS.

 2. runs fine with Lustre, but:
. had to add a small patch in romio/adio/ad_lustre_open.c
 3. see below how to efficiently run with Lustre.

You are invited to test and send comments

Enjoy !

Pascal

 The minor error ===
The test error.c fails because OpenMPI does not handle correctly the
"two level" error functions of ROMIO:
   error_code = MPIO_Err_create_code(MPI_SUCCESS, MPIR_ERR_RECOVERABLE,
   myname, __LINE__, MPI_ERR_ARG,
   "**iobaddisp", 0);
OpenMPI limits its view to MPI_ERR_ARG, but the real error is 
"**iobaddisp".


= How to test performance with Lustre ===
1) Compile with Lustre ADIO driver. For this, add the flag
   --with-io-romio-flags="--with-file-system=ufs+nfs+lustre" to 
your configure command.


2) Of course, you should have a Lustre file system mounted on all the 
nodes you will run on.


3) Take an application like coll_perf.c (in the test directory). In this 
application, change the
   three dimensions to 1000, which will create a file of 4 GB (big 
files are required in order
   to reach good performance with Lustre).

4) Put the highest possible striping_factor in the hints. For this, one 
solution is:
- If your Lustre file system has 16 OSTs, create a hints file with the 
following line:

   striping_factor 16
- Export the path to this file in the variable ROMIO_HINTS:
  export ROMIO_HINTS=my_directory/my_hints
If you do not specify the striping_factor, Lustre will set the default 
value (often 2 only). (A programmatic way to pass the same hint through 
an MPI_Info object is sketched after this list.)
You can verify the striping_factor set by Lustre with the following 
command:

lfs getstripe  (look at the value of lmm_stripe_count)
 Note: The striping_factor is set once at file creation and cannot be 
changed afterwards.


5) Run your test, specifying a file located in the Lustre file system.
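
As an alternative to the ROMIO_HINTS file in step 4, the striping_factor
hint can also be passed programmatically through an MPI_Info object at file
creation time. A minimal sketch (open_with_stripes is just an illustrative
helper name, and the value 16 is only an example matching the 16-OST case
above):

#include "mpi.h"

/* Open (create) a file with an explicit Lustre striping_factor hint. */
MPI_File open_with_stripes(const char *path)
{
    MPI_Info info;
    MPI_File fh;

    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_factor", "16");   /* one stripe per OST */

    MPI_File_open(MPI_COMM_WORLD, (char *)path,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
    MPI_Info_free(&info);
    return fh;
}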



[OMPI devel] Patch proposed: opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is called too late

2014-12-09 Thread Pascal Deveze

In the case where MPI is compiled with --enable-mpi-thread-multiple, a call to 
opal_using_threads() always returns 0 in the routine btl_xxx_component_init() 
of the BTLs, even if the application calls MPI_Init_thread() with 
MPI_THREAD_MULTIPLE.

This is because opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is 
called too late.
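
For reference, a minimal sketch of the call pattern described above:
requesting MPI_THREAD_MULTIPLE at initialization time, which is what should
make opal_using_threads() return true by the time the BTLs are initialized.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv) {
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        printf("MPI_THREAD_MULTIPLE not granted (provided=%d)\n", provided);
    }
    MPI_Finalize();
    return 0;
}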

I propose the following patch that solves the problem for me:

diff --git a/ompi/runtime/ompi_mpi_init.c b/ompi/runtime/ompi_mpi_init.c
index 35509cf..c2370fc 100644
--- a/ompi/runtime/ompi_mpi_init.c
+++ b/ompi/runtime/ompi_mpi_init.c
@@ -512,6 +512,13 @@ int ompi_mpi_init(int argc, char **argv, int requested, 
int *provided)
 }
#endif

+/* If thread support was enabled, then setup OPAL to allow for
+   them. */
+if ((OPAL_ENABLE_PROGRESS_THREADS == 1) ||
+(*provided != MPI_THREAD_SINGLE)) {
+opal_set_using_threads(true);
+}
+
 /* initialize datatypes. This step should be done early as it will
  * create the local convertor and local arch used in the proc
  * init.
@@ -724,13 +731,6 @@ int ompi_mpi_init(int argc, char **argv, int requested, 
int *provided)
goto error;
 }

-/* If thread support was enabled, then setup OPAL to allow for
-   them. */
-if ((OPAL_ENABLE_PROGRESS_THREADS == 1) ||
-(*provided != MPI_THREAD_SINGLE)) {
-opal_set_using_threads(true);
-}
-
 /* start PML/BTL's */
 ret = MCA_PML_CALL(enable(true));
 if( OMPI_SUCCESS != ret ) {


Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is called too late

2014-12-09 Thread Pascal Deveze
Hi Ralph,

This is in the trunk.

De : devel [mailto:devel-boun...@open-mpi.org] De la part de Ralph Castain
Envoyé : mardi 9 décembre 2014 09:32
À : Open MPI Developers
Objet : Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in 
ompi/runtime/ompi_mpi_init.c is called to late

Hi Pascal

Is this in the trunk or in the 1.8 series (or both)?


On Dec 9, 2014, at 12:28 AM, Pascal Deveze 
mailto:pascal.dev...@bull.net>> wrote:


In case where MPI is compiled with --enable-mpi-thread-multiple, a call to 
opal_using_threads() always returns 0 in the routine btl_xxx_component_init() 
of the BTLs, event if the application calls MPI_Init_thread() with 
MPI_THREAD_MULTIPLE.

This is because opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is 
called to late.

I propose the following patch that solves the problem for me:

diff --git a/ompi/runtime/ompi_mpi_init.c b/ompi/runtime/ompi_mpi_init.c
index 35509cf..c2370fc 100644
--- a/ompi/runtime/ompi_mpi_init.c
+++ b/ompi/runtime/ompi_mpi_init.c
@@ -512,6 +512,13 @@ int ompi_mpi_init(int argc, char **argv, int requested, 
int *provided)
 }
#endif

+/* If thread support was enabled, then setup OPAL to allow for
+   them. */
+if ((OPAL_ENABLE_PROGRESS_THREADS == 1) ||
+(*provided != MPI_THREAD_SINGLE)) {
+opal_set_using_threads(true);
+}
+
 /* initialize datatypes. This step should be done early as it will
  * create the local convertor and local arch used in the proc
  * init.
@@ -724,13 +731,6 @@ int ompi_mpi_init(int argc, char **argv, int requested, 
int *provided)
goto error;
 }

-/* If thread support was enabled, then setup OPAL to allow for
-   them. */
-if ((OPAL_ENABLE_PROGRESS_THREADS == 1) ||
-(*provided != MPI_THREAD_SINGLE)) {
-opal_set_using_threads(true);
-}
-
 /* start PML/BTL's */
 ret = MCA_PML_CALL(enable(true));
 if( OMPI_SUCCESS != ret ) {



Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is called too late

2014-12-12 Thread Pascal Deveze
George,

My initial problem is that when MPI is compiled with 
“--enable-mpi-thread-multiple”, the variable enable_mpi_threads is set to 1 
even if MPI_Init() is called in place of MPI_Init_thread().
I also saw that opal_using_threads() exists and is used by other BTLs.

Maybe the solution is to find a way to set enable_mpi_threads to 0 when 
MPI_Init() is called.


De : devel [mailto:devel-boun...@open-mpi.org] De la part de George Bosilca
Envoyé : vendredi 12 décembre 2014 07:03
À : Open MPI Developers
Objet : Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in 
ompi/runtime/ompi_mpi_init.c is called to late

On Thu, Dec 11, 2014 at 8:30 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Just to help me understand: I don’t think this change actually changed any 
behavior. However, it certainly *allows* a different behavior. Isn’t that true?

It depends how you look at this. To be extremely clear it prevents the modules 
from using anything else than their arguments to decide the provided threading 
model. With the current change, it is possible that some of the modules will 
continue to follow this "old" behavior, while others might switch to check 
opal_using_threads to see how they might behave.

My point here is not that one is better than the other, just that we 
inadvertently introduced a possibility for non-consistent behavior.

Let me take an example. In the old scheme, the PML was allowed to run each BTL 
in a separate thread, with absolutely no BTL support for thread safety. Thus, 
the PML could have managed all the interactions between BTL and requests in an 
atomic way, without the BTL knowing about. Now, if the BTL make his decision 
based on the value returned by opal_using_threads this approach is not possible 
anymore.

If so, I guess the real question is for Pascal at Bull: why do you feel this 
earlier setting is required?

This might allow to see if using functions that require protection, such as 
opal_lifo_push, will work by default or one should use directly their atomic 
version?

  George.



On Dec 11, 2014, at 4:21 PM, George Bosilca 
mailto:bosi...@icl.utk.edu>> wrote:

The overall design in OMPI was that no OMPI module should be allowed to decide 
if threads are on (thus it should not rely on the value returned by 
opal_using_threads during it's initialization stage). Instead, they should 
respect the level of thread support requested as an argument during the 
initialization step.

And this is true even for the BTLs. The PML component init function is 
propagating the  enable_progress_threads and enable_mpi_threads, down to the 
BML, and then to the BTL. This 2 variables, enable_progress_threads and 
enable_mpi_threads, are exactly what the ompi_mpi_init is using to compute the 
the value of the opal) using_thread (and that this patch moved).

The setting of the opal_using_threads was delayed during the initialization to 
ensure that it's value was not used to select a specific thread-level in any 
module, a behavior that is allowed now with the new setting.

A drastic change in behavior...

  George.


On Tue, Dec 9, 2014 at 3:33 AM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Kewl - I’ll fix. Thanks!

On Dec 9, 2014, at 12:32 AM, Pascal Deveze 
mailto:pascal.dev...@bull.net>> wrote:

Hi Ralph,

This in in the trunk.

De : devel [mailto:devel-boun...@open-mpi.org] De la part de Ralph Castain
Envoyé : mardi 9 décembre 2014 09:32
À : Open MPI Developers
Objet : Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in 
ompi/runtime/ompi_mpi_init.c is called to late

Hi Pascal

Is this in the trunk or in the 1.8 series (or both)?


On Dec 9, 2014, at 12:28 AM, Pascal Deveze 
mailto:pascal.dev...@bull.net>> wrote:


In case where MPI is compiled with --enable-mpi-thread-multiple, a call to 
opal_using_threads() always returns 0 in the routine btl_xxx_component_init() 
of the BTLs, event if the application calls MPI_Init_thread() with 
MPI_THREAD_MULTIPLE.

This is because opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is 
called to late.

I propose the following patch that solves the problem for me:

diff --git a/ompi/runtime/ompi_mpi_init.c b/ompi/runtime/ompi_mpi_init.c
index 35509cf..c2370fc 100644
--- a/ompi/runtime/ompi_mpi_init.c
+++ b/ompi/runtime/ompi_mpi_init.c
@@ -512,6 +512,13 @@ int ompi_mpi_init(int argc, char **argv, int requested, 
int *provided)
 }
#endif

+/* If thread support was enabled, then setup OPAL to allow for
+   them. */
+if ((OPAL_ENABLE_PROGRESS_THREADS == 1) ||
+(*provided != MPI_THREAD_SINGLE)) {
+opal_set_using_threads(true);
+}
+
 /* initialize datatypes. This step should be done early as it will
  * create the local convertor and local arch used in the proc
  * init.
@@ -724,13 +731,6 @@ int ompi_mpi_init(int argc, char **argv, int requested, 
int *provided)
goto error;
 }

-/* If thread

Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in ompi/runtime/ompi_mpi_init.c is called too late

2014-12-15 Thread Pascal Deveze
George,

Thanks for the patch. That was the solution.

Pascal

De : devel [mailto:devel-boun...@open-mpi.org] De la part de George Bosilca
Envoyé : samedi 13 décembre 2014 08:38
À : Open MPI Developers
Objet : Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in 
ompi/runtime/ompi_mpi_init.c is called to late

The source of this annoyance is the widely spread usage of 
OMPI_ENABLE_THREAD_MULTIPLE as an argument for all of the component init calls. 
This is obviously wrong as OMPI_ENABLE_THREAD_MULTIPLE is not about the 
requested support of thread support but about the less restrictive thread level 
supported by the library. Luckily the solution is simple, replace 
OMPI_ENABLE_THREAD_MULTIPLE by variable ompi_mpi_thread_multiple, and there 
should be no need for checking opal_using_threads in the initializers 
(open-mpi/ompi@343071498965a8f73d5f2b0c27a7ef404caf286c).

  George.


On Fri, Dec 12, 2014 at 2:58 AM, Pascal Deveze 
mailto:pascal.dev...@bull.net>> wrote:
George,

My initial problem is that when MPI is compiled with 
“--enable-mpi-thread-multiple”, the variable enable_mpi_threads is set to 1 
even if MPI_Init() is called in place of MPI_Init_thread().
I saw also that  opal_using_threads() exists and was used by other BTLs.

Maybe the solution is to find the way to set enable_mpi_threads to 0 when 
MPI_Init() is called.


De : devel 
[mailto:devel-boun...@open-mpi.org<mailto:devel-boun...@open-mpi.org>] De la 
part de George Bosilca
Envoyé : vendredi 12 décembre 2014 07:03

À : Open MPI Developers
Objet : Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in 
ompi/runtime/ompi_mpi_init.c is called to late

On Thu, Dec 11, 2014 at 8:30 PM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Just to help me understand: I don’t think this change actually changed any 
behavior. However, it certainly *allows* a different behavior. Isn’t that true?

It depends how you look at this. To be extremely clear it prevents the modules 
from using anything else than their arguments to decide the provided threading 
model. With the current change, it is possible that some of the modules will 
continue to follow this "old" behavior, while others might switch to check 
opal_using_threads to see how they might behave.

My point here is not that one is better than the other, just that we 
inadvertently introduced a possibility for non-consistent behavior.

Let me take an example. In the old scheme, the PML was allowed to run each BTL 
in a separate thread, with absolutely no BTL support for thread safety. Thus, 
the PML could have managed all the interactions between BTL and requests in an 
atomic way, without the BTL knowing about. Now, if the BTL make his decision 
based on the value returned by opal_using_threads this approach is not possible 
anymore.

If so, I guess the real question is for Pascal at Bull: why do you feel this 
earlier setting is required?

This might allow to see if using functions that require protection, such as 
opal_lifo_push, will work by default or one should use directly their atomic 
version?

  George.



On Dec 11, 2014, at 4:21 PM, George Bosilca 
mailto:bosi...@icl.utk.edu>> wrote:

The overall design in OMPI was that no OMPI module should be allowed to decide 
if threads are on (thus it should not rely on the value returned by 
opal_using_threads during it's initialization stage). Instead, they should 
respect the level of thread support requested as an argument during the 
initialization step.

And this is true even for the BTLs. The PML component init function is 
propagating the  enable_progress_threads and enable_mpi_threads, down to the 
BML, and then to the BTL. This 2 variables, enable_progress_threads and 
enable_mpi_threads, are exactly what the ompi_mpi_init is using to compute the 
the value of the opal) using_thread (and that this patch moved).

The setting of the opal_using_threads was delayed during the initialization to 
ensure that it's value was not used to select a specific thread-level in any 
module, a behavior that is allowed now with the new setting.

A drastic change in behavior...

  George.


On Tue, Dec 9, 2014 at 3:33 AM, Ralph Castain 
mailto:r...@open-mpi.org>> wrote:
Kewl - I’ll fix. Thanks!

On Dec 9, 2014, at 12:32 AM, Pascal Deveze 
mailto:pascal.dev...@bull.net>> wrote:

Hi Ralph,

This in in the trunk.

De : devel [mailto:devel-boun...@open-mpi.org] De la part de Ralph Castain
Envoyé : mardi 9 décembre 2014 09:32
À : Open MPI Developers
Objet : Re: [OMPI devel] Patch proposed: opal_set_using_threads(true) in 
ompi/runtime/ompi_mpi_init.c is called to late

Hi Pascal

Is this in the trunk or in the 1.8 series (or both)?


On Dec 9, 2014, at 12:28 AM, Pascal Deveze 
mailto:pascal.dev...@bull.net>> wrote:


In case where MPI is compiled with --enable-mpi-thread-multiple, a call to 
opal_using_threads() always returns 0 in the routine btl_xxx_component_init() 
of t