Re: [OMPI devel] Error in VT

2009-04-01 Thread Matthias Jurenz
Hi Leonardo,

I guess that your program uses POSIX threads and needs the MPI thread
support level MPI_THREAD_MULTIPLE, right?
Unfortunately, the OMPI-integrated version of VT supports neither
Pthreads nor any MPI thread level.

The latest "stand-alone" version of VT (5.6.3) supports at least
Pthreads and the MPI thread support levels MPI_THREAD_SINGLE and
MPI_THREAD_FUNNELED. So if you can change the MPI thread level
requirement to MPI_THREAD_SINGLE or MPI_THREAD_FUNNELED, tracing of
your code should work.
You can download the latest VT version at
http://www.tu-dresden.de/zih/vampirtrace. Please give it a try.
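
For illustration only, a minimal sketch (not taken from your code) of
requesting MPI_THREAD_FUNNELED at initialization, which VT 5.6.3 should
then be able to trace:

  /* Hypothetical example: ask for MPI_THREAD_FUNNELED so that only the
   * thread that initialized MPI makes MPI calls; other Pthreads may run
   * but must not call MPI. */
  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      int provided;
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      if (provided < MPI_THREAD_FUNNELED) {
          fprintf(stderr, "warning: got thread level %d\n", provided);
      }
      /* ... application code; MPI calls only from the main thread ... */
      MPI_Finalize();
      return 0;
  }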

Regards,
Matthias Jurenz

On Mon, 2009-03-30 at 19:04 +0200, Leonardo Fialho wrote:
> Hi Jeff,
> 
> There are...
> 
> Thanks a lot,
> Leonardo
> 
> Jeff Squyres escribió:
> > Can you send all the information listed here:
> >
> > http://www.open-mpi.org/community/help/
> >
> >
> > On Mar 30, 2009, at 11:46 AM, Leonardo Fialho wrote:
> >
> >> Hi,
> >>
> >> I'm experiencing the following errors while using Open MPI release
> >> 1.3.1 combined with VT.
> >>
> >> STAT P 2.258062 43.% 488.997562 0
> >> STAT P 2.260121 44.% 485.672638 0
> >> STAT P 2.262175 45.% 486.854935 0
> >> RFG_Regions_stackPop(): Error: Stack underflow
> >> RFG_Regions_stackPop(): Error: Stack underflow
> >> VampirTrace [vt_otf_trc.c:1300]: Resource temporarily unavailable
> >> [nodo1][[43845,1],0][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
> >> mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> >> VampirTrace [vt_otf_trc.c:1300]: Resource temporarily unavailable
> >> RFG_Regions_stackPop(): Error: Stack underflow
> >> VampirTrace [vt_otf_trc.c:1300]: Resource temporarily unavailable
> >> -- 
> >>
> >> mpirun has exited due to process rank 1 with PID 8814 on
> >> node nodo2 exiting without calling "finalize". This may
> >> have caused other processes in the application to be
> >> terminated by signals sent by mpirun (as reported here).
> >> -- 
> >>
> >> [fialho@aoclsd gmwat]$
> >>
> >> Across different executions, the error occurs at different points.
> >>
> >> Any help?
> >>
> >> Thanks,
> >>
> >> -- 
> >> Leonardo Fialho
> >> Computer Architecture and Operating Systems Department - CAOS
> >> Universidad Autonoma de Barcelona - UAB
> >> ETSE, Edifcio Q, QC/3088
> >> http://www.caos.uab.es
> >> Phone: +34-93-581-2888
> >> Fax: +34-93-581-2478
> >>
> >
> >
> 
> 
> plain text document attachment (environ.txt)
> declare -x 
> LD_LIBRARY_PATH="/home/fialho/local/tau-2.18.1p1/i386_linux/lib:/home/fialho/OSS/lib:/home/fialho/gmate/lib:/home/fialho/local/openmpi-1.3.1/lib:/home/fialho/dyninst/lib:/home/fialho/local/lib:/home/fialho/local/gsl-1.9/lib/:/home/fialho/dyninst/lib"




Re: [OMPI devel] SM init failures

2009-04-01 Thread Iain Bason


On Mar 31, 2009, at 11:00 AM, Jeff Squyres wrote:


On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:


Sorry to continue off-topic but going to System V shm would be for me
like going back in the past.

System V shared memory used to be the main way to do shared memory on
MPICH and from my (little) experience, this was truly painful:
 - Cleanup issues: does shmctl(IPC_RMID) solve _all_ cases? (even kill -9?)
 - Naming issues: shm segments are identified by a 32-bit key, potentially
   causing conflicts between applications or layers of the same application
   on one node
 - Space issues: the total shm size on a system is bounded by
   /proc/sys/kernel/shmmax, needing admin configuration and causing
   conflicts between MPI applications running on the same node


Indeed.  The one saving grace here is that the cleanup issues
apparently can be solved on Linux with a special flag that indicates
"automatically remove this shmem when all processes attaching to it
have died."  That was really the impetus for [re-]investigating sysv
shm.  I, too, remember the sysv pain because we used it in LAM, too...


What about the other issues?  I remember those being a PITA about 15  
to 20 years ago, but obviously a lot could have improved in the  
meantime.


Iain



[OMPI devel] mallopt fixes

2009-04-01 Thread Jeff Squyres
Inevitably, when you're testing in your own, private environment,  
everything works great.  You test test test and are finally convinced  
that it's all perfect.  Seconds after you merge it into the main SVN  
trunk, you find a dozen little mistakes.  Sigh.  :-\


After a bunch of SVN commits, I think I have all the mallopt fixes on  
the SVN trunk.  Please test as much as you can.  We should let this  
soak on the trunk for a day or three before moving it to the v1.3  
branch (CMR already filed).


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r20926

2009-04-01 Thread Jeff Squyres

Ah -- good catch.  Thanks.

Should the same fixes be applied to type_create_keyval_f.c and  
win_create_keyval_f.c?



On Apr 1, 2009, at 3:31 PM,  wrote:


Author: igb
Date: 2009-04-01 15:31:46 EDT (Wed, 01 Apr 2009)
New Revision: 20926
URL: https://svn.open-mpi.org/trac/ompi/changeset/20926

Log:
Fix Fortran bindings for MPI_KEYVAL_CREATE and MPI_COMM_CREATE_KEYVAL.

The EXTRA_STATE parameter is passed by reference, and thus should be
dereferenced before it is stored.  Similarly, the stored value should
be passed by reference to the copy and delete routines.

This fixes #1864.

Text files modified:
   trunk/ompi/attribute/attribute.c          |    16 ++--
   trunk/ompi/mpi/f77/comm_create_keyval_f.c |     2 +-
   trunk/ompi/mpi/f77/keyval_create_f.c      |     2 +-
   3 files changed, 12 insertions(+), 8 deletions(-)

Modified: trunk/ompi/attribute/attribute.c
==============================================================================
--- trunk/ompi/attribute/attribute.c    (original)
+++ trunk/ompi/attribute/attribute.c    2009-04-01 15:31:46 EDT (Wed, 01 Apr 2009)
@@ -249,9 +249,10 @@
     /* MPI-1 Fortran-style */ \
     if (0 != (keyval_obj->attr_flag & OMPI_KEYVAL_F77_MPI1)) { \
         MPI_Fint attr_val = translate_to_fortran_mpi1(attribute); \
+        MPI_Fint extra_state = (MPI_Fint)keyval_obj->extra_state; \
         (*((keyval_obj->delete_attr_fn).attr_mpi1_fortran_delete_fn)) \
             (&(((ompi_##type##_t *)object)->attr_##type##_f), \
-             &f_key, &attr_val, (int*)keyval_obj->extra_state, &f_err); \
+             &f_key, &attr_val, &extra_state, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             if (need_lock) { \
                 OPAL_THREAD_UNLOCK(&alock); \
@@ -262,9 +263,10 @@
     /* MPI-2 Fortran-style */ \
     else { \
         MPI_Aint attr_val = translate_to_fortran_mpi2(attribute); \
+        MPI_Aint extra_state = (MPI_Aint)keyval_obj->extra_state; \
         (*((keyval_obj->delete_attr_fn).attr_mpi2_fortran_delete_fn)) \
             (&(((ompi_##type##_t *)object)->attr_##type##_f), \
-             &f_key, (int*)&attr_val, (int*)keyval_obj->extra_state, &f_err); \
+             &f_key, (int*)&attr_val, &extra_state, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             if (need_lock) { \
                 OPAL_THREAD_UNLOCK(&alock); \
@@ -297,11 +299,12 @@
     ompi_fortran_logical_t f_flag; \
     /* MPI-1 Fortran-style */ \
     if (0 != (keyval_obj->attr_flag & OMPI_KEYVAL_F77_MPI1)) { \
-        MPI_Fint in, out; \
+        MPI_Fint in, out, extra_state; \
         in = translate_to_fortran_mpi1(in_attr); \
+        extra_state = (MPI_Fint)keyval_obj->extra_state; \
         (*((keyval_obj->copy_attr_fn).attr_mpi1_fortran_copy_fn)) \
             (&(((ompi_##type##_t *)old_object)->attr_##type##_f), \
-             &f_key, (int*)keyval_obj->extra_state, \
+             &f_key, &extra_state, \
              &in, &out, &f_flag, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             OPAL_THREAD_UNLOCK(&alock); \
@@ -313,11 +316,12 @@
     } \
     /* MPI-2 Fortran-style */ \
     else { \
-        MPI_Aint in, out; \
+        MPI_Aint in, out, extra_state; \
         in = translate_to_fortran_mpi2(in_attr); \
+        extra_state = (MPI_Aint)keyval_obj->extra_state; \
         (*((keyval_obj->copy_attr_fn).attr_mpi2_fortran_copy_fn)) \
             (&(((ompi_##type##_t *)old_object)->attr_##type##_f), \
-             &f_key, keyval_obj->extra_state, &in, &out, \
+             &f_key, &extra_state, &in, &out, \
              &f_flag, &f_err); \
         if (MPI_SUCCESS != OMPI_FINT_2_INT(f_err)) { \
             OPAL_THREAD_UNLOCK(&alock); \

Modified: trunk/ompi/mpi/f77/comm_create_keyval_f.c
==============================================================================
--- trunk/ompi/mpi/f77/comm_create_keyval_f.c   (original)
+++ trunk/ompi/mpi/f77/comm_create_keyval_f.c   2009-04-01 15:31:46 EDT (Wed, 01 Apr 2009)
@@ -79,7 +79,7 @@
        to the old MPI-1 INTEGER-parameter functions). */

     ret = ompi_attr_create_keyval(COMM_ATTR, copy_fn, del_fn,
-                                  comm_keyval, extra_state, OMPI_KEYVAL_F77,
+                                  comm_keyval, (void*)*extra_state, OMPI_KEYVAL_F77,
                                   NULL);

     if (MPI_SUCCESS != ret) {

Modified: trunk/ompi/mpi/f77/keyval_create_f.c
==============================================================================
--- trunk/ompi/mpi/f77/keyval_create_f.c    (original)
+++ trunk/ompi/mpi/f77/keyval_create_f.c

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r20926

2009-04-01 Thread Iain Bason


On Apr 1, 2009, at 4:29 PM, Jeff Squyres wrote:

Should the same fixes be applied to type_create_keyval_f.c and  
win_create_keyval_f.c?


Good question.  I'll have a look at them.

Iain



Re: [OMPI devel] SM init failures

2009-04-01 Thread Ashley Pittman
On Tue, 2009-03-31 at 11:00 -0400, Jeff Squyres wrote:
> On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:
> > System V shared memory used to be the main way to do shared memory on
> > MPICH and from my (little) experience, this was truly painful :
> >   - Cleanup issues : does shmctl(IPC_RMID) solve _all_ cases ? (even
> > kill -9 ?)
> Indeed.  The one saving grace here is that the cleanup issues  
> apparently can be solved on Linux with a special flag that indicates  
> "automatically remove this shmem when all processes attaching to it  
> have died."  That was really the impetus for [re-]investigating sysv  
> shm.  I, too, remember the sysv pain because we used it in LAM, too...

Unless there is something newer than IPC_RMID that I haven't heard of,
this is far from a complete solution.  Setting RMID causes the segment
to be deleted when the attach count becomes zero, so it handles the
kill -9 case; however, it has the downside that once it has been set no
further processes can attach to the memory, so you have to leave a
window during init during which any crash will leave the memory behind.
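
To make that window concrete, a rough sketch (illustrative only, not
Open MPI code) of the attach-then-RMID pattern:

  /* Create the segment, wait until every local process has attached,
   * then mark it IPC_RMID.  The kernel removes it when the attach
   * count drops to zero (covers kill -9), but a crash between shmget()
   * and shmctl(IPC_RMID) leaks the segment, and once RMID is set no
   * further processes can attach. */
  #include <sys/ipc.h>
  #include <sys/shm.h>
  #include <stddef.h>

  void *attach_then_mark(key_t key, size_t size, int creator)
  {
      int id = shmget(key, size, creator ? (IPC_CREAT | 0600) : 0600);
      if (id < 0) return NULL;

      void *addr = shmat(id, NULL, 0);
      if (addr == (void *) -1) return NULL;

      /* ... barrier: all local processes must have attached ... */

      if (creator) {
          shmctl(id, IPC_RMID, NULL);  /* removed once the last one detaches */
      }
      return addr;
  }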

I've always been of the opinion that mmapping shared files was a much
more advanced solution.

Ashley Pittman.



[OMPI devel] Open MPI 2009 released

2009-04-01 Thread George Bosilca

The Open MPI Team, representing a consortium of bailed-out banks, car
manufacturers, and insurance companies, is pleased to announce the
release of the "unbreakable" / bug-free version Open MPI 2009,
(expected to be available by mid-2011).  This release is essentially a
complete rewrite of Open MPI based on new technologies such as C#,
Java, and object-oriented Cobol (so say we all!).  Buffer overflows
and memory leaks are now things of the past.  We strongly recommend
that all users upgrade to Windows 7 to fully take advantage of the new
powers embedded in Open MPI.

This version can be downloaded from the The Onion web site or from
many BitTorrent networks (seeding now; the Open MPI ISO is
approximately 3.97GB -- please wait for the full upload).

Here is an abbreviated list of changes in Open MPI 2009 as compared to
the previous version:

- Dropped support for MPI 2 in favor of the newly enhanced MPI 11.7
 standard.  MPI_COOK_DINNER support is only available with additional
 equipment (some assembly may be required).  An experimental PVM-like
 API has been introduced to deal with the current limitations of the
 MPI 11.7 API.
- Added a Twitter network transport capable of achieving peta-scale
 per second bandwidth (but only on useless data).
- Dropped support for the barely-used x86 and x86_64 architectures in
 favor of the most recent ARM6 architecture.  As a direct result,
 several Top500 sites are planning to convert from their now obsolete
 peta-scale machines to high-reliability iPhone clusters using the
 low-latency AT&T 3G network.
- The iPhone iMPI app (powered by iOpen MPI) is now downloadable from
 the iTunes Store.  Blackberry support will be included in a future
 release.
- Fix all compiler errors related to the PGI 8.0 compiler by
 completely dropping support.
- Add some "green" features for energy savings.  The new "--bike"
 mpirun option will run your parallel jobs only during the
 operation hours of the official Open MPI biking team.  The
 "--preload-result" option will directly embed the final result in
 the parallel execution, leading to more scalable and reliable runs
 and decreasing the execution time of any parallel application under
 the real-time limit of 1 second.  Open MPI is therefore EnergyStar
 compliant when used with these options.
- In addition to moving Open MPI's lowest point-to-point transports to
 be an external project, limited support will be offered for
 industry-standard platforms.  Our focus will now be to develop
 highly scalable transports based on widely distributed technologies
 such as SMTP, High Performance Gopher (v3.8 and later), OLE COMM,
 RSS/Atom, DNS, and Bonjour.
- Opportunistic integration with Conflicker in order to utilize free
 resources distributed world-wide.
- Support for all Fortran versions prior to Fortran 2020 has been
 dropped.

Make today an Open MPI day!




Re: [OMPI devel] SM init failures

2009-04-01 Thread Jeff Squyres

So everyone hates SYSV.  Ok.  :-)

Given that some of the problems we've been having with mmap have been
due to filesystem issues, should we just unlink() the file once all
processes have mapped it?  I believe we didn't do that originally for
two reasons:


- leave it around for debugging purposes
- possibly supporting MPI-2 dynamics someday

We still don't support the sm BTL for dynamics, so why not unlink()?   
(I'm probably forgetting something obvious...?)
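
For reference, a rough sketch (illustrative only, not the actual sm BTL
code) of the map-then-unlink pattern being proposed:

  /* Create and mmap the backing file, wait until every local process
   * has mapped it, then unlink() it.  The mapping stays valid until the
   * last process unmaps or exits, so nothing is left behind on the
   * filesystem even after a kill -9. */
  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>
  #include <stddef.h>

  void *map_shared_file(const char *path, size_t size, int creator)
  {
      int fd = creator ? open(path, O_RDWR | O_CREAT, 0600)
                       : open(path, O_RDWR);
      if (fd < 0) return NULL;
      if (creator && ftruncate(fd, (off_t) size) != 0) {
          close(fd);
          return NULL;
      }

      void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
      close(fd);                 /* the mapping keeps the memory alive */
      if (addr == MAP_FAILED) return NULL;

      /* ... barrier: all local processes must have mmap'ed the file ... */

      if (creator) {
          unlink(path);          /* name gone; memory persists until unmapped */
      }
      return addr;
  }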




On Apr 1, 2009, at 5:12 PM, Ashley Pittman wrote:


On Tue, 2009-03-31 at 11:00 -0400, Jeff Squyres wrote:
> On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:
> > System V shared memory used to be the main way to do shared memory on
> > MPICH and from my (little) experience, this was truly painful :
> >   - Cleanup issues : does shmctl(IPC_RMID) solve _all_ cases ? (even
> > kill -9 ?)
> Indeed.  The one saving grace here is that the cleanup issues
> apparently can be solved on Linux with a special flag that indicates
> "automatically remove this shmem when all processes attaching to it
> have died."  That was really the impetus for [re-]investigating sysv
> shm.  I, too, remember the sysv pain because we used it in LAM, too...

Unless there is something newer than IPC_RMID that I haven't heard of
this is far from a complete solution, setting RMID causes it to be
deleted when the attach count becomes zero so it handles the kill -9
case however it has the down side that once it's been set no further
processes can attach to the memory so you have to leave a window during
init during which any crash will leave the memory.

I've always been of the opinion that mmaping shared files was a much
more advanced solution.

Ashley Pittman.




--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] SM init failures

2009-04-01 Thread Ralph Castain
IIRC, we certainly used to unlink the file after init. Are you sure  
somebody changed that?



On Apr 1, 2009, at 4:29 PM, Jeff Squyres wrote:


So everyone hates SYSV.  Ok.  :-)

Given that part of the problems we've been having with mmap have  
been due to filesystem issues, should we just unlink() the file once  
all processes have mapped it?  I believe we didn't do that  
originally for two reasons:


- leave it around for debugging purposes
- possibly supporting MPI-2 dynamics someday

We still don't support the sm BTL for dynamics, so why not  
unlink()?  (I'm probably forgetting something obvious...?)




On Apr 1, 2009, at 5:12 PM, Ashley Pittman wrote:


On Tue, 2009-03-31 at 11:00 -0400, Jeff Squyres wrote:
> On Mar 31, 2009, at 3:45 AM, Sylvain Jeaugey wrote:
> > System V shared memory used to be the main way to do shared memory on
> > MPICH and from my (little) experience, this was truly painful :
> >   - Cleanup issues : does shmctl(IPC_RMID) solve _all_ cases ? (even
> > kill -9 ?)
> Indeed.  The one saving grace here is that the cleanup issues
> apparently can be solved on Linux with a special flag that indicates
> "automatically remove this shmem when all processes attaching to it
> have died."  That was really the impetus for [re-]investigating sysv
> shm.  I, too, remember the sysv pain because we used it in LAM, too...

Unless there is something newer than IPC_RMID that I haven't heard of
this is far from a complete solution, setting RMID causes it to be
deleted when the attach count becomes zero so it handles the kill -9
case however it has the down side that once it's been set no further
processes can attach to the memory so you have to leave a window during
init during which any crash will leave the memory.

I've always been of the opinion that mmaping shared files was a much
more advanced solution.

Ashley Pittman.




--
Jeff Squyres
Cisco Systems





Re: [OMPI devel] Open MPI 2009 released

2009-04-01 Thread Paul H. Hargrove

Bravo!! This is beautiful.
By far my favorite part is "Cobol (so say we all!)".
However, I question why ARM6 was targeted as opposed to ARM7 ;-)

-Paul

George Bosilca wrote:

The Open MPI Team, representing a consortium of bailed-out banks, car
manufacturers, and insurance companies, is pleased to announce the
release of the "unbreakable" / bug-free version Open MPI 2009,
(expected to be available by mid-2011).  This release is essentially a
complete rewrite of Open MPI based on new technologies such as C#,
Java, and object-oriented Cobol (so say we all!).  Buffer overflows
and memory leaks are now things of the past.  We strongly recommend
that all users upgrade to Windows 7 to fully take advantage of the new
powers embedded in Open MPI.

This version can be downloaded from the The Onion web site or from
many BitTorrent networks (seeding now; the Open MPI ISO is
approximately 3.97GB -- please wait for the full upload).

Here is an abbreviated list of changes in Open MPI 2009 as compared to
the previous version:

- Dropped support for MPI 2 in favor of the newly enhanced MPI 11.7
 standard.  MPI_COOK_DINNER support is only available with additional
 equipment (some assembly may be required).  An experimental PVM-like
 API has been introduced to deal with the current limitations of the
 MPI 11.7 API.
- Added a Twitter network transport capable of achieving peta-scale
 per second bandwidth (but only on useless data).
- Dropped support for the barely-used x86 and x86_64 architectures in
 favor of the most recent ARM6 architecture.  As a direct result,
 several Top500 sites are planning to convert from their now obsolete
 peta-scale machines to high-reliability iPhone clusters using the
 low-latency AT&T 3G network.
- The iPhone iMPI app (powered by iOpen MPI) is now downloadable from
 the iTunes Store.  Blackberry support will be included in a future
 release.
- Fix all compiler errors related to the PGI 8.0 compiler by
 completely dropping support.
- Add some "green" features for energy savings.  The new "--bike"
 mpirun option will only run your parallel jobs only during the
 operation hours of the official Open MPI biking team.  The
 "--preload-result" option will directly embed the final result in
 the parallel execution, leading to more scalable and reliable runs
 and decreasing the execution time of any parallel application under
 the real-time limit of 1 second.  Open MPI is therefore EnergyStar
 compliant when used with these options.
- In addition to moving Open MPI's lowest point-to-point transports to
 be an external project, limited support will be offered for
 industry-standard platforms.  Our focus will now be to develop
 highly scalable transports based on widely distributed technologies
 such as SMTP, High Performance Gopher (v3.8 and later), OLE COMM,
 RSS/Atom, DNS, and Bonjour.
- Opportunistic integration with Conflicker in order to utilize free
 resources distributed world-wide.
- Support for all Fortran versions prior to Fortran 2020 has been
 dropped.

Make today an Open MPI day!





--
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group Tel: +1-510-495-2352
HPC Research Department   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory 



Re: [OMPI devel] Open MPI 2009 released

2009-04-01 Thread Jeff Squyres (jsquyres)
My wife thought it was frackin' brilliant.  :)

-jms
Sent from my PDA.  No type good.

- Original Message -
From: devel-boun...@open-mpi.org 
To: Open MPI Developers 
Sent: Wed Apr 01 18:58:55 2009
Subject: Re: [OMPI devel] Open MPI 2009 released

Bravo!! This is beautiful.
By far my favorite part is "Cobol (so say we all!)".
However, I question why ARM6 was targeted as opposed to ARM7 ;-)

-Paul

George Bosilca wrote:
> The Open MPI Team, representing a consortium of bailed-out banks, car
> manufacturers, and insurance companies, is pleased to announce the
> release of the "unbreakable" / bug-free version Open MPI 2009,
> (expected to be available by mid-2011).  This release is essentially a
> complete rewrite of Open MPI based on new technologies such as C#,
> Java, and object-oriented Cobol (so say we all!).  Buffer overflows
> and memory leaks are now things of the past.  We strongly recommend
> that all users upgrade to Windows 7 to fully take advantage of the new
> powers embedded in Open MPI.
>
> This version can be downloaded from the The Onion web site or from
> many BitTorrent networks (seeding now; the Open MPI ISO is
> approximately 3.97GB -- please wait for the full upload).
>
> Here is an abbreviated list of changes in Open MPI 2009 as compared to
> the previous version:
>
> - Dropped support for MPI 2 in favor of the newly enhanced MPI 11.7
>  standard.  MPI_COOK_DINNER support is only available with additional
>  equipment (some assembly may be required).  An experimental PVM-like
>  API has been introduced to deal with the current limitations of the
>  MPI 11.7 API.
> - Added a Twitter network transport capable of achieving peta-scale
>  per second bandwidth (but only on useless data).
> - Dropped support for the barely-used x86 and x86_64 architectures in
>  favor of the most recent ARM6 architecture.  As a direct result,
>  several Top500 sites are planning to convert from their now obsolete
>  peta-scale machines to high-reliability iPhone clusters using the
>  low-latency AT&T 3G network.
> - The iPhone iMPI app (powered by iOpen MPI) is now downloadable from
>  the iTunes Store.  Blackberry support will be included in a future
>  release.
> - Fix all compiler errors related to the PGI 8.0 compiler by
>  completely dropping support.
> - Add some "green" features for energy savings.  The new "--bike"
>  mpirun option will only run your parallel jobs only during the
>  operation hours of the official Open MPI biking team.  The
>  "--preload-result" option will directly embed the final result in
>  the parallel execution, leading to more scalable and reliable runs
>  and decreasing the execution time of any parallel application under
>  the real-time limit of 1 second.  Open MPI is therefore EnergyStar
>  compliant when used with these options.
> - In addition to moving Open MPI's lowest point-to-point transports to
>  be an external project, limited support will be offered for
>  industry-standard platforms.  Our focus will now be to develop
>  highly scalable transports based on widely distributed technologies
>  such as SMTP, High Performance Gopher (v3.8 and later), OLE COMM,
>  RSS/Atom, DNS, and Bonjour.
> - Opportunistic integration with Conflicker in order to utilize free
>  resources distributed world-wide.
> - Support for all Fortran versions prior to Fortran 2020 has been
>  dropped.
>
> Make today an Open MPI day!
>
>


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group Tel: +1-510-495-2352
HPC Research Department   Fax: +1-510-486-6900
Lawrence Berkeley National Laboratory 



Re: [OMPI devel] SM init failures

2009-04-01 Thread Jeff Squyres

On Apr 1, 2009, at 6:58 PM, Ralph Castain wrote:


IIRC, we certainly used to unlink the file after init. Are you sure
somebody changed that?



It looks like we unlink() it during btl sm component close  
(effectively during MPI_FINALIZE), not before.


--
Jeff Squyres
Cisco Systems



[OMPI devel] trac 1857: SM btl hangs when msg >=4k

2009-04-01 Thread Eugene Loh
In osu_bw, process 0 pumps lots of Isends to process 1, and process 1
in turn posts lots of matching Irecvs, so many messages are in flight at
once.  The question is what happens when resources are exhausted and OMPI
cannot handle that much in-flight traffic.  Let's specifically consider
the case of long (rendezvous) messages; a rough sketch of the traffic
pattern is below, and there are at least two situations to consider.
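
(The sketch is an assumption about the shape of the benchmark, not its
actual source:)

  #include <mpi.h>
  #define WINDOW  64
  #define MSGSIZE (64 * 1024)   /* large enough to use the rendezvous path */

  /* Rank 0 floods rank 1 with a window of Isends; rank 1 posts the
   * matching Irecvs; both then wait.  Each long message needs a
   * rendezvous ACK, which is where fragment exhaustion can bite. */
  static void pump(int rank, char *buf, MPI_Comm comm)
  {
      MPI_Request req[WINDOW];
      int i;
      for (i = 0; i < WINDOW; i++) {
          if (0 == rank)
              MPI_Isend(buf + i * MSGSIZE, MSGSIZE, MPI_CHAR, 1, i, comm, &req[i]);
          else
              MPI_Irecv(buf + i * MSGSIZE, MSGSIZE, MPI_CHAR, 0, i, comm, &req[i]);
      }
      MPI_Waitall(WINDOW, req, MPI_STATUSES_IGNORE);
  }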


1) When the sender no longer has any fragments (nor can grow its free 
list any more), it queues a send up with add_request_to_send_pending() 
and somehow life is good.  The PML seems to handle this case "correctly".


2) When the receiver -- specifically 
mca_pml_ob1_recv_request_ack_send_btl() -- no longer has any fragments 
to send ACKs back to confirm readiness for rendezvous, the 
resource-exhaustion signal travels up the call stack to 
mca_pml_ob1_recv_request_ack_send(), who does a 
MCA_PML_OB1_ADD_ACK_TO_PENDING().  In short, the PML adds the ACK to 
pckt_pending.  Somehow, this code path doesn't work.


The reason we see the problem now is that I added "autosizing" of the
shared-memory area.  We used to mmap *WAY* too much shared memory for
small-np jobs.  (Yes, that's a subjective statement.)  Meanwhile, at
large-np, we didn't mmap enough and jobs wouldn't start.  (Objective
statement there.)  So, I added heuristics to size the shared area
"appropriately".  The heuristics basically targeted the needs of
MPI_Init().  If you want fragment free lists to grow on demand after
MPI_Init(), you now basically have to bump mpool_sm_min_size up explicitly.
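
For example (the value here is purely illustrative, not a recommendation),
bumping the pool size explicitly would look something like:

  mpirun --mca mpool_sm_min_size 134217728 -np 2 ./osu_bw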


I'd like feedback on a fix.  Here are two options:

A) Someone (could be I) increases the default resources.  E.g., we could 
start with a larger eager free list.  Or, I could change those 
"heuristics" to allow some amount of headroom for free lists to grow on 
demand.  Either way, I'd appreciate feedback on how big to set these things.


B) Someone (not I, since I don't know how) fixes the ob1 PML to handle 
scenario 2 above correctly.