[OMPI devel] openmpi-1.2.4 compilation error in orte_abort.c on Fedora 8 - patch included

2007-12-10 Thread Sebastian Schmitzdorff

Hi,

on Fedora 8 x86_64 openmpi-1.2.4 doesn't compile.
A quick glance at the nightly openmpi snapshot leads me to the
conclusion that this is still the case.


In function 'open',
  inlined from 'orte_abort' at runtime/orte_abort.c:91:
/usr/include/bits/fcntl2.h:51: error: call to '__open_missing_mode' declared with attribute error: open with O_CREAT in second argument needs 3 arguments

make[1]: *** [runtime/orte_abort.lo] Error 1
make[1]: Leaving directory `/var/tmp/OFED_topdir/BUILD/openmpi-1.2.4/orte'
make: *** [all-recursive] Error 1


The call to "open" at orte_abort.c:91 is missing the file mode argument.
fcntl2.h doesn't allow this anymore.

Please find the simple diff below.


--- runtime/orte_abort.c 2007-12-10 00:01:50.0 +0100
+++ test 2007-12-10 00:01:00.0 +0100
@@ -88,7 +88,7 @@
   ORTE_ERROR_LOG(ORTE_ERR_OUT_OF_RESOURCE);
   goto CLEANUP;
   }
-fd = open(abort_file, O_CREAT);
+fd = open(abort_file, O_CREAT, 0666);
   if (0 < fd) close(fd);
   }


Hope this is the right place for the diff.

regards
sebastian

--

Sebastian Schmitzdorff - Managing Director
Hamburgnet
http://www.hamburgnet.de
Kottwitzstrasse 49 D-20253 Hamburg
fon: +49 40 736 72-322 fax: +49 40 736 72-321
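
For readers who want to see the fix in isolation, here is a minimal sketch, separate from the Open MPI sources, of creating an empty marker file with an explicit mode. The helper name touch_marker_file and the O_WRONLY access flag are assumptions made for illustration; the patch above only adds the third argument to the existing call.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not part of Open MPI): create an empty marker
 * file at `path`, similar in spirit to the abort file in the patch.
 * Returns 0 on success and -1 on failure. */
int touch_marker_file(const char *path, mode_t mode)
{
    int fd;

    assert(path != NULL);

    /* With O_CREAT, open() reads a third argument holding the
     * permission bits. If it is left out, open() reads a mode that was
     * never passed, which is what the fortified glibc header
     * (bits/fcntl2.h) rejects at compile time. */
    fd = open(path, O_WRONLY | O_CREAT, mode);
    if (fd < 0) {
        return -1;
    }

    /* Descriptor 0 is a valid result, so success is fd >= 0. */
    return close(fd);
}
```

A caller would use it as touch_marker_file(abort_file, 0666), where abort_file is the path variable from the diff above.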



[OMPI devel] Print warning in v1.2 series if THREAD_MULTIPLE or progress threads are used

2007-12-10 Thread Jeff Squyres
Per my prior mail, I just filed a patch on CMR #1198 to print a big  
warning on MPI_COMM_WORLD rank 0 if MPI_THREAD_MULTIPLE and/or  
progress threads are used.  We know that this functionality basically  
doesn't work, so instead of getting lots more mail to the users list,  
let's at least put a disclaimer message at run time saying that it  
doesn't work.


Anyone have any objections to putting this in v1.2.5?

https://svn.open-mpi.org/trac/ompi/ticket/1198

--
Jeff Squyres
Cisco Systems


[OMPI devel] [PATCH] openib btl: correct help message error

2007-12-10 Thread Jon Mason
Slight word usage and grammar error in the openib btl help text.  I
believe the change below is the intended meaning.

Thanks,
Jon

Index: ompi/mca/btl/openib/help-mpi-btl-openib.txt
===================================================================
--- ompi/mca/btl/openib/help-mpi-btl-openib.txt (revision 16892)
+++ ompi/mca/btl/openib/help-mpi-btl-openib.txt (working copy)
@@ -164,7 +164,7 @@
   See the InfiniBand spec 1.2 (section 12.7.34) for more details.
 #
 [no active ports found]
-WARNING: There is at least on IB HCA found on host '%s', but there is
+WARNING: There is at least one IB HCA found on host '%s', but there are
 no active ports detected. This is most certainly not what you wanted.
 Check your cables and SM configuration.
 #


Re: [OMPI devel] [PATCH] openib btl: correct help message error

2007-12-10 Thread Jeff Squyres

Cool; thanks.  Go ahead and commit.

BTW, we work a bit differently here in OMPI as compared to the  
OpenFabrics community -- you don't need to mail all patches to the  
list before committing (especially for trivial fixes like this :-) ).




___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] [PATCH] openib btl: correct help message error

2007-12-10 Thread Jon Mason
On Mon, Dec 10, 2007 at 04:14:57PM -0500, Jeff Squyres wrote:
> Cool; thanks.  Go ahead and commit.

Will Do.

> 
> BTW, we work a bit differently here in OMPI as compared to the  
> OpenFabrics community -- you don't need to mail all patches to the  
> list before committing (especially for trivial fixes like this :-) ).

Sorry, I was just trying to err on the side of caution and openness.  Do
you have a rule of thumb for what should be sent on the list versus
simply committed?

Thanks,
Jon



Re: [OMPI devel] [PATCH] openib btl: correct help message error

2007-12-10 Thread Jeff Squyres

On Dec 10, 2007, at 4:49 PM, Jon Mason wrote:


> > BTW, we work a bit differently here in OMPI as compared to the
> > OpenFabrics community -- you don't need to mail all patches to the
> > list before committing (especially for trivial fixes like this :-) ).
>
> Sorry, I was just trying to err on the side of caution and openness.  Do
> you have a rule of thumb for what should be sent on the list versus
> simply committed?


"Big" things?  Or when you're unsure if what you're doing will work,  
etc.  In those cases, if you need to go off and do some temporary  
development for a while, use /tmp or /tmp-public (SVN is a little  
different than git).  When you're ready to bring it back to the trunk,  
if it's a Big Change, ask others to test it, etc.


--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] openmpi-1.2.4 compilation error in orte_abort.c on Fedora 8 - patch included

2007-12-10 Thread Jeff Squyres

Yo Ralph --

I see you committed this to the ORTE-future branch.  Any objections to  
me committing to trunk/v1.2?


(Thanks Sebastian -- stupid Fedora! ;-) )





--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] openmpi-1.2.4 compilation error in orte_abort.c on Fedora 8 - patch included

2007-12-10 Thread Ralph Castain
Nah, go ahead! Just change the permission to 0660 - that's a private file
that others shouldn't really perturb.

Ralph
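
As a rough illustration of what the 0660 suggestion means in practice: the mode passed to open() is only an upper bound, because the kernel clears any bits set in the process umask. The sketch below is not taken from the Open MPI tree; the helper name create_private_file and the use of O_EXCL are assumptions made for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create `path` with at most rw-rw---- permissions
 * and report the permission bits the file actually received.
 * Returns 0 on success and -1 on failure. */
int create_private_file(const char *path, mode_t *actual_mode)
{
    struct stat st;
    int fd;

    assert(path != NULL && actual_mode != NULL);

    /* O_EXCL makes the call fail if the file already exists, so the
     * mode argument always applies to a newly created file. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0660);
    if (fd < 0) {
        return -1;
    }

    /* The requested mode is filtered by the umask; fstat() shows
     * which permission bits were really stored. */
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    *actual_mode = st.st_mode & 0777;

    return close(fd);
}
```

With 0660, no permission bits for "other" users can ever be set on the new file, whatever the umask is; with 0666 that depends entirely on the umask of the process.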







[OMPI devel] Dynamically Turning On and Off Memory Manager of Open MPI at Runtime??

2007-12-10 Thread Peter Wong

Open MPI defines its own malloc (by default), so malloc of glibc
is not called.

But, without calling malloc of glibc, the allocator of libhugetlbfs
to back text and dynamic data by large pages, e.g., 16MB pages on
POWER systems, is not used.

Indeed, we can build Open MPI with --with-memory-manager=none.

I am wondering about the feasibility of turning the memory manager on
and off dynamically at runtime as a new feature?

Thanks,
Peter Wong

Re: [OMPI devel] Dynamically Turning On and Off Memory Manager of Open MPI at Runtime??

2007-12-10 Thread Brian W. Barrett

On Mon, 10 Dec 2007, Peter Wong wrote:


> Open MPI defines its own malloc (by default), so malloc of glibc
> is not called.
>
> But, without calling malloc of glibc, the allocator of libhugetlbfs
> to back text and dynamic data by large pages, e.g., 16MB pages on
> POWER systems, is not used.
>
> Indeed, we can build Open MPI with --with-memory-manager=none.
>
> I am wondering about the feasibility of turning the memory manager on
> and off dynamically at runtime as a new feature?


Hi Peter -

The problem is that we actually intercept the malloc() call, so once we've 
done that (which is a link-time thing), it's too late to use the 
underlying malloc to actually do its thing.


I was going to add some code to Open MPI to make it an application link 
time choice (rather than an OMPI-build time choice), but unfortunately 
my current day to day work is not on Open MPI, so unless someone else 
picks it up, it's unlikely this will get implemented in the near future. 
Of course, if someone has the time and desire, I can describe to them what 
I was thinking.


The only way I've found to do memory tracking at run-time is to use 
LD_PRELOAD tricks, which I believe had some other (easy to 
overcome) problems.


What would be really nice (although unlikely to occur) is if there was a 
thread-safe way to hook into the memory manager directly (rather than 
playing linking tricks).  GLIBC's malloc provides hooks, but they aren't 
thread safe (as in two user threads calling malloc at the same time would 
result in badness).  Darwin/Mac OS X provides thread-safe hooks that work 
very well (don't require linker tricks and can be turned off at run-time), 
but are slightly higher level than what we want -- there we can intercept 
malloc/free, but what we'd really like to know is when memory is being 
given back to the operating system.


Hope this helps,

Brian


Re: [OMPI devel] Dynamically Turning On and Off Memory Manager of Open MPI at Runtime??

2007-12-10 Thread Patrick Geoffray

Hi Peter,

Peter Wong wrote:

> Open MPI defines its own malloc (by default), so malloc of glibc
> is not called.
>
> But, without calling malloc of glibc, the allocator of libhugetlbfs
> to back text and dynamic data by large pages, e.g., 16MB pages on
> POWER systems, is not used.


You could modify ptmalloc2 in OpenMPI to allocate Huge Pages directly. 
It would be a nice feature.


Patrick
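
One way to picture "allocating huge pages directly" is an anonymous mmap() that asks for huge pages and falls back to normal pages when that fails. This is only a sketch under assumptions: it relies on the Linux-specific MAP_HUGETLB flag, which is not mentioned in the thread and is not how ptmalloc2 in Open MPI works. The function name alloc_region is hypothetical, and the caller is assumed to pass a length that is a multiple of the huge page size when huge pages are wanted.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of anonymous memory, preferring
 * huge pages when the platform defines MAP_HUGETLB. Sets *used_huge to
 * 1 if the huge-page mapping succeeded and to 0 otherwise.
 * Returns NULL if no mapping could be made. */
void *alloc_region(size_t len, int *used_huge)
{
    void *p = MAP_FAILED;

    assert(len > 0 && used_huge != NULL);
    *used_huge = 0;

#ifdef MAP_HUGETLB
    /* This request can fail, for example when the system has no huge
     * pages reserved; in that case we fall through to normal pages. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p != MAP_FAILED) {
        *used_huge = 1;
        return p;
    }
#endif

    /* Fall back to ordinary pages. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return (p == MAP_FAILED) ? NULL : p;
}
```

An allocator built on this would still need its own bookkeeping to carve the region into smaller blocks and to release it with munmap(); the sketch covers only the step of obtaining the backing memory.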