Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 02:11:02AM +0200, Uwe Hermann wrote:

> > | The 1.2.3 release also works fine:
> I think Adrian used a tarball, not the Debian package?
> I'll try a local, manual install too, maybe the bug is Debian-related only?

I've tried both: the tarball works fine, the Debian package
segfaults. I suspect it's the threading support, so someone (Uwe?) could
try to remove it from debian/rules.

Ok, I'll check this for amd64, but it will take some time to compile in
the qemu ;)




-- 
mail: a...@thur.de  http://adi.thur.de  PGP/GPG: key via keyserver

Die Stosstange ist aller Laster Anfang.


Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Jeff Squyres

On Aug 16, 2007, at 8:11 PM, Uwe Hermann wrote:

> With the libc0.1 fix (and another small patch for Debian which I'll
> send soon), both the kfreebsd-i386 and kfreebsd-amd64 packages build fine.
>
> However, on my systems, both i386 and amd64 still segfault. I'm using
> the openmpi Debian packages, version 1.2.3-3.
>
> I'll try the stock tarballs soon, and/or wait for 1.2.4 to see if the
> bug is already fixed there...


FWIW, if you've got the cycles, try a 1.2 branch nightly tarball  
(i.e., they're what will eventually become 1.2.4):


http://www.open-mpi.org/nightly/v1.2/

That way, if there's still a problem, we potentially still have [a  
little] time to fix it before 1.2.4.


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn] svn:open-mpi r15881

2007-08-17 Thread Jeff Squyres

On Aug 16, 2007, at 1:13 PM, Tim Prins wrote:


>> So you're both right.  :-)  But Tim's falling back on an older (and
>> unfortunately bad) precedent.  It would be nice to not extend that
>> bad precedent, IMHO...
>
> I really don't care where the constants are defined, but they do need to
> be unique. I think it is easiest if all the constants are stored in one
> file, but if someone else wants to chop them up, that's fine with me. We
> would just have to be more careful when adding new constants to check
> both files.

Ya, IIRC, this is a definite problem that we had: it's at the core of
the "component" abstraction (a component should be wholly self-contained
and not have any component-specific definitions outside of itself), but
these tags are a central resource that needs to be allocated in a
distributed fashion.

That's why I think it was decided to simply leave them as they were,
and/or use the (DYNAMIC-x) form.  I don't have any better suggestion;
I'm just providing rationale for the reason it was the way it was...

>> True.  We will need a robust tag reservation system, though, to
>> guarantee that every process gets the same tag values (e.g., if udapl
>> is available on some nodes but not others, will that cause openib to
>> have different values on different nodes?  And so on).
>
> Not really. All that is needed is a list of constants (similar to the
> one in rml_types.h).

I was assuming a dynamic/run-time tag assignment (which is obviously
problematic for the reason I cited, and others).  But static is also
problematic for the breaking-abstraction reasons.  Stalemate.

> If an RSL component doesn't like the particular constant tag values,
> it can do whatever it wants in its implementation, as long as a message
> sent on a tag is received on the same tag.

Sure.

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 09:25:05AM +0200, Adrian Knoth wrote:

> I've tried both: the tarball works fine, the Debian package
> segfaults. I suspect it's the threading support, so someone (Uwe?) could
> try to remove it from debian/rules.

Ok, --enable-progress-threads and --enable-mpi-threads cause the
segfaults. If you compile without, everything works.

I'll now try if it's mpi-threads or the progress-threads, and also check
the upcoming v1.2.4.


How does Debian feel about disabling threads on kFreeBSD? Are there
known issues with pthreads on kFreeBSD?

-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


Re: [OMPI devel] Public tmp branches

2007-08-17 Thread Jeff Squyres
I didn't really put this in RFC format with a timeout, but no one  
objected, so I have created:


http://svn.open-mpi.org/svn/ompi/public

Developers should feel free to use this tree for public temporary  
branches.  Specifically:


- use /tmp if your branch is intended to be private
- use /public if your branch is intended to be public

Enjoy.


On Aug 10, 2007, at 9:50 AM, Jeff Squyres wrote:


Right now all branches under /tmp are private to the OMPI core group
(e.g., to allow unpublished academic work).  However, there are
definitely cases where it would be useful to allow public branches
when there's development work that is public but not yet ready for
the trunk.  Periodically, we go and assign individual permissions to
/tmp branches (like I just did to /tmp/vt-integration), but it would
be easier if we had a separate tree for public "tmp" branches.

Would anyone have an objection if I added /public (or any better name
that someone can think of) for tmp-style branches, but that are open
for read-only access to the public?

--
Jeff Squyres
Cisco Systems




--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn] svn:open-mpi r15881

2007-08-17 Thread Sven Stork
On Friday 17 August 2007 13:58, Jeff Squyres wrote:
> On Aug 16, 2007, at 1:13 PM, Tim Prins wrote:
> 
> >> So you're both right.  :-)  But Tim's falling back on an older (and
> >> unfortunately bad) precedent.  It would be nice to not extend that
> >> bad precedent, IMHO...
> >
> > I really don't care where the constants are defined, but they do  
> > need to
> > be unique. I think it is easiest if all the constants are stored in  
> > one
> > file, but if someone else wants to chop them up, that's fine with  
> > me. We
> > would just have to be more careful when adding new constants to check
> > both files.
> 
> Ya, IIRC, this is a definite problem that we had: it's at the core of  
> the "component" abstraction (a component should be wholly self- 
> contained and not have any component-specific definitions outside of  
> itself), but these tags are a central resource that need to be  
> allocated in a distributed fashion.
> 
> That's why I think it was decided to simply leave them as they were,  
> and/or use the (DYNAMIC-x) form.  I don't have any better suggestion;  
> I'm just providing rationale for the reason it was the way it was...
> 
> >> True.  We will need a robust tag reservation system, though, to
> >> guarantee that every process gets the same tag values (e.g., if udapl
> >> is available on some nodes but not others, will that cause openib to
> >> have different values on different nodes?  And so on).
> > Not really. All that is needed is a list of constants (similar to the
> > one in rml_types.h).
> 
> I was assuming a dynamic/run-time tag assignment (which is obviously  
> problematic for the reason I cited, and others).  But static is also  
> problematic for the breaking-abstraction reasons.  Stalemate.

What about this: every component chooses its own tag independently of the
others. Before a component can use the tag, it must register its full name
and the tag with a small (process-internal) database. If two components try
to register the same tag, we emit a warning, terminate the processes, ... .

If two components (CompA and CompB) want to register the same tag, and we
assume that process A loads _only_ CompA while process B loads _only_ CompB,
then both components will be loaded without any error. I assume it's rather
unusual for CompA to send a message to process B, as there is no counterpart
component there. But there is still some probability of it happening.

For more safety (and of course less performance) we could:
- add a parameter that causes this tag database to sync across all processes;
- add a parameter that turns on a check, for every send/receive, of whether
  the specified tag has been registered or not.
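
As a purely illustrative sketch of such a process-local registry (the names
tag_reg_t and tag_registry_register are made up here and are not an existing
Open MPI API):

/* Hypothetical sketch only -- not Open MPI code.  A tiny process-local
 * registry: a component registers (tag, full component name) and gets an
 * error back if another component already owns that tag. */
#include <stdio.h>
#include <string.h>

#define MAX_TAGS 128

typedef struct {
    int  tag;
    char owner[64];          /* full name of the registering component */
} tag_reg_t;

static tag_reg_t registry[MAX_TAGS];
static int registry_size = 0;

/* Return 0 on success, -1 if the tag is owned by a different component
 * (the caller can then warn, abort the process, etc.). */
int tag_registry_register(int tag, const char *component_name)
{
    for (int i = 0; i < registry_size; ++i) {
        if (registry[i].tag == tag) {
            if (0 == strcmp(registry[i].owner, component_name)) {
                return 0;    /* same component re-registering: fine */
            }
            fprintf(stderr, "tag %d already registered by %s (wanted by %s)\n",
                    tag, registry[i].owner, component_name);
            return -1;
        }
    }
    if (registry_size >= MAX_TAGS) {
        return -1;           /* registry full */
    }
    registry[registry_size].tag = tag;
    snprintf(registry[registry_size].owner,
             sizeof(registry[registry_size].owner), "%s", component_name);
    ++registry_size;
    return 0;
}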

Just my 0.02 $
   Sven

> > If an RSL component doesn't like the particular
> > constant tag values, it can do whatever it wants in its
> > implementation, as long as a message sent on a tag is received on the
> > same tag.
> 
> Sure.
> 
> -- 
> Jeff Squyres
> Cisco Systems
> 


Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Jeff Squyres

On Aug 17, 2007, at 8:10 AM, Adrian Knoth wrote:


> Ok, --enable-progress-threads and --enable-mpi-threads cause the
> segfaults. If you compile without, everything works.
>
> I'll now try if it's mpi-threads or the progress-threads, and also check
> the upcoming v1.2.4.


The --enable-progress-threads and --enable-mpi-threads configure  
options result in broken-ness on the v1.2 branch; you should not use  
them.  There is ongoing development work in the trunk to fix the code  
associated with these options.  The current goal is to have them  
working for the v1.3 release.


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [OMPI svn] svn:open-mpi r15881

2007-08-17 Thread Jeff Squyres

On Aug 17, 2007, at 8:22 AM, Sven Stork wrote:

> What about this: every component chooses its own tag independently of the
> others. Before a component can use the tag, it must register its full name
> and the tag with a small (process-internal) database. If two components try
> to register the same tag, we emit a warning, terminate the processes, ... .


My knee-jerk reaction to this is: no!  How could we ship code that  
might abort?!


But upon further reflection, I'm guessing that you assume that we  
would catch such tag conflicts during QA testing and therefore only  
ship components that use distinct tags.  That might be tolerable.


However, it does raise another place where we would have to have  
central coordination between all MPI processes -- something we've  
actively been trying to shed for scalability reasons...


> If two components (CompA and CompB) want to register the same tag, and we
> assume that process A loads _only_ CompA while process B loads _only_ CompB,
> then both components will be loaded without any error. I assume it's rather
> unusual for CompA to send a message to process B, as there is no counterpart
> component there. But there is still some probability of it happening.


*Assumedly* we would never ship components that use the same tag (per  
above), but it doesn't address the possibility of 3rd party  
components, etc.



> For more safety (and of course less performance) we could:
> - add a parameter that causes this tag database to sync across all processes;
> - add a parameter that turns on a check, for every send/receive, of whether
>   the specified tag has been registered or not.


Another thought (that was long-ago discarded) would be to use string  
tags.  If you follow the prefix rule, it's easy to guarantee that  
there won't be conflicts.  But:


a) this is the moral equivalent of the modex -- which currently  
utilizes the one-time blast-o-gram from the HNP during MPI_INIT to do  
all the transport


b) to use this for regular RML/OOB/RSL/whatever communication in the  
MPI layer would be rather expensive (which is why it was discarded)
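
To make the prefix-rule idea concrete, a trivial illustration (the tag
strings below are hypothetical, not actual OMPI names):

/* With a prefix rule, each component builds its string tags from its own
 * framework/component name, so tag uniqueness follows from component-name
 * uniqueness -- no central registry needed. */
const char *openib_ctl_tag = "btl.openib.control";   /* hypothetical */
const char *udapl_ctl_tag  = "btl.udapl.control";    /* hypothetical */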


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer

2007-08-17 Thread Jeff Squyres
I am definitely interested to see what the RSL turns out to be; I  
think it has many potential benefits.  There are also some obvious  
issues to be worked out (e.g., mpirun and friends).


As for whether this should go in v1.3, I don't know if it's possible  
to say yet -- it will depend on when RSL becomes [at least close to]  
ready, what the exact schedule for v1.3 is (which we've been skittish  
to define, since we're going for a feature-driven release), etc.



On Aug 16, 2007, at 9:47 PM, Tim Prins wrote:


WHAT: Solicitation of feedback on the possibility of adding a runtime
services layer to Open MPI to abstract out the runtime.

WHY: To solidify the interface between OMPI and the runtime  
environment,

and to allow the use of different runtime systems, including different
versions of ORTE.

WHERE: Addition of a new framework to OMPI, and changes to many of the
files in OMPI to funnel all runtime requests through this framework.
Few changes should be required in OPAL and ORTE.

WHEN: Development has started in tmp/rsl, but is still in its infancy.
We hope to have a working system in the next month.

TIMEOUT: 8/29/07

--
Short version:

I am working on creating an interface between OMPI and the runtime system.
This would create an RSL framework in OMPI from which all runtime services
would be accessed. Attached is a graphic depicting this.

This change would be invasive to the OMPI layer. Few (if any) changes
will be required of the ORTE and OPAL layers.

At this point I am soliciting feedback as to whether people are
supportive or not of this change both in general and for v1.3.


Long version:

The current model used in Open MPI assumes that one runtime system is
the best for all environments. However, in many environments it may be
beneficial to have specialized runtime systems. With our current system
this is not easy to do.

With this in mind, the idea of creating a 'runtime services layer' was
hatched. This would take the form of a framework within OMPI, through which
all runtime functionality would be accessed. This would allow new or
different runtime systems to be used with Open MPI. Additionally, with such
a system it would be possible to have multiple versions of open rte
coexisting, which may facilitate development and testing. Finally, this
would solidify the interface between OMPI and the runtime system, as well
as provide documentation of each interface function and its side effects.

However, such a change would be fairly invasive to the OMPI layer, and
needs a buy-in from everyone for it to be possible.

Here is a summary of the changes required for the RSL (at least how it is
currently envisioned):

1. Add a framework to ompi for the rsl, and a component to support orte.
2. Change ompi so that it uses the new interface. This involves:
   a. Moving runtime-specific code into the orte rsl component.
   b. Changing the process names in ompi to an opaque object.
   c. Changing all references to orte in ompi to be to the rsl.
3. Change the configuration code so that open-rte is only linked where needed.


Of course, all this would happen on a tmp branch.
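
As a purely illustrative sketch (not the actual tmp/rsl code, which is still
in flux), such a framework module could follow the usual MCA pattern of a
struct of function pointers; all names below are hypothetical:

/* Hypothetical sketch only -- not the real tmp/rsl interface.  It just
 * illustrates the idea of OMPI calling the runtime through a table of
 * function pointers instead of calling ORTE directly. */
#include <stddef.h>

typedef struct ompi_rsl_process_name_t ompi_rsl_process_name_t;  /* opaque */

typedef struct {
    int (*init)(void);
    int (*finalize)(void);

    /* tag-based runtime messaging, RML-style */
    int (*send)(const ompi_rsl_process_name_t *peer, int tag,
                const void *buf, size_t len);
    int (*recv)(const ompi_rsl_process_name_t *peer, int tag,
                void *buf, size_t len);

    /* process-name handling, so OMPI never sees orte_process_name_t */
    int (*name_compare)(const ompi_rsl_process_name_t *a,
                        const ompi_rsl_process_name_t *b);
} ompi_rsl_module_t;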

The design of the rsl is not solidified. I have been playing in a tmp branch
(located at https://svn.open-mpi.org/svn/ompi/tmp/rsl) which everyone is
welcome to look at and comment on, but be advised that things here are
subject to change (I don't think it even compiles right now). There are
some fairly large open questions on this, including:

1. How to handle mpirun (that is, when a user types 'mpirun', do they
   always get ORTE, or do they sometimes get a system-specific runtime)?
   Most likely mpirun will always use ORTE, and alternative launching
   programs would be used for other runtimes.
2. Whether there will be any performance implications. My guess is not,
   but I am not quite sure of this yet.

Again, I am interested in people's comments on whether they think adding
such an abstraction is good or not, and whether it is reasonable to do
such a thing for v1.3.

Thanks,

Tim Prins




--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 08:26:50AM -0400, Jeff Squyres wrote:

> > Ok, --enable-progress-threads and --enable-mpi-threads cause the
> > segfaults. If you compile without, everything works.
> 
> > I'll now try if it's mpi-threads or the progress-threads, and also  
> > check
> > the upcoming v1.2.4.
> The --enable-progress-threads and --enable-mpi-threads configure  
> options result in broken-ness on the v1.2 branch; you should not use  
> them.

That's why I wondered why Debian has enabled them.

Dirk: Do you mind removing them from debian/rules, thus fixing the
issue?



-- 
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany

private: http://adi.thur.de


Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Manuel Prinz
On Friday, 2007-08-17 at 14:49 +0200, Adrian Knoth wrote:
> On Fri, Aug 17, 2007 at 08:26:50AM -0400, Jeff Squyres wrote:
> 
> > > Ok, --enable-progress-threads and --enable-mpi-threads cause the
> > > segfaults. If you compile without, everything works.
> > 
> > > I'll now try if it's mpi-threads or the progress-threads, and also  
> > > check
> > > the upcoming v1.2.4.
> > The --enable-progress-threads and --enable-mpi-threads configure  
> > options result in broken-ness on the v1.2 branch; you should not use  
> > them.
> 
> That's why I wondered why Debian has enabled them.

We enabled it because it was requested (http://bugs.debian.org/419867).

> Dirk: Do you mind removing them from debian/rules, thus fixing the
> issue?

I personally think it's best to disable it for now and document that in
README.Debian. We can enable it again as soon as it works correctly.

Jeff, do you know for which architectures it's (not) working? I haven't
experienced problems so far, or at least didn't notice them.

Best regards
Manuel



Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Jeff Squyres

On Aug 17, 2007, at 8:57 AM, Manuel Prinz wrote:

> Jeff, do you know for which architectures it's (not) working? I haven't
> experienced problems so far, or at least didn't notice them.


I don't think those options are safe on any architecture.  We're  
working on the trunk to make them actually function properly; we  
decided to give up on the 1.2 branch and focus our efforts on the  
v1.3 series (where "we" doesn't actively include me -- others are  
doing the threaded work).


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Manuel Prinz
On Friday, 2007-08-17 at 09:02 -0400, Jeff Squyres wrote:
> I don't think those options are safe on any architecture.

I'll disable them in debian/rules then and document it.

Dirk, are you fine with that?

Best regards
Manuel



Re: [OMPI devel] Public tmp branches

2007-08-17 Thread Brian Barrett
ugh, sorry, I've been busy this week and didn't see a timeout, so a  
response got delayed.


I really don't like this format.  public doesn't have any meaning to
it (tmp suggests, well, it's temporary).  I'd rather have /tmp/ and
/tmp/private or something like that.  Or /tmp/ and /tmp/public/.
Either way :/.


Brian


On Aug 17, 2007, at 6:21 AM, Jeff Squyres wrote:


> I didn't really put this in RFC format with a timeout, but no one
> objected, so I have created:
>
>   http://svn.open-mpi.org/svn/ompi/public
>
> Developers should feel free to use this tree for public temporary
> branches.  Specifically:
>
> - use /tmp if your branch is intended to be private
> - use /public if your branch is intended to be public
>
> Enjoy.
>
>
> On Aug 10, 2007, at 9:50 AM, Jeff Squyres wrote:
>
>> Right now all branches under /tmp are private to the OMPI core group
>> (e.g., to allow unpublished academic work).  However, there are
>> definitely cases where it would be useful to allow public branches
>> when there's development work that is public but not yet ready for
>> the trunk.  Periodically, we go and assign individual permissions to
>> /tmp branches (like I just did to /tmp/vt-integration), but it would
>> be easier if we had a separate tree for public "tmp" branches.
>>
>> Would anyone have an objection if I added /public (or any better name
>> that someone can think of) for tmp-style branches, but that are open
>> for read-only access to the public?
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>
> --
> Jeff Squyres
> Cisco Systems




Re: [OMPI devel] [OMPI svn] svn:open-mpi r15903

2007-08-17 Thread George Bosilca
This patch breaks the trunk. It looks like LT_PACKAGE_VERSION
wasn't defined before the 2.x versions of Libtool. autogen fails with the
following error:


*** Running GNU tools
[Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh
[Running] aclocal
configure.ac:998: error: m4_defn: undefined macro: LT_PACKAGE_VERSION
configure.ac:998: the top level
autom4te: /usr/bin/m4 failed with exit status: 1
aclocal: autom4te failed with exit status: 1

  george.

On Aug 17, 2007, at 12:08 AM, brbar...@osl.iu.edu wrote:


Author: brbarret
Date: 2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)
New Revision: 15903
URL: https://svn.open-mpi.org/trac/ompi/changeset/15903

Log:
Support versions of the Libtool 2.1a snapshots after the lt_dladvise code
was brought in.  This supersedes the GLOBL patch that we had been using
with Libtool 2.1a versions prior to the lt_dladvise code.  Autogen tries
to figure out which version you're on, so either will now work with the
trunk.

Text files modified:
   trunk/configure.ac                                   |  18 ++++++++++++++--
   trunk/opal/mca/base/mca_base_component_find.c        |   8 ++++++++
   trunk/opal/mca/base/mca_base_component_repository.c  |  24 ++++++++++++++++
   3 files changed, 48 insertions(+), 2 deletions(-)

Modified: trunk/configure.ac
==============================================================================
--- trunk/configure.ac  (original)
+++ trunk/configure.ac  2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)
@@ -995,10 +995,15 @@

 ompi_show_subtitle "Libtool configuration"

+m4_if(m4_version_compare(m4_defn([LT_PACKAGE_VERSION]), 2.0), -1, [
 AC_LIBLTDL_CONVENIENCE(opal/libltdl)
 AC_LIBTOOL_DLOPEN
 AC_PROG_LIBTOOL
-
+], [
+LT_CONFIG_LTDL_DIR([opal/libltdl], [subproject])
+LTDL_CONVENIENCE
+LT_INIT([dlopen win32-dll])
+])
 ompi_show_subtitle "GNU libltdl setup"

 # AC_CONFIG_SUBDIRS appears to be broken for non-gcc compilers (i.e.,
@@ -1038,6 +1043,13 @@
 if test "$HAPPY" = "1"; then
 LIBLTDL_SUBDIR=libltdl

+CPPFLAGS_save="$CPPFLAGS"
+CPPFLAGS="-I."
+AC_EGREP_HEADER([lt_dladvise_init], [opal/libltdl/ltdl.h],
+[OPAL_HAVE_LTDL_ADVISE=1],
+[OPAL_HAVE_LTDL_ADVISE=0])
+CPPFLAGS="$CPPFLAGS"
+
 # Arrgh.  This is gross.  But I can't think of any other way to do
 # it.  :-(

@@ -1057,7 +1069,7 @@
 AC_MSG_WARN([libltdl support disabled (by --disable-dlopen)])

 LIBLTDL_SUBDIR=
-LIBLTDL=
+OPAL_HAVE_LTDL_ADVISE=0

 # append instead of prepend, since LIBS are going to be system
 # type things needed by everyone.  Normally, libltdl will push
@@ -1073,6 +1085,8 @@
 AC_DEFINE_UNQUOTED(OMPI_WANT_LIBLTDL, $OMPI_ENABLE_DLOPEN_SUPPORT,
 [Whether to include support for libltdl or not])

+AC_DEFINE_UNQUOTED(OPAL_HAVE_LTDL_ADVISE, $OPAL_HAVE_LTDL_ADVISE,
+[Whether libltdl appears to have the lt_dladvise interface])

 ##
 # visibility

Modified: trunk/opal/mca/base/mca_base_component_find.c
==============================================================================

--- trunk/opal/mca/base/mca_base_component_find.c   (original)
+++ trunk/opal/mca/base/mca_base_component_find.c	2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)

@@ -75,6 +75,10 @@
   char name[MCA_BASE_MAX_COMPONENT_NAME_LEN];
 };
 typedef struct ltfn_data_holder_t ltfn_data_holder_t;
+
+#if OPAL_HAVE_LTDL_ADVISE
+extern lt_dladvise opal_mca_dladvise;
+#endif
 #endif /* OMPI_WANT_LIBLTDL */


@@ -387,7 +391,11 @@

   /* Now try to load the component */

+#if OPAL_HAVE_LTDL_ADVISE
+  component_handle = lt_dlopenadvise(target_file->filename, opal_mca_dladvise);
+#else
   component_handle = lt_dlopenext(target_file->filename);
+#endif
   if (NULL == component_handle) {
 err = strdup(lt_dlerror());
 if (0 != show_errors) {

Modified: trunk/opal/mca/base/mca_base_component_repository.c
==============================================================================
--- trunk/opal/mca/base/mca_base_component_repository.c (original)
+++ trunk/opal/mca/base/mca_base_component_repository.c	2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)

@@ -85,6 +85,10 @@
 static repository_item_t *find_component(const char *type, const char *name);
 static int link_items(repository_item_t *src, repository_item_t *depend);


+#if OPAL_HAVE_LTDL_ADVISE
+lt_dladvise opal_mca_dladvise;
+#endif
+
 #endif /* OMPI_WANT_LIBLTDL */


@@ -103,6 +107,20 @@
   return OPAL_ERR_OUT_OF_RESOURCE;
 }

+#if OPAL_HAVE_LTDL_ADVISE
+if (lt_dladvise_init(&opal_mca_dladvise)) {
+return OPAL_ERR_OUT_OF_RESOURCE;
+}
+
+if (lt_dladvise_ext(&opal_mca_dladvise)) {
+return OPAL_ERROR;
+}
+
+if (lt_dladvise_global(&opal_mca_dladvise)) {
+return OPAL_ERROR;
+}
+#endif
+
 OBJ_CONSTRUCT(&repository, opal_list_t);
 #endif
 initialized = true;
@@ -255,6 +273,12 @@

Re: [OMPI devel] [OMPI svn] svn:open-mpi r15903

2007-08-17 Thread Brian Barrett

Oh, crud.  I forgot to fix that issue.  Will fix asap.

Brian

On Aug 17, 2007, at 10:12 AM, George Bosilca wrote:


> This patch breaks the trunk. It looks like LT_PACKAGE_VERSION
> wasn't defined before the 2.x versions of Libtool. autogen fails with the
> following error:
>
> *** Running GNU tools
> [Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh
> [Running] aclocal
> configure.ac:998: error: m4_defn: undefined macro: LT_PACKAGE_VERSION
> configure.ac:998: the top level
> autom4te: /usr/bin/m4 failed with exit status: 1
> aclocal: autom4te failed with exit status: 1
>
>    george.


Re: [OMPI devel] [OMPI svn] svn:open-mpi r15903

2007-08-17 Thread Brian Barrett
Fixed.  Sorry about the configure change mid-day, but it seemed like  
the right thing to do.


Brian


On Aug 17, 2007, at 10:37 AM, Brian Barrett wrote:


> Oh, crud.  I forgot to fix that issue.  Will fix asap.
>
> Brian
>
> On Aug 17, 2007, at 10:12 AM, George Bosilca wrote:
>
>> This patch breaks the trunk. It looks like LT_PACKAGE_VERSION
>> wasn't defined before the 2.x versions of Libtool. autogen fails with the
>> following error:
>>
>> *** Running GNU tools
>> [Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh
>> [Running] aclocal
>> configure.ac:998: error: m4_defn: undefined macro: LT_PACKAGE_VERSION
>> configure.ac:998: the top level
>> autom4te: /usr/bin/m4 failed with exit status: 1
>> aclocal: autom4te failed with exit status: 1
>>
>>    george.


Re: [OMPI devel] Public tmp branches

2007-08-17 Thread Jeff Squyres (jsquyres)
I thought about both of those (/tmp/private and /tmp/public), but couldn't 
think of a way to make them work.

1. If we do /tmp/private, we have to svn mv all the existing trees there,
which will annoy the developers (but is not a deal-breaker), and make /tmp
publicly readable.  But that makes the history of all the private branches
public.

2. If we do /tmp/public, I'm not quite sure how to set up the perms in SH to
do that - if we set up /tmp to be 'no read access' for * and /tmp/public to
have 'read access' for *, will a non-authenticated user be able to reach
/tmp/private?

-jms

 -Original Message-
From:   Brian Barrett [mailto:bbarr...@lanl.gov]
Sent:   Friday, August 17, 2007 11:51 AM Eastern Standard Time
To: Open MPI Developers
Subject:    Re: [OMPI devel] Public tmp branches

ugh, sorry, I've been busy this week and didn't see a timeout, so a  
response got delayed.

I really don't like this format.  public doesn't have any meaning to
it (tmp suggests, well, it's temporary).  I'd rather have /tmp/ and
/tmp/private or something like that.  Or /tmp/ and /tmp/public/.
Either way :/.

Brian


On Aug 17, 2007, at 6:21 AM, Jeff Squyres wrote:

> I didn't really put this in RFC format with a timeout, but no one
> objected, so I have created:
>
>   http://svn.open-mpi.org/svn/ompi/public
>
> Developers should feel free to use this tree for public temporary
> branches.  Specifically:
>
> - use /tmp if your branch is intended to be private
> - use /public if your branch is intended to be public
>
> Enjoy.
>
>
> On Aug 10, 2007, at 9:50 AM, Jeff Squyres wrote:
>
>> Right now all branches under /tmp are private to the OMPI core group
>> (e.g., to allow unpublished academic work).  However, there are
>> definitely cases where it would be useful to allow public branches
>> when there's development work that is public but not yet ready for
>> the trunk.  Periodically, we go and assign individual permissions to
>> /tmp branches (like I just did to /tmp/vt-integration), but it would
>> be easier if we had a separate tree for public "tmp" branches.
>>
>> Would anyone have an objection if I added /public (or any better name
>> that someone can think of) for tmp-style branches, but that are open
>> for read-only access to the public?
>>
>> -- 
>> Jeff Squyres
>> Cisco Systems
>>
>
>
> -- 
> Jeff Squyres
> Cisco Systems
>



Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Dirk Eddelbuettel
On Fri, Aug 17, 2007 at 03:08:12PM +0200, Manuel Prinz wrote:
> Am Freitag, den 17.08.2007, 09:02 -0400 schrieb Jeff Squyres:
> > I don't think those options are safe on any architecture.
> 
> I'll disable them in debian/rules then and document it.
> 
> Dirk, are you fine with that?

Sure thing. We simply didn't know about the brokenness re threads in
1.2.

Dirk, on vacation

> 
> Best regards
> Manuel
> 
> 

-- 
Three out of two people have difficulties with fractions.


Re: [OMPI devel] [OMPI users] Possible Memcpy bug in MPI_Comm_split

2007-08-17 Thread Lisandro Dalcin
On 8/16/07, George Bosilca  wrote:
> Well, finally someone discovered it :) I've known about this problem for
> quite a while now; it popped up during our own valgrind testing of the
> collective module in Open MPI. However, it has never created any problems
> in the applications, at least not as far as I know. That's why I'm reluctant
> to replace the memcpy with a memmove (where the arguments are allowed to
> overlap), as there is a performance penalty.

George, I believe I also reported this some time ago, and your
comments were the same :-).

No time to dive into the internals, but for me the question is: what is
going on in Comm::Split() that makes it copy overlapping memory? Is that
expected, or is it perhaps a bug?
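
For reference, a minimal standalone illustration of the distinction being
discussed -- memcpy() with overlapping ranges is undefined behaviour,
memmove() is not (this assumes nothing about the OMPI internals):

/* Illustration only: overlapping copies, the kind of thing valgrind flags. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    char buf[32] = "abcdefgh";

    /* memcpy(buf, buf + 2, 8);    undefined behaviour: the ranges overlap */
    memmove(buf, buf + 2, 8);   /* well-defined even though the ranges overlap */

    printf("%s\n", buf);        /* prints "cdefgh" */
    return 0;
}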

Regards,

>
>george.
>
> On Aug 16, 2007, at 9:31 AM, Allen Barnett wrote:
>
> > Hi:
> > I was running my OpenMPI 1.2.3 application under Valgrind and I
> > observed
> > this error message:
> >
> > ==14322== Source and destination overlap in memcpy(0x41F5BD0,
> > 0x41F5BD8,
> > 16)
> > ==14322==at 0x49070AD: memcpy (mc_replace_strmem.c:116)
> > ==14322==by 0x4A45CF4: ompi_ddt_copy_content_same_ddt
> > (in /home/scratch/DMP/RHEL4-GCC4/lib/libmpi.so.0.0.0)
> > ==14322==by 0x7A6C386: ompi_coll_tuned_allgather_intra_bruck
> > (in /home/scratch/DMP/RHEL4-GCC4/lib/openmpi/mca_coll_tuned.so)
> > ==14322==by 0x4A29FFE: ompi_comm_split
> > (in /home/scratch/DMP/RHEL4-GCC4/lib/libmpi.so.0.0.0)
> > ==14322==by 0x4A4E322: MPI_Comm_split
> > (in /home/scratch/DMP/RHEL4-GCC4/lib/libmpi.so.0.0.0)
> > ==14322==by 0x400A26: main
> > (in /home/scratch/DMP/severian_tests/ompi/a.out)
> >
> > Attached is a reduced code example. I run it like:
> >
> > mpirun -np 3 valgrind ./a.out
> >
> > I only see this error if there are an odd number of processes! I don't
> > know if this is really a problem or not, though. My OMPI application
> > seems to work OK. However, the linux man page for memcpy says
> > overlapping range copying is undefined.
> >
> > Other details: x86_64 (one box, two dual-core opterons), RHEL 4.5,
> > OpenMPI-1.2.3 compiled with the RHEL-supplied GCC 4 (gcc4 (GCC) 4.1.1
> > 20070105 (Red Hat 4.1.1-53)), valgrind 3.2.3.
> >
> > Thanks,
> > Allen
> >
> >
> > --
> > Allen Barnett
> > Transpire, Inc.
> > e-mail: al...@transpireinc.com
> > Ph: 518-887-2930
> >
>


-- 
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594