Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Fri, Aug 17, 2007 at 02:11:02AM +0200, Uwe Hermann wrote:
> > | The 1.2.3 release also works fine:
> I think Adrian used a tarball, not the Debian package?
> I'll try a local, manual install too, maybe the bug is Debian-related only?

I've tried both: the tarball works fine, the Debian package segfaults. I suspect it's the threading support, so someone (Uwe?) could try to remove it from debian/rules.

Ok, I'll check this for amd64, but it will take some time to compile in the qemu ;)

--
mail: a...@thur.de  http://adi.thur.de  PGP/GPG: key via keyserver
Die Stosstange ist aller Laster Anfang.
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Aug 16, 2007, at 8:11 PM, Uwe Hermann wrote:

> With the libc0.1 fix (and another small patch for Debian which I'll send soon) both the kfreebsd-i386 and kfreebsd-amd64 packages build fine. However, on my systems, both i386 and amd64 still segfault. I'm using the openmpi Debian packages, version 1.2.3-3. I'll try the stock tarballs soon, and/or wait for 1.2.4 to see if the bug is already fixed there...

FWIW, if you've got the cycles, try a 1.2 branch nightly tarball (i.e., they're what will eventually become 1.2.4):

http://www.open-mpi.org/nightly/v1.2/

That way, if there's still a problem, we potentially still have [a little] time to fix it before 1.2.4.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [OMPI svn] svn:open-mpi r15881
On Aug 16, 2007, at 1:13 PM, Tim Prins wrote:

>> So you're both right. :-) But Tim's falling back on an older (and unfortunately bad) precedent. It would be nice to not extend that bad precedent, IMHO...
>
> I really don't care where the constants are defined, but they do need to be unique. I think it is easiest if all the constants are stored in one file, but if someone else wants to chop them up, that's fine with me. We would just have to be more careful when adding new constants to check both files.

Ya, IIRC, this is a definite problem that we had: it's at the core of the "component" abstraction (a component should be wholly self-contained and not have any component-specific definitions outside of itself), but these tags are a central resource that need to be allocated in a distributed fashion.

That's why I think it was decided to simply leave them as they were, and/or use the (DYNAMIC-x) form. I don't have any better suggestion; I'm just providing rationale for the reason it was the way it was...

>> True. We will need a robust tag reservation system, though, to guarantee that every process gets the same tag values (e.g., if udapl is available on some nodes but not others, will that cause openib to have different values on different nodes? And so on).
>
> Not really. All that is needed is a list of constants (similar to the one in rml_types.h).

I was assuming a dynamic/run-time tag assignment (which is obviously problematic for the reason I cited, and others). But static is also problematic for the breaking-abstraction reasons. Stalemate.

> If a rsl component doesn't like the particular constant tag values, they can do whatever they want in their implementation, as long as a message sent on a tag is received on the same tag.

Sure.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Fri, Aug 17, 2007 at 09:25:05AM +0200, Adrian Knoth wrote:
> I've tried both: the tarball works fine, the Debian package segfaults. I suspect it's the threading support, so someone (Uwe?) could try to remove it from debian/rules.

Ok, --enable-progress-threads and --enable-mpi-threads cause the segfaults. If you compile without them, everything works.

I'll now check whether it's mpi-threads or the progress-threads, and also check the upcoming v1.2.4.

How does Debian feel about disabling threads on kFreeBSD? Are there known issues with pthreads on kFreeBSD?

--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
private: http://adi.thur.de
Re: [OMPI devel] Public tmp branches
I didn't really put this in RFC format with a timeout, but no one objected, so I have created:

http://svn.open-mpi.org/svn/ompi/public

Developers should feel free to use this tree for public temporary branches. Specifically:

- use /tmp if your branch is intended to be private
- use /public if your branch is intended to be public

Enjoy.

On Aug 10, 2007, at 9:50 AM, Jeff Squyres wrote:

> Right now all branches under /tmp are private to the OMPI core group (e.g., to allow unpublished academic work). However, there are definitely cases where it would be useful to allow public branches when there's development work that is public but not yet ready for the trunk. Periodically, we go and assign individual permissions to /tmp branches (like I just did to /tmp/vt-integration), but it would be easier if we had a separate tree for public "tmp" branches.
>
> Would anyone have an objection if I added /public (or any better name that someone can think of) for tmp-style branches, but that are open for read-only access to the public?
>
> --
> Jeff Squyres
> Cisco Systems

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [OMPI svn] svn:open-mpi r15881
On Friday 17 August 2007 13:58, Jeff Squyres wrote:

> On Aug 16, 2007, at 1:13 PM, Tim Prins wrote:
>
> >> So you're both right. :-) But Tim's falling back on an older (and unfortunately bad) precedent. It would be nice to not extend that bad precedent, IMHO...
> >
> > I really don't care where the constants are defined, but they do need to be unique. I think it is easiest if all the constants are stored in one file, but if someone else wants to chop them up, that's fine with me. We would just have to be more careful when adding new constants to check both files.
>
> Ya, IIRC, this is a definite problem that we had: it's at the core of the "component" abstraction (a component should be wholly self-contained and not have any component-specific definitions outside of itself), but these tags are a central resource that need to be allocated in a distributed fashion.
>
> That's why I think it was decided to simply leave them as they were, and/or use the (DYNAMIC-x) form. I don't have any better suggestion; I'm just providing rationale for the reason it was the way it was...
>
> >> True. We will need a robust tag reservation system, though, to guarantee that every process gets the same tag values (e.g., if udapl is available on some nodes but not others, will that cause openib to have different values on different nodes? And so on).
> >
> > Not really. All that is needed is a list of constants (similar to the one in rml_types.h).
>
> I was assuming a dynamic/run-time tag assignment (which is obviously problematic for the reason I cited, and others). But static is also problematic for the breaking-abstraction reasons. Stalemate.

What about this: every component chooses its own tag independently of the others. Before a component can use the tag, it must register it, together with its full name, in a small process-internal database.

If 2 components try to register the same tag, we emit a warning, terminate the processes, ... . If 2 components (CompA and CompB) want to register the same tag, and we assume that process A loads _only_ CompA while process B loads _only_ CompB, then both components will be loaded without any error. I assume it's rather unusual for CompA to send a message to process B, as there is no counterpart component. But there is still some probability.

For more safety (and of course less performance) we could:
- add a parameter that causes this tag database to sync across all processes.
- add a parameter that turns on a check, for every send/receive, of whether the specified tag has been registered or not.

Just my 0.02 $

Sven

> > If a rsl component doesn't like the particular constant tag values, they can do whatever they want in their implementation, as long as a message sent on a tag is received on the same tag.
>
> Sure.
>
> --
> Jeff Squyres
> Cisco Systems
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Aug 17, 2007, at 8:10 AM, Adrian Knoth wrote:

> Ok, --enable-progress-threads and --enable-mpi-threads cause the segfaults. If you compile without, everything works.
>
> I'll now try if it's mpi-threads or the progress-threads, and also check the upcoming v1.2.4.

The --enable-progress-threads and --enable-mpi-threads configure options result in broken-ness on the v1.2 branch; you should not use them.

There is ongoing development work in the trunk to fix the code associated with these options. The current goal is to have them working for the v1.3 release.

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [OMPI svn] svn:open-mpi r15881
On Aug 17, 2007, at 8:22 AM, Sven Stork wrote:

> What about this: every component chooses its own tag independently of the others. Before a component can use the tag, it must register it, together with its full name, in a small process-internal database. If 2 components try to register the same tag, we emit a warning, terminate the processes, ...

My knee-jerk reaction to this is: no! How could we ship code that might abort?! But upon further reflection, I'm guessing that you assume that we would catch such tag conflicts during QA testing and therefore only ship components that use distinct tags. That might be tolerable.

However, it does raise another place where we would have to have central coordination between all MPI processes -- something we've actively been trying to shed for scalability reasons...

> If 2 components (CompA and CompB) want to register the same tag, and we assume that process A loads _only_ CompA while process B loads _only_ CompB, then both components will be loaded without any error. I assume it's rather unusual for CompA to send a message to process B, as there is no counterpart component. But there is still some probability.

*Assumedly* we would never ship components that use the same tag (per above), but it doesn't address the possibility of 3rd party components, etc.

> For more safety (and of course less performance) we could:
> - add a parameter that causes this tag database to sync across all processes.
> - add a parameter that turns on a check, for every send/receive, of whether the specified tag has been registered or not.

Another thought (that was long-ago discarded) would be to use string tags. If you follow the prefix rule, it's easy to guarantee that there won't be conflicts. But:

a) this is the moral equivalent of the modex -- which currently utilizes the one-time blast-o-gram from the HNP during MPI_INIT to do all the transport

b) to use this for regular RML/OOB/RSL/whatever communication in the MPI layer would be rather expensive (which is why it was discarded)

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [devel-core] [RFC] Runtime Services Layer
I am definitely interested to see what the RSL turns out to be; I think it has many potential benefits. There are also some obvious issues to be worked out (e.g., mpirun and friends).

As for whether this should go into v1.3, I don't know if it's possible to say yet -- it will depend on when the RSL becomes [at least close to] ready, what the exact schedule for v1.3 is (which we've been skittish to define, since we're going for a feature-driven release), etc.

On Aug 16, 2007, at 9:47 PM, Tim Prins wrote:

> WHAT: Solicitation of feedback on the possibility of adding a runtime services layer to Open MPI to abstract out the runtime.
>
> WHY: To solidify the interface between OMPI and the runtime environment, and to allow the use of different runtime systems, including different versions of ORTE.
>
> WHERE: Addition of a new framework to OMPI, and changes to many of the files in OMPI to funnel all runtime requests through this framework. Few changes should be required in OPAL and ORTE.
>
> WHEN: Development has started in tmp/rsl, but is still in its infancy. We hope to have a working system in the next month.
>
> TIMEOUT: 8/29/07
>
> Short version: I am working on creating an interface between OMPI and the runtime system. This would add an RSL framework to OMPI, through which all runtime services would be accessed. Attached is a graphic depicting this.
>
> This change would be invasive to the OMPI layer. Few (if any) changes will be required of the ORTE and OPAL layers. At this point I am soliciting feedback as to whether people are supportive or not of this change, both in general and for v1.3.
>
> Long version: The current model used in Open MPI assumes that one runtime system is the best for all environments. However, in many environments it may be beneficial to have specialized runtime systems. With our current system this is not easy to do. With this in mind, the idea of creating a 'runtime services layer' was hatched. This would take the form of a framework within OMPI, through which all runtime functionality would be accessed. This would allow new or different runtime systems to be used with Open MPI. Additionally, with such a system it would be possible to have multiple versions of open rte coexisting, which may facilitate development and testing. Finally, this would solidify the interface between OMPI and the runtime system, as well as provide documentation of the behavior and side effects of each interface function.
>
> However, such a change would be fairly invasive to the OMPI layer, and needs buy-in from everyone for it to be possible.
>
> Here is a summary of the changes required for the RSL (at least as it is currently envisioned):
>
> 1. Add a framework to ompi for the rsl, and a component to support orte.
> 2. Change ompi so that it uses the new interface. This involves:
>    a. Moving runtime-specific code into the orte rsl component.
>    b. Changing the process names in ompi to an opaque object.
>    c. Changing all references to orte in ompi to be to the rsl.
> 3. Change the configuration code so that open-rte is only linked where needed.
>
> Of course, all this would happen on a tmp branch. The design of the rsl is not solidified. I have been playing in a tmp branch (located at https://svn.open-mpi.org/svn/ompi/tmp/rsl) which everyone is welcome to look at and comment on, but be advised that things here are subject to change (I don't think it even compiles right now).
>
> There are some fairly large open questions on this, including:
>
> 1. How to handle mpirun (that is, when a user types 'mpirun', do they always get ORTE, or do they sometimes get a system-specific runtime). Most likely mpirun will always use ORTE, and alternative launching programs would be used for other runtimes.
> 2. Whether there will be any performance implications. My guess is not, but I am not quite sure of this yet.
>
> Again, I am interested in people's comments on whether they think adding such abstraction is good or not, and whether it is reasonable to do such a thing for v1.3.
>
> Thanks,
>
> Tim Prins

___
devel-core mailing list
devel-c...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel-core

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Fri, Aug 17, 2007 at 08:26:50AM -0400, Jeff Squyres wrote:

> > Ok, --enable-progress-threads and --enable-mpi-threads cause the segfaults. If you compile without, everything works.
> >
> > I'll now try if it's mpi-threads or the progress-threads, and also check the upcoming v1.2.4.
>
> The --enable-progress-threads and --enable-mpi-threads configure options result in broken-ness on the v1.2 branch; you should not use them.

That's why I wondered why Debian has enabled them.

Dirk: Do you mind removing them from debian/rules, thus fixing the issue?

--
Cluster and Metacomputing Working Group
Friedrich-Schiller-Universität Jena, Germany
private: http://adi.thur.de
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Friday, 17 Aug 2007, at 14:49 +0200, Adrian Knoth wrote:

> On Fri, Aug 17, 2007 at 08:26:50AM -0400, Jeff Squyres wrote:
>
> > > Ok, --enable-progress-threads and --enable-mpi-threads cause the segfaults. If you compile without, everything works.
> > >
> > > I'll now try if it's mpi-threads or the progress-threads, and also check the upcoming v1.2.4.
> >
> > The --enable-progress-threads and --enable-mpi-threads configure options result in broken-ness on the v1.2 branch; you should not use them.
>
> That's why I wondered why Debian has enabled them.

We enabled them because it was requested (http://bugs.debian.org/419867).

> Dirk: Do you mind removing them from debian/rules, thus fixing the issue?

I personally think it's best to disable them for now and document that in README.Debian. We can enable them again as soon as they work correctly.

Jeff, do you know for which architectures it's (not) working? I haven't experienced problems so far, or at least didn't notice them.

Best regards
Manuel
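In practice the fix being discussed amounts to dropping two configure arguments from the Debian packaging. This fragment is a hypothetical sketch, not the actual debian/rules; every flag other than the two thread options is invented for illustration:

```make
# Hypothetical debian/rules fragment.  The two thread options are the
# ones reported broken on the OMPI v1.2 branch; they are removed from
# the configure arguments and the removal documented in README.Debian.
CONFIGURE_ARGS = --prefix=/usr \
	--sysconfdir=/etc/openmpi
# removed (broken on v1.2, revisit for v1.3):
#	--enable-progress-threads --enable-mpi-threads
```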
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Aug 17, 2007, at 8:57 AM, Manuel Prinz wrote:

> Jeff, do you know for which architectures it's (not) working? I haven't experienced problems so far, or at least didn't notice them.

I don't think those options are safe on any architecture. We're working on the trunk to make them actually function properly; we decided to give up on the 1.2 branch and focus our efforts on the v1.3 series (where "we" doesn't actively include me -- others are doing the threaded work).

--
Jeff Squyres
Cisco Systems
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Friday, 17 Aug 2007, at 09:02 -0400, Jeff Squyres wrote:

> I don't think those options are safe on any architecture.

I'll disable them in debian/rules then and document it. Dirk, are you fine with that?

Best regards
Manuel
Re: [OMPI devel] Public tmp branches
ugh, sorry, I've been busy this week and didn't see a timeout, so a response got delayed.

I really don't like this format. public doesn't have any meaning to it (tmp suggests, well, it's temporary). I'd rather have /tmp/ and /tmp/private or something like that. Or /tmp/ and /tmp/public/. Either way :/.

Brian

On Aug 17, 2007, at 6:21 AM, Jeff Squyres wrote:

> I didn't really put this in RFC format with a timeout, but no one objected, so I have created:
>
> http://svn.open-mpi.org/svn/ompi/public
>
> Developers should feel free to use this tree for public temporary branches. Specifically:
>
> - use /tmp if your branch is intended to be private
> - use /public if your branch is intended to be public
>
> Enjoy.
>
> On Aug 10, 2007, at 9:50 AM, Jeff Squyres wrote:
>
>> Right now all branches under /tmp are private to the OMPI core group (e.g., to allow unpublished academic work). However, there are definitely cases where it would be useful to allow public branches when there's development work that is public but not yet ready for the trunk. Periodically, we go and assign individual permissions to /tmp branches (like I just did to /tmp/vt-integration), but it would be easier if we had a separate tree for public "tmp" branches.
>>
>> Would anyone have an objection if I added /public (or any better name that someone can think of) for tmp-style branches, but that are open for read-only access to the public?
>>
>> --
>> Jeff Squyres
>> Cisco Systems
>
> --
> Jeff Squyres
> Cisco Systems
Re: [OMPI devel] [OMPI svn] svn:open-mpi r15903
This patch breaks the trunk. It looks like LT_PACKAGE_VERSION wasn't defined before the 2.x versions. The autogen fails with the following error:

*** Running GNU tools
[Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh
[Running] aclocal
configure.ac:998: error: m4_defn: undefined macro: LT_PACKAGE_VERSION
configure.ac:998: the top level
autom4te: /usr/bin/m4 failed with exit status: 1
aclocal: autom4te failed with exit status: 1

  george.

On Aug 17, 2007, at 12:08 AM, brbar...@osl.iu.edu wrote:

Author: brbarret
Date: 2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)
New Revision: 15903
URL: https://svn.open-mpi.org/trac/ompi/changeset/15903

Log:
Support versions of the Libtool 2.1a snapshots after the lt_dladvise code was brought in. This supercedes the GLOBL patch that we had been using with Libtool 2.1a versions prior to the lt_dladvise code. Autogen tries to figure out which version you're on, so either will now work with the trunk.

Text files modified:
   trunk/configure.ac                                  | 18
   trunk/opal/mca/base/mca_base_component_find.c       |  8
   trunk/opal/mca/base/mca_base_component_repository.c | 24
   3 files changed, 48 insertions(+), 2 deletions(-)

Modified: trunk/configure.ac
==============================================================================
--- trunk/configure.ac (original)
+++ trunk/configure.ac 2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)
@@ -995,10 +995,15 @@
 ompi_show_subtitle "Libtool configuration"

+m4_if(m4_version_compare(m4_defn([LT_PACKAGE_VERSION]), 2.0), -1, [
 AC_LIBLTDL_CONVENIENCE(opal/libltdl)
 AC_LIBTOOL_DLOPEN
 AC_PROG_LIBTOOL
-
+], [
+LT_CONFIG_LTDL_DIR([opal/libltdl], [subproject])
+LTDL_CONVENIENCE
+LT_INIT([dlopen win32-dll])
+])
 ompi_show_subtitle "GNU libltdl setup"

 # AC_CONFIG_SUBDIRS appears to be broken for non-gcc compilers (i.e.,
@@ -1038,6 +1043,13 @@
 if test "$HAPPY" = "1"; then
     LIBLTDL_SUBDIR=libltdl
+    CPPFLAGS_save="$CPPFLAGS"
+    CPPFLAGS="-I."
+    AC_EGREP_HEADER([lt_dladvise_init], [opal/libltdl/ltdl.h],
+                    [OPAL_HAVE_LTDL_ADVISE=1],
+                    [OPAL_HAVE_LTDL_ADVISE=0])
+    CPPFLAGS="$CPPFLAGS_save"
+
     # Arrgh. This is gross. But I can't think of any other way to do
     # it. :-(
@@ -1057,7 +1069,7 @@
     AC_MSG_WARN([libltdl support disabled (by --disable-dlopen)])
     LIBLTDL_SUBDIR=
-    LIBLTDL=
+    OPAL_HAVE_LTDL_ADVISE=0
     # append instead of prepend, since LIBS are going to be system
     # type things needed by everyone. Normally, libltdl will push
@@ -1073,6 +1085,8 @@
 AC_DEFINE_UNQUOTED(OMPI_WANT_LIBLTDL, $OMPI_ENABLE_DLOPEN_SUPPORT,
     [Whether to include support for libltdl or not])
+AC_DEFINE_UNQUOTED(OPAL_HAVE_LTDL_ADVISE, $OPAL_HAVE_LTDL_ADVISE,
+    [Whether libltdl appears to have the lt_dladvise interface])

 ##
 # visibility

Modified: trunk/opal/mca/base/mca_base_component_find.c
==============================================================================
--- trunk/opal/mca/base/mca_base_component_find.c (original)
+++ trunk/opal/mca/base/mca_base_component_find.c 2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)
@@ -75,6 +75,10 @@
     char name[MCA_BASE_MAX_COMPONENT_NAME_LEN];
 };
 typedef struct ltfn_data_holder_t ltfn_data_holder_t;
+
+#if OPAL_HAVE_LTDL_ADVISE
+extern lt_dladvise opal_mca_dladvise;
+#endif
 #endif /* OMPI_WANT_LIBLTDL */

@@ -387,7 +391,11 @@
     /* Now try to load the component */
+#if OPAL_HAVE_LTDL_ADVISE
+    component_handle = lt_dlopenadvise(target_file->filename, opal_mca_dladvise);
+#else
     component_handle = lt_dlopenext(target_file->filename);
+#endif
     if (NULL == component_handle) {
         err = strdup(lt_dlerror());
         if (0 != show_errors) {

Modified: trunk/opal/mca/base/mca_base_component_repository.c
==============================================================================
--- trunk/opal/mca/base/mca_base_component_repository.c (original)
+++ trunk/opal/mca/base/mca_base_component_repository.c 2007-08-17 00:08:23 EDT (Fri, 17 Aug 2007)
@@ -85,6 +85,10 @@
 static repository_item_t *find_component(const char *type, const char *name);
 static int link_items(repository_item_t *src, repository_item_t *depend);
+#if OPAL_HAVE_LTDL_ADVISE
+lt_dladvise opal_mca_dladvise;
+#endif
+
 #endif /* OMPI_WANT_LIBLTDL */

@@ -103,6 +107,20 @@
         return OPAL_ERR_OUT_OF_RESOURCE;
     }
+#if OPAL_HAVE_LTDL_ADVISE
+    if (lt_dladvise_init(&opal_mca_dladvise)) {
+        return OPAL_ERR_OUT_OF_RESOURCE;
+    }
+
+    if (lt_dladvise_ext(&opal_mca_dladvise)) {
+        return OPAL_ERROR;
+    }
+
+    if (lt_dladvise_global(&opal_mca_dladvise)) {
+        return OPAL_ERROR;
+    }
+#endif
+
     OBJ_CONSTRUCT(&repository, opal_list_t);
 #endif
     initialized = true;
@@ -255,6 +273,12 @@
Re: [OMPI devel] [OMPI svn] svn:open-mpi r15903
Oh, crud. I forgot to fix that issue. Will fix asap.

Brian

On Aug 17, 2007, at 10:12 AM, George Bosilca wrote:

> This patch breaks the trunk. It looks like LT_PACKAGE_VERSION wasn't defined before the 2.x versions. The autogen fails with the following error:
>
> *** Running GNU tools
> [Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh
> [Running] aclocal
> configure.ac:998: error: m4_defn: undefined macro: LT_PACKAGE_VERSION
> configure.ac:998: the top level
> autom4te: /usr/bin/m4 failed with exit status: 1
> aclocal: autom4te failed with exit status: 1
>
> george.
>
> On Aug 17, 2007, at 12:08 AM, brbar...@osl.iu.edu wrote:
>
> > [quoted r15903 changeset snipped]
Re: [OMPI devel] [OMPI svn] svn:open-mpi r15903
Fixed. Sorry about the configure change mid-day, but it seemed like the right thing to do.

Brian

On Aug 17, 2007, at 10:37 AM, Brian Barrett wrote:

> Oh, crud. I forgot to fix that issue. Will fix asap.
>
> Brian
>
> On Aug 17, 2007, at 10:12 AM, George Bosilca wrote:
>
> > This patch breaks the trunk. It looks like LT_PACKAGE_VERSION wasn't defined before the 2.x versions. The autogen fails with the following error:
> >
> > *** Running GNU tools
> > [Running] autom4te --language=m4sh ompi_get_version.m4sh -o ompi_get_version.sh
> > [Running] aclocal
> > configure.ac:998: error: m4_defn: undefined macro: LT_PACKAGE_VERSION
> > configure.ac:998: the top level
> > autom4te: /usr/bin/m4 failed with exit status: 1
> > aclocal: autom4te failed with exit status: 1
> >
> > george.
> >
> > On Aug 17, 2007, at 12:08 AM, brbar...@osl.iu.edu wrote:
> >
> > > [quoted r15903 changeset snipped]
Re: [OMPI devel] Public tmp branches
I thought about both of those (/tmp/private and /tmp/public), but couldn't think of a way to make them work.

1. If we do /tmp/private, we have to svn mv all the existing trees there, which will annoy the developers (but is not a deal-breaker), and make /tmp publicly readable. But that makes the history of all the private branches public.

2. If we do /tmp/public, I'm not quite sure how to set up the perms in SVN to do that: if we set /tmp to "no read access" for * and /tmp/public to "read access" for *, will a non-authenticated user be able to reach /tmp/private?

-jms

-----Original Message-----
From: Brian Barrett [mailto:bbarr...@lanl.gov]
Sent: Friday, August 17, 2007 11:51 AM Eastern Standard Time
To: Open MPI Developers
Subject: Re: [OMPI devel] Public tmp branches

Ugh, sorry, I've been busy this week and didn't see a timeout, so my response got delayed. I really don't like this format. "public" doesn't have any meaning to it (tmp suggests, well, that it's temporary). I'd rather have /tmp/ and /tmp/private or something like that. Or /tmp/ and /tmp/public/. Either way :/.

Brian

On Aug 17, 2007, at 6:21 AM, Jeff Squyres wrote:

> I didn't really put this in RFC format with a timeout, but no one
> objected, so I have created:
>
> http://svn.open-mpi.org/svn/ompi/public
>
> Developers should feel free to use this tree for public temporary
> branches. Specifically:
>
> - use /tmp if your branch is intended to be private
> - use /public if your branch is intended to be public
>
> Enjoy.
>
> On Aug 10, 2007, at 9:50 AM, Jeff Squyres wrote:
>
>> Right now all branches under /tmp are private to the OMPI core group
>> (e.g., to allow unpublished academic work). However, there are
>> definitely cases where it would be useful to allow public branches
>> when there's development work that is public but not yet ready for
>> the trunk.
>> Periodically, we go and assign individual permissions to /tmp
>> branches (like I just did to /tmp/vt-integration), but it would
>> be easier if we had a separate tree for public "tmp" branches.
>>
>> Would anyone have an objection if I added /public (or any better name
>> that someone can think of) for tmp-style branches, but that are open
>> for read-only access to the public?
>>
>> --
>> Jeff Squyres
>> Cisco Systems

_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
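On the perms question in this thread: Subversion's path-based authorization matches the most specific path, so a subtree can re-grant read access that its parent denies. A sketch of an authz-file fragment follows; the repository name, group name, and paths are illustrative, not the actual open-mpi.org configuration:

```ini
# Deny the world everything under /tmp, but let the core group in.
[ompi:/tmp]
* =
@core = rw

# Re-open /tmp/public for anonymous read; /tmp/private stays unreadable
# because only the [ompi:/tmp] rule matches it.
[ompi:/tmp/public]
* = r
@core = rw
```

With mod_authz_svn (or svnserve's authz file), an unauthenticated user would get read access to /tmp/public but could not reach /tmp/private, since the longest matching section wins for each path.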
Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]
On Fri, Aug 17, 2007 at 03:08:12PM +0200, Manuel Prinz wrote:
> On Friday, 17.08.2007, at 09:02 -0400, Jeff Squyres wrote:
> > I don't think those options are safe on any architecture.
>
> I'll disable them in debian/rules then and document it.
> Dirk, are you fine with that?

Sure thing. We simply didn't know about the brokenness re threads in 1.2.

Dirk, on vacation

> Best regards
> Manuel
>
> _______________________________________________
> Pkg-openmpi-maintainers mailing list
> pkg-openmpi-maintain...@lists.alioth.debian.org
> http://lists.alioth.debian.org/mailman/listinfo/pkg-openmpi-maintainers

--
Three out of two people have difficulties with fractions.
Re: [OMPI devel] [OMPI users] Possible Memcpy bug in MPI_Comm_split
On 8/16/07, George Bosilca wrote:
> Well, finally someone discovered it :) I have known about this problem
> for quite a while now; it popped up during our own valgrind testing of
> the collective module in Open MPI. However, it has never created any
> problems in the applications, at least not as far as I know. That's why
> I'm reticent to replace the memcpy with a memmove (whose arguments are
> allowed to overlap), as there is a performance penalty.

George, I believe I also reported this some time ago, and your comments were the same :-). No time to dive into the internals, but for me the question is: what's going on in Comm::Split() that makes it copy overlapping memory? Is that expected, or is it perhaps a bug?

Regards,

>
> george.
>
> On Aug 16, 2007, at 9:31 AM, Allen Barnett wrote:
>
> > Hi:
> > I was running my OpenMPI 1.2.3 application under Valgrind and I
> > observed this error message:
> >
> > ==14322== Source and destination overlap in memcpy(0x41F5BD0, 0x41F5BD8, 16)
> > ==14322==    at 0x49070AD: memcpy (mc_replace_strmem.c:116)
> > ==14322==    by 0x4A45CF4: ompi_ddt_copy_content_same_ddt
> >                (in /home/scratch/DMP/RHEL4-GCC4/lib/libmpi.so.0.0.0)
> > ==14322==    by 0x7A6C386: ompi_coll_tuned_allgather_intra_bruck
> >                (in /home/scratch/DMP/RHEL4-GCC4/lib/openmpi/mca_coll_tuned.so)
> > ==14322==    by 0x4A29FFE: ompi_comm_split
> >                (in /home/scratch/DMP/RHEL4-GCC4/lib/libmpi.so.0.0.0)
> > ==14322==    by 0x4A4E322: MPI_Comm_split
> >                (in /home/scratch/DMP/RHEL4-GCC4/lib/libmpi.so.0.0.0)
> > ==14322==    by 0x400A26: main
> >                (in /home/scratch/DMP/severian_tests/ompi/a.out)
> >
> > Attached is a reduced code example. I run it like:
> >
> >     mpirun -np 3 valgrind ./a.out
> >
> > I only see this error if there is an odd number of processes! I don't
> > know if this is really a problem or not, though. My OMPI application
> > seems to work OK. However, the Linux man page for memcpy says that
> > copying between overlapping ranges is undefined.
> > Other details: x86_64 (one box, two dual-core Opterons), RHEL 4.5,
> > OpenMPI 1.2.3 compiled with the RHEL-supplied GCC 4 (gcc4 (GCC) 4.1.1
> > 20070105 (Red Hat 4.1.1-53)), valgrind 3.2.3.
> >
> > Thanks,
> > Allen
> >
> > --
> > Allen Barnett
> > Transpire, Inc.
> > e-mail: al...@transpireinc.com
> > Ph: 518-887-2930
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594