Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Gilles Gouaillardet
Larry, currently, OpenMPI generates mpif-sizeof.h with up to 15 dimensions with intel compilers, but up to 7 dimensions with "recent" gcc (for example gcc 5.2 and higher), so i guess the logic behind this is "give the compiler all it can handle", so if intel somehow "extended" the standard to

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Gilles Gouaillardet
Dave, you should not expect anything when mixing Fortran compilers (and to be on the safe side, you might not expect much when mixing C/C++ compilers too, for example, if you built ompi with intel and use gcc for your app, gcc might complain about unresolved symbols from the intel runtime) if

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-02 Thread Gilles Gouaillardet
ng about debug builds being used for performance testing > +1 I’m increasingly feeling that we shouldn’t output that message every time > someone executes a debug-based operation, even if we add a param to turn > off the warning. > +1 > > On Mar 2, 2016, at 5:48 AM, Gilles Gouaill

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-02 Thread Gilles Gouaillardet
pers who > don’t would be nice. > > > On Mar 2, 2016, at 4:51 AM, Jeff Squyres (jsquyres) > wrote: > > > > On Mar 2, 2016, at 6:30 AM, Mark Santcroos > wrote: > >> > >>> On 02 Mar 2016, at 5:06 , Gilles Gouaillardet > wrote: > >>

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
performance benchmark, then i will not get the warning i need (and yes, i will be the only one to blame ... but isn't it something we want to avoid here ?) Cheers, Gilles On 3/2/2016 1:43 PM, George Bosilca wrote: On Mar 1, 2016, at 22:27 , Gilles Gouaillardet wrote: be "me-frien

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
ng opinion, and i am fine with setting a parameter (i will likely soon forget i set that) in a config file. Cheers, Gilles On 3/2/2016 1:21 PM, Jeff Squyres (jsquyres) wrote: On Mar 1, 2016, at 10:17 PM, Gilles Gouaillardet wrote: In this case, should we only display the warning if debug

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
(jsquyres) wrote: On Mar 1, 2016, at 10:06 PM, Gilles Gouaillardet wrote: what about *not* issuing this warning if OpenMPI is built from git ? that would be friendlier for OMPI developers, and should basically *not* affect endusers, since they would rather build OMPI from a tarball. We

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
Jeff, what about *not* issuing this warning if OpenMPI is built from git ? that would be friendlier for OMPI developers, and should basically *not* affect endusers, since they would rather build OMPI from a tarball. Cheers, Gilles On 3/2/2016 1:00 PM, Jeff Squyres (jsquyres) wrote: WHAT: Ha

Re: [OMPI devel] MTT setup updated to gcc-6.0 (pre)

2016-03-01 Thread Gilles Gouaillardet
fwiw in a previous thread, Jeff Hammond explained this is why mpich is relying on C89 instead of C99, since C89 appears to be a subset of C++11. Cheers, Gilles On 3/2/2016 1:02 AM, Nathan Hjelm wrote: I will add to how crazy this is. The C standard has been very careful to not break existin

Re: [OMPI devel] Segmentation fault in opal_fifo (MTT)

2016-03-01 Thread Gilles Gouaillardet
Adrian, About bitness, it is correctly set when the MPI install succeeds. See https://mtt.open-mpi.org/index.php?do_redir or even your successful install on x86_64. I suspect it is queried once the installation is successful, and I'll try to have a look at it. Cheers, Gilles On Tuesday, March 1, 20

Re: [OMPI devel] Confused topic for developer's meeting

2016-02-26 Thread Gilles Gouaillardet
Ralph, The goal here is to allow vendors to distribute binary orte frameworks (on top of binary components they can already distribute) that can be used by a user-compiled "stock" openmpi library. Did I get it right so far ? I gave it some thought and found it could be simplified. My unders

Re: [OMPI devel] error while compiling openmpi

2016-02-26 Thread Gilles Gouaillardet
Monika, Can you send all the information listed here: https://www.open-mpi.org/community/help/ btw, are you using a cross-compiler ? can you try to compile this simple program : typedef struct xxx xxx; struct xxx { int i; xxx *p; }; void yyy(xxx *x) { x->i = 0; x->p =
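
For reference, a complete, compilable version of that test case might look like the sketch below; the body of yyy() past the first assignment is an assumption, since the preview cuts off at "x->p =".

    /* sketch of the truncated test case; the second assignment in yyy()
       is an assumption, the original preview ends at "x->p =" */
    typedef struct xxx xxx;

    struct xxx {
        int i;
        xxx *p;
    };

    void yyy(xxx *x)
    {
        x->i = 0;
        x->p = x;   /* assumed completion */
    }

    int main(void)
    {
        xxx a;
        yyy(&a);
        return a.i;
    }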

Re: [OMPI devel] [OMPI users] Adding a new BTL

2016-02-25 Thread Gilles Gouaillardet
t_sources = $(sources) >> else >> lib = libmca_btl_lf.la >> lib_sources = $(sources) >> component = >> component_sources = >> endif >> >> mcacomponentdir = $(opallibdir) >> mcacomponent_LTLIBRARIES = $(component) >> mca_btl_lf_la_SOURCES = $(co

Re: [OMPI devel] use-mpi mpiext?

2016-02-24 Thread Gilles Gouaillardet
Aurelien, I guess you should also have noinst_LTLIBRARIES += libmpiext_blabla_usempi.la in your Makefile.am. Is your extension available somewhere on github so we can have a look ? Cheers, Gilles On Wednesday, February 24, 2016, Aurélien Bouteiller wrote: > I am making an MPI extension in la

Re: [OMPI devel] ORTED process group

2016-02-23 Thread Gilles Gouaillardet
Ralph, my 0.02 US$: if i understand correctly, we put non-ORTE processes into a different process group because ORTE *might* have grandchildren and their progeny that ORTE does not / cannot know about. /* note we assume here these processes are all well raised and do not create yet another

Re: [OMPI devel] Trunk is broken

2016-02-17 Thread Gilles Gouaillardet
Folks, i made https://github.com/open-mpi/ompi/pull/1376 to fix this issue; note it also reverts the changes previously introduced in ompi/runtime/ompi_mpi_init.c Cheers, Gilles On 2/18/2016 8:37 AM, Gilles Gouaillardet wrote: Jeff, this commit only fixes MPI_Init() and not the openib btl

Re: [OMPI devel] Trunk is broken

2016-02-17 Thread Gilles Gouaillardet
problem. So whatever was done missed those precautions and introduced this symbol regardless of the configuration. On Feb 15, 2016, at 8:39 PM, Gilles Gouaillardet > wrote: Ralph, this is being discussed at https://github.com/open-mpi/ompi/pull/1351 btw, how do you get this warning ? i do not s

Re: [OMPI devel] Trunk is broken

2016-02-15 Thread Gilles Gouaillardet
Ralph, this is being discussed at https://github.com/open-mpi/ompi/pull/1351 btw, how do you get this warning ? i do not see it. fwiw, the abstraction violation was kind of already here, so i am surprised it only pops up now Cheers, Gilles On 2/16/2016 1:17 PM, Ralph Castain wrote: Looks l

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Gilles Gouaillardet
geneous system and sender and receiver will always > be using their native format. > i.e, exactly the same as MPI_Pack and MPI_Unpack. > > kindest regards > Mike > > On 12/02/2016, at 9:25 PM, Gilles Gouaillardet wrote: > > Michael, > > byte swapping only occurs if y

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Gilles Gouaillardet
ks for your prompt and most helpful responses. > > warmest regards > MIke > > On 12/02/2016, at 7:03 PM, Gilles Gouaillardet wrote: > > Michael, > > i'd like to correct what i wrote earlier > > in heterogeneous clusters, data is sent "as is" (e.g.

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-12 Thread Gilles Gouaillardet
onfigure'd with --enable-debug, you would have run into an assert error (e.g. crash). i will work on a fix, but it might take some time before it is ready Cheers, Gilles On 2/11/2016 6:16 PM, Gilles Gouaillardet wrote: Michael, MPI_Pack_external must convert data to big endian, so it can be dump
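
For context, a minimal sketch of the calls under discussion (buffer size and data are arbitrary): pack into the portable "external32" (big-endian) representation and unpack again.

    #include <mpi.h>

    /* pack two ints into the big-endian "external32" representation,
       then unpack them again; the 64-byte buffer is an arbitrary choice */
    void pack_unpack_roundtrip(void)
    {
        int in[2] = {1, 2}, out[2];
        char buf[64];
        MPI_Aint pos = 0;

        MPI_Pack_external("external32", in, 2, MPI_INT, buf, sizeof(buf), &pos);
        pos = 0;
        MPI_Unpack_external("external32", buf, sizeof(buf), &pos, out, 2, MPI_INT);
    }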

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Gilles Gouaillardet
I am now at home, this same problem also exists with > the Ubuntu 15.10 OpenMPI packages > which surprisingly are still at 1.6.5, same as 14.04. > > Again, downloading, building, and using the latest stable version of > OpenMPI solved the problem. > > kindest regards > Mike >

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-11 Thread Gilles Gouaillardet
cked ints are correct. >> >> So, this problem still exists in heterogeneous builds with OpenMPI >> version 1.10.2. >> >> kindest regards >> Mike >> >> On 11 February 2016 at 14:48, Gilles Gouaillardet < >> gilles.gouaillar...@gmail.com >> >

Re: [OMPI devel] Error using MPI_Pack_external / MPI_Unpack_external

2016-02-10 Thread Gilles Gouaillardet
Michael, do your two systems have the same endianness ? do you know how openmpi was configure'd on both systems ? (is --enable-heterogeneous enabled or disabled on both systems ?) fwiw, openmpi 1.6.5 is old now and no longer maintained. I strongly encourage you to use openmpi 1.10.2 Cheers, Gi

Re: [OMPI devel] ompi_procs_cutoff, jobid and vpid

2016-02-06 Thread Gilles Gouaillardet
ere is another macro for reassembling the > jobid from the two pieces. If you use those, we’ll avoid any issues with > future modifications to the fields. > > > On Feb 5, 2016, at 8:17 PM, Gilles Gouaillardet > wrote: > > Thanks Ralph, > > I will implement the second o

Re: [OMPI devel] ompi_procs_cutoff, jobid and vpid

2016-02-05 Thread Gilles Gouaillardet
tely > wouldn’t advise it. > > > On Feb 5, 2016, at 7:48 PM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com > > wrote: > > Thanks George, > > I will definitely try that ! > > back to the initial question, has someone any thoughts on which bit(s) we &g

Re: [OMPI devel] ompi_procs_cutoff, jobid and vpid

2016-02-05 Thread Gilles Gouaillardet
to cope > with the MSB during the shifting operations? > > George > On Feb 5, 2016 10:08 AM, "Jeff Squyres (jsquyres)" > wrote: > >> On Feb 5, 2016, at 9:26 AM, Gilles Gouaillardet < >> gilles.gouaillar...@gmail.com >> > wrote: >> > >>

Re: [OMPI devel] ompi_procs_cutoff, jobid and vpid

2016-02-05 Thread Gilles Gouaillardet
, February 6, 2016, Jeff Squyres (jsquyres) wrote: > On Feb 5, 2016, at 9:26 AM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com > wrote: > > > > static inline opal_process_name_t ompi_proc_sentinel_to_name (intptr_t > sentinel) > > { > > sentinel >

[OMPI devel] ompi_procs_cutoff, jobid and vpid

2016-02-05 Thread Gilles Gouaillardet
Folks, i was unable to start a simple MPI job using the TCP btl on a heterogeneous cluster and using --mca mpi_procs_cutoff 0. The root cause was that the most significant bit of the jobid was set on some nodes but not on others. This is what we have : from opal/dss/dss_types.h typedef uint32_t opa
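
As a purely schematic illustration of the symptom (these are not the actual OMPI types or macros): a 64-bit process name built from two 32-bit halves stops comparing equal if one side sets the most significant bit of the jobid and the other does not.

    #include <stdint.h>

    /* schematic only -- not the real OMPI definitions */
    typedef struct { uint32_t jobid; uint32_t vpid; } name_t;

    static uint64_t pack_name(name_t n)
    {
        return ((uint64_t) n.jobid << 32) | n.vpid;
    }

    static name_t unpack_name(uint64_t packed)
    {
        name_t n = { (uint32_t) (packed >> 32), (uint32_t) packed };
        return n;
    }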

Re: [OMPI devel] RFC: set MCA param mpi_add_procs_cutoff default to 32

2016-02-04 Thread Gilles Gouaillardet
+1 should we also enable sparse groups by default ? (or at least on master, and then v2.x later) Cheers, Gilles On Thursday, February 4, 2016, Joshua Ladd wrote: > +1 > > > On Wed, Feb 3, 2016 at 9:54 PM, Jeff Squyres (jsquyres) < > jsquy...@cisco.com > > wrote: > >> WHAT: Decrease default va

Re: [OMPI devel] Use OMPI on another network interface

2016-02-04 Thread Gilles Gouaillardet
Hi, it is difficult to answer such a generic request. MPI symbols (MPI_Bcast, ...) are defined as weak symbols, so the simplest option is to redefine them and implement them the way you like. You are always able to invoke PMPI_Bcast if you want to invoke the openmpi implementation. a more ompi-
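
A minimal sketch of the approach described above, relying only on the standard PMPI profiling interface: define your own MPI_Bcast (which overrides the weak symbol) and fall through to PMPI_Bcast for the actual broadcast.

    #include <stdio.h>
    #include <mpi.h>

    /* the user-provided MPI_Bcast overrides the weak symbol exported by
       the MPI library; PMPI_Bcast still reaches the real implementation */
    int MPI_Bcast(void *buf, int count, MPI_Datatype type, int root, MPI_Comm comm)
    {
        printf("intercepted MPI_Bcast (count=%d)\n", count);
        return PMPI_Bcast(buf, count, type, root, comm);
    }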

Re: [OMPI devel] Porting the underlying fabric interface

2016-02-04 Thread Gilles Gouaillardet
Durga, did you confuse PML and MTL ? basically, a BTL (Byte Transfer Layer) is used with "primitive" interconnects that can only send bytes. (e.g. if you need to transmit a tagged message, it is up to you to send/recv the tag and manually match the tag on the receiver side so you can put the

Re: [OMPI devel] Fwd: [OMPI users] shared memory under fortran, bug?

2016-02-02 Thread Gilles Gouaillardet
It should be sufficient to add the PID of the current process to the filename to ensure it is unique. -Nathan On Tue, Feb 02, 2016 at 09:33:29PM +0900, Gilles Gouaillardet wrote: Nathan, the sm osc component uses communicator CID to name the file that will be used to create sha

Re: [OMPI devel] Fwd: [OMPI users] shared memory under fortran, bug?

2016-02-02 Thread Gilles Gouaillardet
sufficient to add the PID of the current process to the filename to ensure it is unique. -Nathan On Tue, Feb 02, 2016 at 09:33:29PM +0900, Gilles Gouaillardet wrote: Nathan, the sm osc component uses communicator CID to name the file that will be used to create shared memory segments

[OMPI devel] Fwd: [OMPI users] shared memory under fortran, bug?

2016-02-02 Thread Gilles Gouaillardet
Nathan, the sm osc component uses the communicator CID to name the file that will be used to create shared memory segments. if I understand correctly, two different communicators coming from the same MPI_Comm_split might share the same CID, so the CID (alone) cannot be used to generate a unique per co
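
To illustrate Nathan's suggestion (the name pattern and helper below are hypothetical, not the actual osc/sm code): combining the CID with a PID -- in practice one agreed upon by all ranks, e.g. rank 0's -- makes the backing file name unique even when two communicators share a CID.

    #include <stdio.h>

    /* hypothetical helper: build a backing-file name from the communicator
       CID plus a pid agreed upon by all ranks (e.g. rank 0's pid) */
    static void segment_name(char *buf, size_t len, unsigned cid, int pid)
    {
        snprintf(buf, len, "/tmp/osc_sm.cid%u.pid%d", cid, pid);
    }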

Re: [OMPI devel] malloc(0) warnings in post/wait and start/complete calls with GROUP_EMPTY

2016-02-02 Thread Gilles Gouaillardet
Lisandro, attached is a patch (master does things differently, so this has to be a one-off patch anyway); could you please give it a try ? btw, how do you get these warnings automatically ? Cheers, Gilles On 2/2/2016 12:02 AM, Lisandro Dalcin wrote: You might argue that the attached te

[OMPI devel] Fwd: Re: [OMPI users] New libmpi.so dependency on libibverbs.so?

2016-02-01 Thread Gilles Gouaillardet
-Post: devel@lists.open-mpi.org Date: Tue, 2 Feb 2016 10:26:53 +0900 From: Gilles Gouaillardet To: Open MPI Users Simon, this is a usnic requirement (mca/common/verbs_usnic to be more specific). As a workaround (and assuming you do not need usnic stuff on any of your nodes) you can

Re: [OMPI devel] orted-children communication

2016-01-26 Thread Gilles Gouaillardet
iirc, there are pipes between orted and app for IOF (I/O forwarding) (stdin, stdout and stderr) On Tuesday, January 26, 2016, Gianmario Pozzi wrote: > Thank you, Ralph. > > What about ORTE_DAEMON_MESSAGE_LOCAL_PROCS case into orte_comm.c? I see it > calls orte_odls.deliver_message() and the com

Re: [OMPI devel] configure warning from master

2016-01-25 Thread Gilles Gouaillardet
Thanks Paul, it seems a "git add" was missed in the upstream pmix repo, i will make a PR for that Cheers, Gilles On 1/26/2016 9:50 AM, Paul Hargrove wrote: Using last night's master tarball I am seeing the following at configure time: [path-to]/openmpi-dev-3397-g70787d1/opal/mca/pmix/pmix12

Re: [OMPI devel] Benchmark with multiple orteds

2016-01-25 Thread Gilles Gouaillardet
enough to take an optimized path when > doing a loopback as opposed to inter-node communication. > > > On Mon, Jan 25, 2016 at 4:28 AM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com > > wrote: > >> Federico, >> >> I did not expect 0% degradation,

Re: [OMPI devel] Benchmark with multiple orteds

2016-01-25 Thread Gilles Gouaillardet
MPI_Alltoall, MPI_Gather, MPI_Scatter, MPI_Scan, MPI_Send/Recv > > Cheers, > Federico > __ > Federico Reghenzani > M.Eng. Student @ Politecnico di Milano > Computer Science and Engineering > > > > 2016-01-25 12:17 GMT+01:00 Gilles Gouaillardet < > gilles.goua

Re: [OMPI devel] Benchmark with multiple orteds

2016-01-25 Thread Gilles Gouaillardet
Federico, unless you already took care of that, I would guess all 16 orteds bound their children MPI tasks to socket 0. Can you try mpirun --bind-to none ... btw, is your benchmark application cpu bound ? memory bound ? MPI bound ? Cheers, Gilles On Monday, January 25, 2016, Federico Reghenzani

[OMPI devel] tm-less tm module

2016-01-24 Thread Gilles Gouaillardet
Folks, there was a question about mtt on the mtt mailing list http://www.open-mpi.org/community/lists/mtt-users/2016/01/0840.php After a few emails (some offline) it seems this was a configuration issue: the user is running PBSPro and it seems OpenMPI was not configured with the tm module (e.

[OMPI devel] file descriptor leak in master

2016-01-14 Thread Gilles Gouaillardet
Ralph, i noticed a file descriptor leak with current master. It can easily be reproduced with the loop_spawn test from the ibm/dynamic test suite: mpirun -np 1 ./loop_spawn. After a few seconds, you can see the leak via lsof -p $(pidof mpirun); there are a bunch of files such as mpirun 20791

Re: [OMPI devel] Compilation error on master

2016-01-09 Thread Gilles Gouaillardet
This is now fixed in master. Thanks for the report ! Gilles On Saturday, January 9, 2016, Shamis, Pavel wrote: > Hey Folks > > OpenMPI master appears to be broken for a non-debug build: > --- > make[2]: Entering directory `ompi/build/opal' > CC runtime/opal_progress.lo > ../../opal/runt

Re: [OMPI devel] OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-3330-g213b2ab

2016-01-07 Thread Gilles Gouaillardet
; >> Those revisions listed above that are new to this repository have >> not appeared on any other notification email; so we list those >> revisions in full, below. >> >> - Log - >> https://github.co

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-3330-g213b2ab

2016-01-06 Thread Gilles Gouaillardet
i did forget that indeed ... and i just pushed it Cheers, Gilles On 1/7/2016 12:33 AM, Ralph Castain wrote: Hmmm…I don’t see a second commit message anywhere. Did you perhaps forget to push it? Thanks for the explanation! Ralph On Jan 6, 2016, at 2:30 AM, Gilles Gouaillardet

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-3330-g213b2ab

2016-01-06 Thread Gilles Gouaillardet
https://github.com/open-mpi/ompi/commit/213b2abde47cf02ba3152a301d3ec0ffeec54438 > > > > commit 213b2abde47cf02ba3152a301d3ec0ffeec54438 > > Author: Gilles Gouaillardet > > > Date: Wed Jan 6 16:21:13 2016 +0900 > > > >dpm: correctly handle procs_cutoff in

Re: [OMPI devel] Problem using --with-hcoll on Mellanox DMZ cluster

2015-12-30 Thread Gilles Gouaillardet
Paul, generally speaking, when using mellanox stuff (mxm, hcoll, fca) these libraries must be accessible, either via LD_LIBRARY_PATH or via ld.so.conf. I do not know the config of these clusters, but you might have to use the mellanox libraries that could be in a non-standard location. Cheers, Gilles

Re: [OMPI devel] PMIX on 2.0.0rc1 and cygwin build

2015-12-24 Thread Gilles Gouaillardet
Marco, If I understand correctly, pmix is mandatory, regardless of whether you run on a laptop or an exascale system. Cheers, Gilles On Thursday, December 24, 2015, Marco Atzeri wrote: > On 24/12/2015 06:10, Gilles Gouaillardet wrote: > >> Marco, >> >> Thanks for the patch,

Re: [OMPI devel] PMIX on 2.0.0rc1 and cygwin build

2015-12-24 Thread Gilles Gouaillardet
Marco, Thanks for the patch, i will apply the changes related to the missing include files to master and PR to v2.x. On linux, libpmix.so does not depend on libopen-pal. That being said, libpmix.so has undefined symbols related to hwloc and libevent, and these symbols are defined in libopen-pa

Re: [OMPI devel] v1.10: mpirun --host xxx behaviour

2015-12-23 Thread Gilles Gouaillardet
and 2.x branch all work that way too. On Dec 22, 2015, at 12:49 AM, Gilles Gouaillardet wrote: Ralph, i (re)discovered an old and odd behaviour in v1.10, which was discussed in https://github.com/open-mpi/ompi-release/pull/664 when running mpirun --host xxx ... mpirun v1.10 assumes one slo

Re: [OMPI devel] v1.10: mpirun --host xxx behaviour

2015-12-22 Thread Gilles Gouaillardet
at the master and 2.x branch all work that way too. > > > > On Dec 22, 2015, at 12:49 AM, Gilles Gouaillardet > wrote: > > > > Ralph, > > > > i (re)discovered an old and odd behaviour in v1.10, which was discussed > in https://github.com/open-mpi/ompi-release

[OMPI devel] v1.10: mpirun --host xxx behaviour

2015-12-22 Thread Gilles Gouaillardet
Ralph, i (re)discovered an old and odd behaviour in v1.10, which was discussed in https://github.com/open-mpi/ompi-release/pull/664 When running mpirun --host xxx ... mpirun v1.10 assumes one slot per host. consequently, on my vm with 4 cores, mpirun -np 2 ./helloworld_mpi works fine but mpiru

Re: [OMPI devel] [1.10.2rc1] alloc link failure on Solaris

2015-12-21 Thread Gilles Gouaillardet
Thanks Paul ! i will review this and make the PRs Cheers, Gilles On 12/20/2015 9:44 AM, Paul Hargrove wrote: On my Solaris 11.2 system, alloca() is a macro defined in alloca.h. So, the following is needed to avoid link failures: --- ompi/mca/pml/cm/pml_cm.h~ Sat Dec 19 16:25:54 2015 +++ om

Re: [OMPI devel] Open MPI v1.10.2rc1 available

2015-12-21 Thread Gilles Gouaillardet
, Gilles On 12/22/2015 7:38 AM, Paul Hargrove wrote: Gilles, It looked to me like PR 857 includes this fix. Are you saying you are going to spilt if off from that one (to speed up the review)? -Paul On Mon, Dec 21, 2015 at 2:26 PM, Gilles Gouaillardet mailto:gilles.gouaillar...@gmail.com

Re: [OMPI devel] Open MPI v1.10.2rc1 available

2015-12-21 Thread Gilles Gouaillardet
Paul and Orion, the fix has been merged into v1.10. I will issue a separate PR for v2.x since this issue is impacting quite a lot of openmpi users. Sorry for the inconvenience, Gilles On Tuesday, December 22, 2015, Paul Hargrove wrote: > Orion, > > The FCFLAGS_save issue was been fixed in mast

Re: [OMPI devel] vader and mmap_shmem module cleanup problem

2015-12-16 Thread Gilles Gouaillardet
ors are still shared. > > BR Justin > > On 15. 12. 2015 14:55, Gilles Gouaillardet wrote: > > Justin, > > at first glance, vader should be symmetric (e.g. > call opal_shmem_segment_dettach() instead of munmap() > Nathan, can you please comment ? > > using tid instea

Re: [OMPI devel] OMPI devel] vader and mmap_shmem module cleanup problem

2015-12-16 Thread Gilles Gouaillardet
>were really small and elegant). So while there are no real processes, new >binary / ELF file is loaded at different address then the rest of OS - so it >has separate global variables, and separate environ too. Other resources like >file descriptors are still shared. > >BR Just

Re: [OMPI devel] vader and mmap_shmem module cleanup problem

2015-12-15 Thread Gilles Gouaillardet
Justin, at first glance, vader should be symmetric (e.g. call opal_shmem_segment_dettach() instead of munmap()). Nathan, can you please comment ? using tid instead of pid should also do the trick. That being said, a more elegant approach would be to create a new module in the shmem framework basica

Re: [OMPI devel] RFC: remove internal copies of libevent and hwloc

2015-12-10 Thread Gilles Gouaillardet
Ralph, iirc, we are using a slightly patched version of libevent, is this correct ? I guess removing the internal versions is the way to go, that being said, could/should we do this one step at a time ? I mean a first step could be to update the configure default option to configure --with-hwloc=

Re: [OMPI devel] Add an orte tool

2015-12-09 Thread Gilles Gouaillardet
Federico, you also need to update orte/tools/Makefile.am Cheers, Gilles On Wednesday, December 9, 2015, Federico Reghenzani < federico1.reghenz...@mail.polimi.it> wrote: > Hi! > > I'm trying to add a new tool under /orte/tools/, I've followed as example > the orte-ps and created my orted-resto

Re: [OMPI devel] Problem with the 1.8.8 version

2015-12-09 Thread Gilles Gouaillardet
Folks, as discussed off-list, and for the records https://github.com/open-mpi/ompi-release/commit/8d658a734f352dfa104d1794330f44e3c52c4a76 must also be applied in order to fix v1.8 Cheers, Gilles On Wed, Dec 9, 2015 at 2:08 AM, Baldassari Caroline wrote: > Gilles, Chris, > > Thank you for yo

[OMPI devel] issue with MPI_DISPLACEMENT_CURRENT and use mpi

2015-12-03 Thread Gilles Gouaillardet
Jeff, the following program does not compile : $ mpifort -c mpi_displacement_current_usempi.f90 mpi_displacement_current_usempi.f90:6:64: & ,MPI_DATATYPE_NULL , "native", MPI_INFO_NULL, ierr ) 1 Error: There is no specifi

Re: [OMPI devel] RFC: Jenkins testing

2015-11-25 Thread Gilles Gouaillardet
However, what Howard is doing helps resolve > it by breaking out the Jenkins runs into categories. So instead of one > massive test session, setup one Jenkins server for each category. Then we > can set the specific tags according to the test category. > > Make sense? > Ralph > >

Re: [OMPI devel] RFC: Jenkins testing

2015-11-25 Thread Gilles Gouaillardet
Ralph and all, My 0.02 US$: we are kind of limited by the github API https://developer.github.com/v3/repos/statuses/ Basically, a status is pending, success, error or failure, plus a string. A possible workaround is to have Jenkins set labels on the PR. If only valgrind fails, the status could be

Re: [OMPI devel] ompi_win_create hangs on a non uniform cluster

2015-11-14 Thread Gilles Gouaillardet
Thanks, > > Howard > > Sent from my iPhone > > > On 10.11.2015 at 19:57, Gilles Gouaillardet > wrote: > > > > Nathan, > > > > a simple MPI_Win_create test hangs on my non uniform cluster (ibm/onesided/c_create) > > > > one node ha

[OMPI devel] ompi_win_create hangs on a non uniform cluster

2015-11-10 Thread Gilles Gouaillardet
Nathan, a simple MPI_Win_create test hangs on my non uniform cluster (ibm/onesided/c_create): one node has an IB card but the other one does not. the node with the IB card selects the rdma osc module, but the other node selects the pt2pt module, and then it hangs because both ends do not try to initi

Re: [OMPI devel] Open MPI autogen.pl in tarball

2015-10-30 Thread Gilles Gouaillardet
Jeff, OK, will do Cheers, Gilles On Saturday, October 31, 2015, Jeff Squyres (jsquyres) wrote: > On Oct 30, 2015, at 12:09 PM, Barrett, Brian > wrote: > > > > However, I do like Gilles' suggestion to make autogen.pl be a little > smarter. If I recall correctly (and it's been a couple years

Re: [OMPI devel] Fwd: [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2921-gb603307

2015-10-28 Thread Gilles Gouaillardet
hat the pmix_server.h includes pmix/pmix_common.h and not pmix_common.h. If you want to figure this one out, that's a good starting point. Btw, why do we have 3 headers with the same name (it's so confusing) ? George. On Wed, Oct 28, 2015 at 1:08 AM, Gilles Gouaillardet mailto:gil...@rist.or.jp

Re: [OMPI devel] Fwd: [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2921-gb603307

2015-10-28 Thread Gilles Gouaillardet
wonder how your compiler gets to know the definition of the PMIX_ERR_SILENT without the pmix_common.h. I just pushed a fix. George. On Wed, Oct 28, 2015 at 12:43 AM, Gilles Gouaillardet mailto:gil...@rist.or.jp>> wrote: George, i am unable to reproduce the issue. if build

Re: [OMPI devel] Fwd: [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2921-gb603307

2015-10-28 Thread Gilles Gouaillardet
George, i am unable to reproduce the issue. if the build still breaks for you, could you send me your configure command line ? Cheers, Gilles On 10/28/2015 1:04 PM, Gilles Gouaillardet wrote: George, PMIX_ERR_SILENT is defined in opal/mca/pmix/pmix1xx/pmix/include/pmix/pmix_common.h.in i

Re: [OMPI devel] Fwd: [OMPI commits] Git: open-mpi/ompi branch master updated. dev-2921-gb603307

2015-10-28 Thread Gilles Gouaillardet
George, PMIX_ERR_SILENT is defined in opal/mca/pmix/pmix1xx/pmix/include/pmix/pmix_common.h.in i'll have a look at it now Cheers, Gilles On 10/28/2015 12:02 PM, George Bosilca wrote: We get a nice compiler complaint: ../../../../../../ompi/opal/mca/pmix/pmix1xx/pmix/src/server/pmix_s

Re: [OMPI devel] master build fails

2015-10-27 Thread Gilles Gouaillardet
FWIW, before Jeff fixed that, the build was successful on my RHEL7 box (stdio.h is included from verbs_exp.h, which is included from verbs.h) but failed on my RHEL6 box (verbs.h does *not* include stdio.h), so there was some room for Jenkins not to fail Cheers, Gilles On 10/27/2015 9:17 PM, Jeff Squy

Re: [OMPI devel] Open MPI autogen.pl in tarball

2015-10-27 Thread Gilles Gouaillardet
Jeff and all, my 0.02 US$ ... - autogen.pl was recently used with v1.10 on a PowerPC Little Endian arch (that was mandatory since the libtool we use to generate the v1.10 series does not yet support PPC LE) - if we remove (from the tarball) autogen.pl, should we also remove configure.ac ? and w

Re: [OMPI devel] Compile only one framework/component

2015-10-27 Thread Gilles Gouaillardet
Federico, in order to build one component, just cd into the component directory (in the build directory if you are using VPATH) and run make (install). Components and frameworks depend on other frameworks, so it is generally safer to run make from the top build directory Cheers, Gilles On Tuesday, O

Re: [OMPI devel] OMPI devel] Checkpoint/restart + migration

2015-10-26 Thread Gilles Gouaillardet
? Cheers, Federico __ Federico Reghenzani M.Eng. Student @ Politecnico di Milano Computer Science and Engineering 2015-10-23 11:45 GMT+02:00 Gilles Gouaillardet mailto:gilles.gouaillar...@gmail.com>>: Gianmario, Iirc, there is one pipe between orted and each child's stderr.

Re: [OMPI devel] OMPI devel] mtt-submit, etc.

2015-10-23 Thread Gilles Gouaillardet
George, Then you cannot use https, otherwise the certificate check will fail. Note that if you have a proxy, you can tunnel to the proxy and that should be fine. The main drawback is that the ssh connection must be active when contacting IU, and if a batch manager is used, no one knows when that will be needed.

Re: [OMPI devel] OMPI devel] Checkpoint/restart + migration

2015-10-23 Thread Gilles Gouaillardet
al ip, you could reuse existing infrastructure at least to migrate orted and its tcp/ip connections Cheers, Gilles Federico Reghenzani wrote: >Hi Adrian and Gilles, > > >first of all thank you for your responses. I'm working with Gianmario on this >ambitious project. > >

Re: [OMPI devel] mtt-submit, etc.

2015-10-23 Thread Gilles Gouaillardet
Howard, that has already been raised in http://www.open-mpi.org/community/lists/mtt-users/2014/10/0820.php At the end, Christoph claimed he could achieve that with mtt-relay (but provided no detail on how ...) You might want to check the full thread and/or ask Christoph directly. Ralph, IIRC

Re: [OMPI devel] 1.10.1 overnight failures - Fortran

2015-10-22 Thread Gilles Gouaillardet
Ralph, i made PR #711 https://github.com/open-mpi/ompi-release/pull/711 to fix this issue Cheers, Gilles On 10/23/2015 7:39 AM, Gilles Gouaillardet wrote: Ralph, these are MPI 3 functions that did not land yet into the v1.10 series. only MPI_Aint arithmetic functions landed into v1.10 so

Re: [OMPI devel] 1.10.1 overnight failures - Fortran

2015-10-22 Thread Gilles Gouaillardet
Ralph, these are MPI-3 functions that have not yet landed in the v1.10 series. Only the MPI_Aint arithmetic functions landed in v1.10, so it seems configure is confused (e.g. this test was previously not built, and now it is ...) I'll try to backport the missing functions Cheers, Gilles On Friday

Re: [OMPI devel] Checkpoint/restart + migration

2015-10-22 Thread Gilles Gouaillardet
Gianmario, there was c/r support in the v1.6 series but it has been removed. The current trend is to do application-level checkpointing (much more efficient, with a much smaller checkpoint file size). iirc, ompi took care of closing/restoring all communication, and a third-party checkpoint was require

Re: [OMPI devel] Specifying networks/APIs for OMPI (was: topic for agenda)

2015-10-21 Thread Gilles Gouaillardet
Scott and all, two btls are optimized (and work only) for intra-node communications: sm and vader. By "sm" I am not sure whether you mean the sm btl, or either or both of the sm and vader btls. From a user's point of view, and to disambiguate this, maybe we should use the term "shm" (which means sm and/or vader btl for

Re: [OMPI devel] Problem running openmpi on nodes connected via eth

2015-10-21 Thread Gilles Gouaillardet
Andrej, a load average of 700 is very curious. i guess you already made sure the load average is zero when the system is idle ... are you running a hybrid app (e.g. MPI + OpenMP) ? one possible explanation is that you run 48 MPI tasks and each task has 48 OpenMP threads, and that kills performance. wh

Re: [OMPI devel] Problem running openmpi on nodes connected via eth

2015-10-20 Thread Gilles Gouaillardet
Andrej, by "running on the head node", shall i understand you mean "running mpirun command *and* all mpi tasks on the head node" ? by "running on the compute node", shall i understand you mean "running mpirun on the compute node *and* all mpi tasks on the *same* compute node" ? or do you mean

Re: [OMPI devel] Issue with OpenMPI 1.8.4 + Xcode 7.0.1

2015-10-20 Thread Gilles Gouaillardet
>> wrote: >> >> Hi Gilles, >> >> as for that, recompiling OpenMPI works, but causes no change here. >> >> -Tobias >> >> -- >> Dr.-Ing. Tobias Hilbrich >> Research Assistant >> >> Technische Universitaet Dresden, Germany &

Re: [OMPI devel] Issue with OpenMPI 1.8.4 + Xcode 7.0.1

2015-10-20 Thread Gilles Gouaillardet
Tobias, Btw, did you recompile ompi with this xcode ? Iirc, we do similar comparisons in ompi itself Cheers, Gilles Tobias Hilbrich wrote: >Hi all, > >a wonderful puzzle for the OSX folks in your team (Reproducer attached): > >Attached source file builds with Xcode 7.0.0, but fails since the r

Re: [OMPI devel] Issue with OpenMPI 1.8.4 + Xcode 7.0.1

2015-10-20 Thread Gilles Gouaillardet
Tobias, Fwiw, MPI_Comm_compare can be used to compare communicators. Hopefully, this is also compiler friendly. Cheers, Gilles On Tuesday, October 20, 2015, Tobias Hilbrich wrote: > Hi all, > > a wonderful puzzle for the OSX folks in your team (Reproducer attached): > > Attached source file b
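
For illustration, a small sketch of the suggestion: MPI_Comm_compare avoids comparing the opaque communicator handles directly.

    #include <mpi.h>

    /* returns non-zero when a and b are the same communicator object */
    static int same_comm(MPI_Comm a, MPI_Comm b)
    {
        int result;
        MPI_Comm_compare(a, b, &result);
        return result == MPI_IDENT;
    }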

Re: [OMPI devel] how to run OpenMPI in OSv container

2015-10-16 Thread Gilles Gouaillardet
amatic mechanism to launch a process in your containers? I.e., can > mpirun programmatically launch MPI processes in OSv containers? > > > > > On Oct 16, 2015, at 6:48 AM, Justin Cinkelj > wrote: > > > > Thank you. At least its clear now that for the immediate pr

Re: [OMPI devel] how to run OpenMPI in OSv container

2015-10-15 Thread Gilles Gouaillardet
Justin, IOF stands for Input/Output (aka I/O) Forwarding. Here is a very high-level overview of a quite simple case: on host A, you run mpirun -host B,C -np 2 a.out without any batch manager and with the TCP interconnect. First, mpirun will fork&exec ssh B orted ... ssh C orted ... the orted daemons will

Re: [OMPI devel] [OMPI users] fatal error: openmpi-v2.x-dev-415-g5c9b192 andopenmpi-dev-2696-gd579a07

2015-10-14 Thread Gilles Gouaillardet
one (this is the one used) is set by opal/mca/pmix/pmix1xx/configure.m4) Cheers, Gilles On 10/14/2015 3:37 PM, Gilles Gouaillardet wrote: Folks, i was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my configure command line. here is what happens : opal/mca/pmix/pmix1xx/configure.m4

Re: [OMPI devel] [OMPI users] fatal error: openmpi-v2.x-dev-415-g5c9b192 andopenmpi-dev-2696-gd579a07

2015-10-14 Thread Gilles Gouaillardet
Folks, i was able to reproduce the issue by adding CPPFLAGS=-I/tmp to my configure command line. here is what happens: opal/mca/pmix/pmix1xx/configure.m4 sets the CPPFLAGS environment variable with -I/tmp and the include paths for hwloc and libevent, then opal/mca/pmix/pmix1xx/pmix/configure is invoked

Re: [OMPI devel] trace the openmpi internal function calls in MPI user program

2015-10-06 Thread Gilles Gouaillardet
be great if you can point out some main OMPI files and functions that are involved in the process. You might want to step through the selection process with a debugger to see what happens. Set a breakpoint on mca_coll_base_comm_select() and step through from there. > Dahai > > &g

Re: [OMPI devel] trace the openmpi internal function calls in MPI user program

2015-10-06 Thread Gilles Gouaillardet
at first, you can check the priorities of the various coll modules with ompi_info $ ompi_info --all | grep \"coll_ | grep priority MCA coll: parameter "coll_basic_priority" (current value: "10", data source: default, level: 9 dev/all, type: int) MCA coll: parameter

Re: [OMPI devel] Access to old users@ and devel@ Open MPI mails

2015-10-03 Thread Gilles Gouaillardet
Jeff, the minor distinction includes the fact that the web archive does not include email addresses, but the mbox does. I am fine with handing them the mbox, with a note asking them not to redistribute it, and keeping it in a secure place because no one likes being spammed. Cheers, Gilles On Saturday

Re: [OMPI devel] OMPI devel] PMIX vs Solaris

2015-09-29 Thread Gilles Gouaillardet
Paul, the latest master nightly snapshot does include the fix, and i made PRs for v2.x and v1.10 Cheers, Gilles On 9/28/2015 6:29 PM, Gilles Gouaillardet wrote: Thanks Brice, I will do the PR for the various ompi branches from tomorrow Cheers, Gilles Brice Goglin wrote: Sorry, I

Re: [OMPI devel] OMPI devel] PMIX vs Solaris

2015-09-28 Thread Gilles Gouaillardet
il client to avoid missing hwloc-related things >among OMPI mails. > >Brice > > > > >On 28/09/2015 06:23, Gilles Gouaillardet wrote : > >Paul and Brice, > >the error message is displayed by libpciaccess when hwloc invokes >pci_system_init > >on Solaris

Re: [OMPI devel] PMIX vs Solaris

2015-09-28 Thread Gilles Gouaillardet
what the doctor ordered! On Sep 23, 2015, at 5:45 PM, Gilles Gouaillardet mailto:gil...@rist.or.jp>> wrote: Ralph, the root cause is that getsockopt(..., SOL_SOCKET, SO_RCVTIMEO, ...) fails with errno ENOPROTOOPT on solaris 11.2. the attached patch is a proof of conc
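
A standalone probe for the behaviour described above (hedged: it queries the same option the same way, but whether it reproduces the ENOPROTOOPT failure depends on the platform):

    #include <stdio.h>
    #include <string.h>
    #include <errno.h>
    #include <sys/socket.h>
    #include <sys/time.h>

    int main(void)
    {
        struct timeval tv;
        socklen_t len = sizeof(tv);
        int fd = socket(AF_UNIX, SOCK_STREAM, 0);

        if (fd >= 0 && getsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, &len) != 0) {
            /* on Solaris 11.2 this reportedly fails with ENOPROTOOPT */
            printf("getsockopt(SO_RCVTIMEO) failed: %s\n", strerror(errno));
        }
        return 0;
    }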

Re: [OMPI devel] PMIX vs Solaris

2015-09-23 Thread Gilles Gouaillardet
mpirun a level of 10 for the pmix_base_verbose param? This output isn’t what I would have expected from that level - it looks more like the verbosity was set to 5, and so the error number isn’t printed. Thanks Ralph
