Re: [OMPI users] mpirun works with cmd line call , but not with app context file arg

2016-10-16 Thread Gilles Gouaillardet
Out of curiosity, why do you specify both --hostfile and -H ? Do you observe the same behavior without --hostfile ~/.mpihosts ? Also, do you have at least 4 cores on both A.lan and B.lan ? Cheers, Gilles On Sunday, October 16, 2016, MM wrote: > Hi, > > openmpi 1.10.3 >

Re: [OMPI users] communications groups

2016-10-17 Thread Gilles Gouaillardet
n-mpi.org >] *On > Behalf Of *Gilles Gouaillardet > *Sent:* Monday, October 17, 2016 9:30 AM > *To:* Open MPI Users > *Subject:* Re: [OMPI users] communications groups > > > > Rick, > > > > I r

Re: [OMPI users] communications groups

2016-10-17 Thread Gilles Gouaillardet
Rick, In my understanding, sensorgroup is a group with only task 1 Consequently, sensorComm is - similar to MPI_COMM_SELF on task 1 - MPI_COMM_NULL on other tasks, and hence the barrier fails I suggest you double check sensorgroup is never MPI_GROUP_EMPTY and add a test not to call MPI_Barrier

Re: [OMPI users] communications groups

2016-10-17 Thread Gilles Gouaillardet
Rick, I re-read the MPI standard and was unable to figure out if sensorgroup is MPI_GROUP_EMPTY or a group with task 1 on tasks except task 1 (A group that does not contain the current task makes little sense to me, but I do not see any reason why this group has to be MPI_GROUP_EMPTY)

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-08 Thread Gilles Gouaillardet
20x/signals > > Unfortunately it changes nothing. The root rank stops and all other > ranks (and mpirun) just stay, the remaining ranks at 100 % CPU waiting > apparently in that allreduce. The stack trace looks a bit more > interesting (git is always debug build ?), so I include it at t

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-09 Thread Gilles Gouaillardet
Folks, the problem is indeed pretty trivial to reproduce; i opened https://github.com/open-mpi/ompi/issues/2550 (and included a reproducer) Cheers, Gilles On Fri, Dec 9, 2016 at 5:15 AM, Noam Bernstein <noam.bernst...@nrl.navy.mil> wrote: > On Dec 8, 2016, at 6:05 AM, Gilles Gou

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-07 Thread Gilles Gouaillardet
Christoph, can you please try again with mpirun --mca btl tcp,self --mca pml ob1 ... that will help figure out whether pml/cm and/or mtl/psm2 is involved or not. if that causes a crash, then can you please try mpirun --mca btl tcp,self --mca pml ob1 --mca coll ^tuned ... that will help
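A minimal sketch of those two debug runs, assuming a 4-task job and a placeholder application name:
    mpirun -np 4 --mca btl tcp,self --mca pml ob1 ./my_app
    # if that still crashes, also take coll/tuned out of the picture:
    mpirun -np 4 --mca btl tcp,self --mca pml ob1 --mca coll ^tuned ./my_app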

Re: [OMPI users] epoll add error with OpenMPI 2.0.1 and SGE

2016-12-17 Thread Gilles Gouaillardet
Dave, thanks for the info for what it's worth, it is generally a bad idea to configure with --with-xxx=/usr since you might inadvertently pick up some other external components. in your case, --with-libevent=external is what you need if you want to use an external libevent library installed in /usr i guess the
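A hedged configure sketch for the external-libevent case described above (the install prefix is a placeholder):
    ./configure --with-libevent=external --prefix=/opt/openmpi-2.0.1
    make -j4 && make install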

Re: [OMPI users] Abort/ Deadlock issue in allreduce

2016-12-11 Thread Gilles Gouaillardet
estigating this ! Cheers Christof On Thu, Dec 08, 2016 at 03:15:47PM -0500, Noam Bernstein wrote: On Dec 8, 2016, at 6:05 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: Christof, There is something really odd with this stack trace. count is zero, and some pointers do not point to

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
Siegmar, I was able to reproduce the issue on my vm (No need for a real heterogeneous cluster here) I will keep digging tomorrow. Note that if you specify an incorrect slot list, MPI_Comm_spawn fails with a very unfriendly error message. Right now, the 4th spawned task crashes, so this is a

Re: [OMPI users] how to specify OSHMEM component from Mellanox in configure

2017-01-11 Thread Gilles Gouaillardet
Juan, Open MPI has its own implementation of OpenSHMEM. The Mellanox software is very likely yet another implementation of OpenSHMEM. So you can consider these as independent libraries Cheers, Gilles On Wednesday, January 11, 2017, Juan A. Cordero Varelaq < bioinformatica-i...@us.es> wrote:

Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
s? > > > Kind regards > > Siegmar > > Am 11.01.2017 um 10:04 schrieb Gilles Gouaillardet: > >> Siegmar, >> >> I was able to reproduce the issue on my vm >> (No need for a real heterogeneous cluster here) >> >> I will keep digging tomorrow. >

Re: [OMPI users] mca_based_component warnings when running simple example code

2017-01-06 Thread Gilles Gouaillardet
Hi, it looks like you installed Open MPI 2.0.1 at the same location as the previous Open MPI 1.10, but you did not uninstall v1.10. the faulty modules have very likely been removed in 2.0.1, hence the error. you can simply remove the openmpi plugins directory and reinstall openmpi rm -rf
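A sketch of that cleanup, assuming both versions were installed under the same (placeholder) prefix:
    PREFIX=/usr/local                  # adjust to your actual install prefix
    rm -rf $PREFIX/lib/openmpi         # remove the stale 1.10 plugins
    cd openmpi-2.0.1 && make install   # reinstall 2.0.1 into the same prefix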

Re: [OMPI users] mpirun with ssh tunneling

2016-12-25 Thread Gilles Gouaillardet
Adam, there are several things here with an up-to-date master, you can specify an alternate ssh port via a hostfile see https://github.com/open-mpi/ompi/issues/2224 Open MPI requires more than just ssh. - remote nodes (orted) need to call back mpirun (oob/tcp) - nodes (MPI tasks) need

Re: [OMPI users] "Warning :: opal_list_remove_item" with openmpi-2.1.0rc4

2017-03-22 Thread Gilles Gouaillardet
Roland, the easiest way is to use an external hwloc that is configured with --disable-nvml another option is to hack the embedded hwloc configure.m4 and pass --disable-nvml to the embedded hwloc configure. note this requires you run autogen.sh and you hence need recent autotools. i guess Open
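A sketch of the external-hwloc route; versions and paths are placeholders:
    # build an external hwloc without nvml support
    cd hwloc-1.11.x && ./configure --disable-nvml --prefix=/opt/hwloc && make install
    # then point Open MPI at it
    cd openmpi-2.1.0 && ./configure --with-hwloc=/opt/hwloc --prefix=/opt/openmpi-2.1.0 && make install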

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-23 Thread Gilles Gouaillardet
Can you please try mpirun --mca btl tcp,self ... And if it works mpirun --mca btl openib,self ... Then can you try mpirun --mca coll ^tuned --mca btl tcp,self ... That will help figure out whether the error is in the pml or the coll framework/module Cheers, Gilles On Thursday, March 23,

Re: [OMPI users] Help with Open MPI 2.1.0 and PGI 16.10: Configure and C++

2017-03-23 Thread Gilles Gouaillardet
Matt, a C++ compiler is required to configure Open MPI. That being said, the C++ compiler is only used if you build the C++ bindings (which were removed in MPI-3). And unless you plan to use the mpic++ wrapper (with or without the C++ bindings), a valid C++ compiler is not required at all. /*

Re: [OMPI users] migrating to the MPI_F08 module

2017-03-22 Thread Gilles Gouaillardet
Tom, what if you use type(mpi_datatype) :: mpiint Cheers, Gilles On Thursday, March 23, 2017, Tom Rosmond wrote: > > Hello; > > I am converting some fortran 90/95 programs from the 'mpif.h' include file > to the 'mpi_f08' model and have encountered a problem. Here is a

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-29 Thread Gilles Gouaillardet
Hi, yes, please open an issue on github, and post your configure and mpirun command lines. ideally, could you try the latest v1.10.6 and v2.1.0 ? if you can reproduce the issue with a smaller number of MPI tasks, that would be great too Cheers, Gilles On 3/28/2017 11:19 PM, Götz

Re: [OMPI users] Failed to create a queue pair (QP) error

2017-03-25 Thread Gilles Gouaillardet
Iirc, there used to be a bug in Open MPI leading to such a false positive, but I cannot remember the details. I recommend you use at least the latest 1.10 (which is really a 1.8 + a few more features and several bug fixes) Another option is to simply +1 an mtt parameter and see if it helps

Re: [OMPI users] tuning sm/vader for large messages

2017-03-20 Thread Gilles Gouaillardet
Joshua, George previously explained you are limited by the size of your level X cache. that means that you might get optimal performance for a given message size, let's say when everything fits in the L2 cache. when you increase the message size, L2 cache is too small, and you have to

Re: [OMPI users] How to specify the use of RDMA?

2017-03-20 Thread Gilles Gouaillardet
hosts_eth ... (With IB interfaces down) mpirun --mca btl openib,self,sm -hostfile hosts_ib0 ... Regards, Rodrigo On Mon, Mar 20, 2017 at 8:29 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> wrote: You will get similar results with

Re: [OMPI users] How to specify the use of RDMA?

2017-03-20 Thread Gilles Gouaillardet
You will get similar results with hosts_ib and hosts_eth If you want to use tcp over ethernet, you have to mpirun --mca btl tcp,self,sm --mca btl_tcp_if_include eth0 ... If you want to use tcp over ib, then mpirun --mca btl tcp,self,sm --mca btl_tcp_if_include ib0 ... Keep in mind that IMB calls

Re: [OMPI users] Install openmpi.2.0.2 with certain option

2017-04-04 Thread Gilles Gouaillardet
Note that might not be enough if hwloc detects nvml. unfortunately, there are only workarounds available for this : 1) edit opal/mca/hwloc/hwloc*/configure.m4 and add enable_nvml=no for example after enable_xml=yes note you need recent autotools, and re-run autogen.pl --force 2) build Open

Re: [OMPI users] Compiler error with PGI: pgcc-Error-Unknown switch: -pthread

2017-04-03 Thread Gilles Gouaillardet
Hi, The -pthread flag is likely pulled by libtool from the slurm libmpi.la and/or libslurm.la Workarounds are - rebuild slurm with PGI - remove the .la files (*.so and/or *.a are enough) - wrap the PGI compiler to ignore the -pthread option Hope this helps Gilles On Monday, April 3, 2017,
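A minimal sketch of the third workaround (a wrapper that hides -pthread from the PGI compiler); the wrapper name and install location are hypothetical:
    #!/bin/bash
    # pgcc-wrapper: drop the unsupported -pthread flag, then call the real pgcc
    args=()
    for a in "$@"; do
        [ "$a" = "-pthread" ] || args+=("$a")
    done
    exec pgcc "${args[@]}"
Put the wrapper in PATH ahead of pgcc (or point CC at it) when rebuilding.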

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-12 Thread Gilles Gouaillardet
That should be a two-step tango - Open MPI binds an MPI task to a socket - the OpenMP runtime binds OpenMP threads to cores (or hyper threads) inside the socket assigned by Open MPI which compiler are you using ? do you set some environment variables to direct OpenMP to bind threads ? Also, how do
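One possible combination, assuming an OpenMP runtime that honours the standard OMP_PROC_BIND / OMP_PLACES variables (application name is a placeholder):
    export OMP_PROC_BIND=close
    export OMP_PLACES=cores
    mpirun -np 2 --map-by socket --bind-to socket -x OMP_PROC_BIND -x OMP_PLACES ./hybrid_app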

Re: [OMPI users] Failed to create a queue pair (QP) error

2017-04-08 Thread Gilles Gouaillardet
What happens is that under the hood, mpirun remotely executes orted, and your remote_exec does not propagate LD_LIBRARY_PATH one option is to configure your remote_exec to do so, but I'd rather suggest you re-configure ompi with --enable-orterun-prefix-by-default If your remote_exec is ssh (if you are not running

Re: [OMPI users] Build Failed - OpenMPI 1.10.6 / Ubuntu 16.04 / Oracle Studio 12.5

2017-04-08 Thread Gilles Gouaillardet
So it seems OPAL_HAVE_POSIX_THREADS is not defined, and that should never happen ! Can you please compress and post (or upload into gist or similar) your - config.log - opal/include/opal_config.h Cheers, Gilles On Sunday, April 9, 2017, Travis W. Drayna wrote: > Gilles,

Re: [OMPI users] Cannot run mpirun on ubuntu 16.10

2017-04-20 Thread Gilles Gouaillardet
John, can you run free before the first command and make sure you have all the physical and available memory you expect ? then, after a failed mpirun -np 1 ./helloWorld can you run dmesg and look for messages from the OOM killer ? that would indicate you are running out of memory. maybe some
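The checks suggested above, roughly:
    free -m                                    # before the first command: is the expected memory available ?
    mpirun -np 1 ./helloWorld
    dmesg | grep -i -e oom -e "out of memory"  # after a failure: any OOM-killer activity ?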

Re: [OMPI users] fatal error for openmpi-master-201704200300-ded63c with SuSE Linux and gcc-6.3.0

2017-04-20 Thread Gilles Gouaillardet
The PR simply disables nvml in hwloc if CUDA is disabled in Open MPI. it also adds the cuda directory to CPPFLAGS, so there should be no need to manually add -I/usr/local/cuda/include to CPPFLAGS. Siegmar, could you please post your config.log also, is there a nvml.h file in

Re: [OMPI users] scheduling to real cores, not using hyperthreading (openmpi-2.1.0)

2017-04-13 Thread Gilles Gouaillardet
there any hints how to cleanly transfer the OpenMPI binding to the OpenMP tasks? Thanks and kind regards, Ado On 12.04.2017 15:40, Gilles Gouaillardet wrote: That should be a two steps tango - Open MPI bind a MPI task to a socket - the OpenMP runtime bind OpenMP threads to cores (or hyper thread

Re: [OMPI users] Run-time issues with openmpi-2.0.2 and gcc

2017-04-13 Thread Gilles Gouaillardet
Vincent, Can you try a small program such as examples/ring_c.c ? Does your app do MPI_Comm_spawn and friends ? Can you post your mpirun command line ? Are you using a batch manager ? This error message is typical of unresolved libraries. (E.g. "ssh host ldd orted" fails to resolve some libs

Re: [OMPI users] openmpi-2.0.2

2017-04-19 Thread Gilles Gouaillardet
Jim, can you please post your configure command line and test output on both systems ? fwiw, Open MPI strictly sticks to the (current) MPI standard regarding MPI_DATATYPE_NULL (see http://lists.mpi-forum.org/pipermail/mpi-forum/2016-January/006417.html) there have been some attempts

Re: [OMPI users] "No objects of the specified type were found on at least one node"

2017-03-09 Thread Gilles Gouaillardet
Angel, i suggest you get an xml topo with hwloc --of xml on both your "exotic" POWER platform and a more standard and recent one. then you can manually edit the xml topology and add the missing objects. finally, you can pass this to Open MPI like this mpirun --mca hwloc_base_topo_file
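A sketch of that workflow, assuming hwloc's lstopo tool is installed (file and application names are placeholders):
    lstopo topo.xml                                  # the .xml extension selects XML output; run on each platform
    # manually edit topo.xml to add the missing objects, then
    mpirun --mca hwloc_base_topo_file topo.xml ./my_app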

Re: [OMPI users] "No objects of the specified type were found on at least one node"?

2017-03-09 Thread Gilles Gouaillardet
which version of ompi are you running ? this error can occur on systems with no NUMA object (e.g. single socket with hwloc < 2) as a workaround, you can mpirun --map-by socket ... iirc, this has been fixed Cheers, Gilles On Thursday, March 9, 2017, Angel de Vicente wrote: >

Re: [OMPI users] "No objects of the specified type were found on at least one node"

2017-03-09 Thread Gilles Gouaillardet
Can you run lstopo in your machine, and post the output ? can you also try mpirun --map-by socket --bind-to socket ... and see if it helps ? Cheers, Gilles On Thursday, March 9, 2017, Angel de Vicente <ang...@iac.es> wrote: > Hi, > > Gilles Gouaillardet <gilles.goua

Re: [OMPI users] OpenMPI Segfault when psm is enabled?

2017-03-11 Thread Gilles Gouaillardet
PSM is the infinipath driver, so unless you have some infinipath hardware, you can safely disable it Cheers, Gilles On Sunday, March 12, 2017, Saliya Ekanayake wrote: > Hi, > > I've been trying to resolve a segfault that kept occurring with OpenMPI > Java binding. I found

Re: [OMPI users] users Digest, Vol 3729, Issue 2

2017-03-02 Thread Gilles Gouaillardet
Hi, there is likely something wrong in Open MPI (i will follow up in the devel ML) meanwhile, you can mpirun --mca opal_set_max_sys_limits core:unlimited ... Cheers, Gilles On 3/3/2017 1:01 PM, gzzh...@buaa.edu.cn wrote: Hi Jeff: Thanks for your suggestions. 1. I have

Re: [OMPI users] Using mpicc to cross-compile MPI applications

2017-03-02 Thread Gilles Gouaillardet
Graham, you can configure Open MPI with '--enable-script-wrapper-compilers' that will make wrappers as scripts instead of binaries. Cheers, Gilles On 3/3/2017 10:23 AM, Graham Holland wrote: Hello, I am using OpenMPI version 1.10.2 on an arm development board and have successfully
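A hedged configure sketch (prefix is a placeholder):
    ./configure --enable-script-wrapper-compilers --prefix=/opt/openmpi-1.10.2
    # mpicc / mpifort then become shell scripts rather than native binaries, which is
    # convenient when the build host and the target board differ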

Re: [OMPI users] Questions about integration with resource distribution systems

2017-07-31 Thread Gilles Gouaillardet
Dave, unless you are doing direct launch (for example, use 'srun' instead of 'mpirun' under SLURM), this is the way Open MPI is working : mpirun will use whatever the resource manager provides in order to spawn the remote orted (tm with PBS, qrsh with SGE, srun with SLURM, ...). then

Re: [OMPI users] -host vs -hostfile

2017-08-03 Thread Gilles Gouaillardet
Mahmood, you might want to have a look at OpenHPC (which comes with a recent Open MPI) Cheers, Gilles On Thu, Aug 3, 2017 at 9:48 PM, Mahmood Naderan wrote: > Well, it seems that the default Rocks-openmpi dominates the systems. So, at > the moment, I stick with that

Re: [OMPI users] Enforce TCP with mpirun

2017-08-16 Thread Gilles Gouaillardet
addr show eth0 > 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen > 1000 > link/ether 08:00:38:3c:4e:65 brd ff:ff:ff:ff:ff:ff > inet 172.24.44.190/23 brd 172.24.45.255 scope global eth0 > inet6 fe80::a00:38ff:fe3c:4e65/64 scope link >

Re: [OMPI users] MPI running in Unikernels

2017-08-11 Thread Gilles Gouaillardet
Keith, MPI is running on both shared memory (e.g. one single node) and distributed memory (e.g. several independent nodes). here is what happens when you mpirun -np <n> a.out 1. an orted process is remotely spawned to each node 2. mpirun and orted fork a.out unless a batch manager is used, remote

Re: [OMPI users] Q: Basic invoking of InfiniBand with OpenMPI

2017-07-13 Thread Gilles Gouaillardet
Boris, Open MPI should automatically detect the infiniband hardware, and use openib (and *not* tcp) for inter node communications and a shared memory optimized btl (e.g. sm or vader) for intra node communications. note if you "-mca btl openib,self", you tell Open MPI to use the openib
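To request that selection explicitly, something along these lines (application name is a placeholder; use sm instead of vader on older releases):
    mpirun --mca btl openib,vader,self ./my_app
    mpirun --mca btl_base_verbose 10 ./my_app   # show which btl components are actually selected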

Re: [OMPI users] weird issue with output redirection to a file when using different compilers

2017-07-13 Thread Gilles Gouaillardet
Fabricio, the fortran runtime might (or not) use buffering for I/O. as a consequence, data might be written immediately to disk, or at a later time (e.g. the file is closed, the buffer is full or the buffer is flushed) you might want to manually flush the file, or there might be an option not to

Re: [OMPI users] Q: Basic invoking of InfiniBand with OpenMPI

2017-07-17 Thread Gilles Gouaillardet
fffe5abb31 > Link layer: Ethernet > -bash-4.1$ > %%%%%% > > On Fri, Jul 14, 2017 at 12:37 AM, John Hearns via users > <users@lists.open-mpi.org> wrote: >> >> ABoris, as Gil

Re: [OMPI users] Network performance over TCP

2017-07-09 Thread Gilles Gouaillardet
Adam, at first, you need to change the default send and receive socket buffers : mpirun --mca btl_tcp_sndbuf 0 --mca btl_tcp_rcvbuf 0 ... /* note this will be the default from Open MPI 2.1.2 */ hopefully, that will be enough to greatly improve the bandwidth for large messages. generally

Re: [OMPI users] configure in version 2.1.1 doesn't use some necessary LDFLAGS

2017-07-10 Thread Gilles Gouaillardet
Hi Petr, thanks for the report. could you please configure Open MPI with the previously working command line and compress and post the generated config.log ? Cheers, Gilles On 7/11/2017 12:52 AM, Petr Hanousek wrote: Dear developers, I am using for a long time the proved configure

Re: [OMPI users] Network performance over TCP

2017-07-09 Thread Gilles Gouaillardet
on the command line. :o) -Adam On Sun, Jul 9, 2017 at 9:26 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> wrote: Adam, at first, you need to change the default send and receive socket buffers : mpirun --mca btl_tcp_sn

Re: [OMPI users] NUMA interaction with Open MPI

2017-07-16 Thread Gilles Gouaillardet
Adam, keep in mind that by default, recent Open MPI binds MPI tasks - to cores if -np 2 - to NUMA domain otherwise (which is a socket in most cases, unless you are running on a Xeon Phi) so unless you specifically asked mpirun to do a binding consistent with your needs, you might simply try to
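A few hedged examples for inspecting or overriding the default binding (task count and application name are placeholders):
    mpirun -np 8 --report-bindings ./my_app              # show where tasks actually land
    mpirun -np 8 --map-by numa --bind-to numa ./my_app   # bind explicitly to NUMA domains
    mpirun -np 8 --bind-to none ./my_app                 # leave placement to the application / numactl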

Re: [OMPI users] How to configure the message size in openMPI (over RDMA)?

2017-07-18 Thread Gilles Gouaillardet
Hi, i cannot comment for the openib specific part. the coll/tuned collective module is very likely to split messages in order to use a more efficient algorithm. another way to put it is you probably do not want to use large messages. but if this is really what you want, then one option

Re: [OMPI users] Message reception not getting pipelined with TCP

2017-07-20 Thread Gilles Gouaillardet
Sam, this example is using 8 MB size messages if you are fine with using more memory, and your application should not generate too many unexpected messages, then you can bump the eager_limit for example mpirun --mca btl_tcp_eager_limit $((8*1024*1024+128)) ... worked for me George, in

Re: [OMPI users] Message reception not getting pipelined with TCP

2017-07-21 Thread Gilles Gouaillardet
thought the progress thread would have helped here. just to be 100% sure, could you please confirm this is the intended behavior and not a bug ? Cheers, Gilles On Sat, Jul 22, 2017 at 5:00 AM, George Bosilca <bosi...@icl.utk.edu> wrote: > > > On Thu, Jul 20, 2017 at 8:57 PM, Gill

Re: [OMPI users] MPI_IN_PLACE

2017-07-27 Thread Gilles Gouaillardet
posted. Best wishes Volker On Jul 27, 2017, at 7:50 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: Thanks Jeff for your offer, i will contact you off-list later i tried a gcc+gfortran and gcc+ifort on both linux and OS X so far, only gcc+ifort on OS X is failing i will t

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
did not trigger on a more standard > platform - that would have simplified things. > > Best wishes > Volker > >> On Jul 27, 2017, at 3:56 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: >> >> Folks, >> >> >> I am able to reproduce the issue on

Re: [OMPI users] NUMA interaction with Open MPI

2017-07-27 Thread Gilles Gouaillardet
Dave, On 7/28/2017 12:54 AM, Dave Love wrote: Gilles Gouaillardet <gilles.gouaillar...@gmail.com> writes: Adam, keep in mind that by default, recent Open MPI bind MPI tasks - to cores if -np 2 - to NUMA domain otherwise Not according to ompi_info from the latest release; it says

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
ars to work. > > Hopefully, no trivial mistakes in the testcase. I just spent a few days > tracing this issue through a fairly large code, which is where the issue > originally arose (and leads to wrong numbers). > > Best wishes > Volker > > > > >> On Jul 26,

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
Volker, i was unable to reproduce this issue on linux can you please post your full configure command line, your gnu compiler version and the full test program ? also, how many mpi tasks are you running ? Cheers, Gilles On Wed, Jul 26, 2017 at 4:25 PM, Volker Blum

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
tching specific >subroutine for this generic subroutine call. [MPI_ALLREDUCE] > call MPI_ALLREDUCE(check_conventional_mpi, aux_check_success, 1, > MPI_LOGICAL, & >--------^ >compilation aborted for check_mpi_in_place_08.f90 (code 1) > >This is an interesti

Re: [OMPI users] MPI_IN_PLACE

2017-07-26 Thread Gilles Gouaillardet
ame result as with ‘include mpif.h', in that the output is > > > > * MPI_IN_PLACE does not appear to work as intended. > > * Checking whether MPI_ALLREDUCE works at all. > > * Without MPI_IN_PLACE, MPI_ALLREDUCE appears to work. > > > > Hm

Re: [OMPI users] Open MPI in a Infiniband dual-rail configuration issues

2017-07-19 Thread Gilles Gouaillardet
Ludovic, what happens here is that by default, a MPI task will only use the closest IB device. since tasks are bound to a socket, that means that tasks on socket 0 will only use mlx4_0, and tasks on socket 1 will only use mlx4_1. because these are on independent subnets, that also means that

Re: [OMPI users] PMIx + OpenMPI

2017-08-06 Thread Gilles Gouaillardet
Charles, did you build Open MPI with the external PMIx ? iirc, Open MPI 2.0.x does not support cross version PMIx Cheers, Gilles On Sun, Aug 6, 2017 at 7:59 PM, Charles A Taylor wrote: > >> On Aug 6, 2017, at 6:53 AM, Charles A Taylor wrote: >> >> >> Anyone

Re: [OMPI users] question about run-time of a small program

2017-07-31 Thread Gilles Gouaillardet
Siegmar, a noticeable difference is hello_1 does *not* sleep, whereas hello_2_slave *does*. simply comment out the sleep(...) line, and performance will be identical Cheers, Gilles On 7/31/2017 9:16 PM, Siegmar Gross wrote: Hi, I have two versions of a small program. In the first one

Re: [OMPI users] Setting LD_LIBRARY_PATH for orted

2017-08-22 Thread Gilles Gouaillardet
2, 2017 at 11:55 AM, Jackson, Gary L. > > <gary.jack...@jhuapl.edu> wrote: > >> I’m using a build of OpenMPI provided by a third party. > >> > >> -- > >> Gary Jackson, Ph.D. > >> Johns Hopkins University Applied Physics

Re: [OMPI users] MPI_Init() failure

2017-05-17 Thread Gilles Gouaillardet
Folks, for the record, this was investigated off-list - the root cause was bad permissions on the /.../lib/openmpi directory (no components could be found) - then it was found tm support was not built-in, so mpirun did not behave as expected under torque/pbs Cheers, Gilles On

Re: [OMPI users] (no subject)

2017-05-16 Thread Gilles Gouaillardet
Thanks for all the information, what i meant by mpirun --mca shmem_base_verbose 100 ... is really you modify your mpirun command line (or your torque script if applicable) and add --mca shmem_base_verbose 100 right after mpirun Cheers, Gilles On 5/16/2017 3:59 AM, Ioannis Botsis

Re: [OMPI users] mpi_scatterv problem in fortran

2017-05-15 Thread Gilles Gouaillardet
Hi, if you run this under a debugger and look at how MPI_Scatterv is invoked, you will find that - sendcounts = {1, 1, 1} - resizedtype has size 32 - recvcount*sizeof(MPI_INTEGER) = 32 on task 0, but 16 on task 1 and 2 => too much data is sent to tasks 1 and 2, hence the error. in this

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet
found no active IB device ports Hello world from rank 0 out of 1 processors So it seems to work apart the error message. 2017-05-19 9:10 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp <mailto:gil...@rist.or.jp>>: Gabriele, so it seems

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet
date choice On 19 May 2017 at 09:10, Gilles Gouaillardet <gil...@rist.or.jp <mailto:gil...@rist.or.jp>> wrote: Gabriele, so it seems pml/pami assumes there is an infiniband card available (!) i guess IBM folks will comment on that shortly.

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-18 Thread Gilles Gouaillardet
Gabriele, can you ompi_info --all | grep pml also, make sure there is nothing in your environment pointing to another Open MPI install for example ldd a.out should only point to IBM libraries Cheers, Gilles On Thursday, May 18, 2017, Gabriele Fatigati wrote: > Dear

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-05-23 Thread Gilles Gouaillardet
Tim, On 5/18/2017 2:44 PM, Tim Jim wrote: In summary, I have attempted to install OpenMPI on Ubuntu 16.04 to the following prefix: /opt/openmpi-openmpi-2.1.0. I have also manually added the following to my .bashrc: export PATH="/opt/openmpi/openmpi-2.1.0/bin:$PATH"

Re: [OMPI users] Build problem

2017-05-24 Thread Gilles Gouaillardet
Andy, it looks like some MPI libraries are being mixed in your environment from the test/datatype directory, what if you ldd .libs/lt-external32 does it resolve the libmpi.so you expect ? Cheers, Gilles On 5/25/2017 11:02 AM, Andy Riebs wrote: Hi, I'm trying to build OMPI on

Re: [OMPI users] Hello world Runtime error: Primary job terminated normally, but 1 process returned a non-zero exit code.

2017-05-22 Thread Gilles Gouaillardet
Hi, what if you mpirun -np 4 ./test Cheers, Gilles On Monday, May 22, 2017, Pranav Sumanth wrote: > Hello All, > > I'm able to successfully compile my code when I execute the make command. > However, when I run the code as: > > mpirun -np 4 test > > The error

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Gilles Gouaillardet
Allan, - on which node is mpirun invoked ? - are you running from a batch manager ? - is there any firewall running on your nodes ? - how many interfaces are part of bond0 ? the error is likely occurring when wiring up mpirun/orted what if you mpirun -np 2 --hostfile nodes --mca
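The truncated command above presumably continues with oob verbosity; a hedged sketch consistent with the follow-up later in this thread:
    mpirun -np 2 --hostfile nodes --mca oob_base_verbose 100 ./ring
    # optionally restrict the wire-up network to one subnet:
    mpirun -np 2 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 --mca oob_base_verbose 100 ./ring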

Re: [OMPI users] Problems with IPoIB and Openib

2017-05-28 Thread Gilles Gouaillardet
Allan, the "No route to host" error indicates there is something going wrong with IPoIB on your cluster (and Open MPI is not involved whatsoever in that) on sm3 and sm4, you can run /sbin/ifconfig brctl show iptables -L iptables -t nat -L we might be able to figure out what is going

Re: [OMPI users] Problems with IPoIB and Openib

2017-05-29 Thread Gilles Gouaillardet
host_file sm3-ib slots=2 sm4-ib slots=2 Will cause the command to hang. I ran your netcat test again on sm3 and sm4, [allan@sm3 proj]$ echo hello | nc 10.1.0.5 1234 [allan@sm4 ~]$ nc -l 1234 hello [allan@sm4 ~]$ Thanks, Allan On 05/29/2017 02:14 AM, Gilles Gouaillardet wrote: Allan,

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread Gilles Gouaillardet
Ralph, the issue Siegmar initially reported was loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi per what you wrote, this should be equivalent to loki hello_1 111 mpiexec -np 3 --host loki:2,exin:1 hello_1_mpi and this is what i initially wanted to double check (but i made a

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Gilles Gouaillardet
RTE update PR are committed? Perhaps Ralph has a point suggesting not to spend time with the problem if it may already be resolved. Nevertheless, I added the requested information after the commands below. Am 31.05.2017 um 04:43 schrieb Gilles Gouaillardet: Ralph, the issue Siegmar initially

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-31 Thread Gilles Gouaillardet
with Ralph. Best regards, Gilles On 5/31/2017 4:20 PM, Siegmar Gross wrote: Hi Gilles, Am 31.05.2017 um 08:38 schrieb Gilles Gouaillardet: Siegmar, the "big ORTE update" is a bunch of backports from master to v3.x btw, does the same error occurs with master ? Yes, it does, but t

Re: [OMPI users] "undefined reference to `MPI_Comm_create_group'" error message when using Open MPI 1.6.2

2017-06-08 Thread Gilles Gouaillardet
MPI_Comm_create_group was not available in Open MPI v1.6. so unless you are willing to create your own subroutine in your application, you'd rather upgrade to Open MPI v2 i recommend you configure Open MPI with --disable-dlopen --prefix= unless you plan to scale on thousands of nodes, you should
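A hedged configure sketch; the prefix is a placeholder (the original message leaves it unspecified):
    ./configure --disable-dlopen --prefix=/opt/openmpi-2.x
    make -j4 && make install
    export PATH=/opt/openmpi-2.x/bin:$PATH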

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-14 Thread Gilles Gouaillardet
Ted, fwiw, the 'master' branch has the behavior you expect. meanwhile, you can simply edit your 'dum.sh' script and replace /home/buildadina/src/aborttest02/aborttest02.exe with exec /home/buildadina/src/aborttest02/aborttest02.exe Cheers, Gilles On 6/15/2017 3:01 AM, Ted Sussman

Re: [OMPI users] Double free or corruption problem updated result

2017-06-17 Thread Gilles Gouaillardet
Ashwin, did you try to run your app with a MPICH-based library (mvapich, IntelMPI or even stock mpich) ? or did you try with Open MPI v1.10 ? the stacktrace does not indicate the double free occurs in MPI... it seems you ran valgrind vs a shell and not your binary. assuming your mpirun command
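A sketch of running valgrind on the MPI binary itself rather than on a wrapper shell (task count, options and application name are placeholders):
    mpirun -np 2 valgrind --leak-check=full --log-file=vg.%p.log ./my_app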

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-15 Thread Gilles Gouaillardet
n MPI 1.4.3, the output is >> >> After aborttest: OMPI_COMM_WORLD_RANK=0 >> >> which shows that the shell script for the process with rank 0 >> continues after the >> abort, >> but that the shell script for the process with rank

Re: [OMPI users] Double free or corruption problem updated result

2017-06-19 Thread Gilles Gouaillardet
Ashwin, the valgrind logs clearly indicate you are trying to access some memory that was already free'd for example [1,0]:==4683== Invalid read of size 4 [1,0]:==4683==at 0x795DC2: __src_input_MOD_organize_input (src_input.f90:2318) [1,0]:==4683== Address 0xb4001d0 is 0 bytes inside

Re: [OMPI users] OMPI users] Double free or corruption with OpenMPI 2.0

2017-06-13 Thread Gilles Gouaillardet
Hi Can you please post your configure command line for 2.1.1 ? On which architecture are you running? x86_64 ? Cheers, Gilles "ashwin .D" wrote: >Also when I try to build and run a make check I get these errors - Am I clear >to proceed or is my installation broken ? This

Re: [OMPI users] OMPI users] [OMPI USERS] Jumbo frames

2017-05-05 Thread Gilles Gouaillardet
Alberto, Are you saying the program hangs even without jumbo frames (aka 1500 MTU) ? At first, make sure there is no firewall running, and then you can try mpirun --mca btl tcp,vader,self --mca oob_tcp_if_include eth0 --mca btl_tcp_if_include eth0 ... (Replace eth0 with the interface name you want

Re: [OMPI users] IBM Spectrum MPI problem

2017-05-19 Thread Gilles Gouaillardet
Host: openpower Framework: pml -- 2017-05-19 7:03 GMT+02:00 Gilles Gouaillardet <gil...@rist.or.jp <mailto:gil...@rist.or.jp>>: Gabriele, pml/pami is h

Re: [OMPI users] Many different errors with ompi version 2.1.1

2017-05-19 Thread Gilles Gouaillardet
.1.0/24 --mca oob_base_verbose 100 ring &> cmd3 If I increase the number of processors in the ring program, mpirun will not succeed. mpirun -np 12 --hostfile nodes --mca oob_tcp_if_include 192.168.1.0/24 --mca oob_base_verbose 100 ring &> cmd4 On 05/19/2017 02:18 AM, Gille

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-14 Thread Gilles Gouaillardet
Peter and all, an easier option is to configure Open MPI with --enable-mpirun-prefix-by-default this will automagically add rpath to the libs. Cheers, Gilles On Thu, Sep 14, 2017 at 6:43 PM, Peter Kjellström wrote: > On Wed, 13 Sep 2017 20:13:54 +0430 > Mahmood Naderan
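A hedged sketch (version and prefix are placeholders):
    ./configure --enable-mpirun-prefix-by-default --prefix=/opt/openmpi-2.1.1
    # per the advice above, the resulting libs/wrappers should then resolve ibverbs and
    # friends without manually setting LD_LIBRARY_PATH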

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-14 Thread Gilles Gouaillardet
Mahmood, there is a typo, it should be -Wl,-rpath,/.../ (note the minus before rpath) Cheers, Gilles On Thu, Sep 14, 2017 at 6:58 PM, Mahmood Naderan wrote: >>In short, "mpicc -Wl,-rpath=/my/lib/path helloworld.c -o hello", will >>compile a dynamic binary "hello" with

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Gilles Gouaillardet
> On 09/21/2017 12:32 AM, Tim Jim wrote: >> >> Hi, >> >> I tried as you suggested: export nvml_enable=no, then reconfigured and >> ran make all install again, but mpicc is still producing the same error. >> What should I try next? >> >> Many tha

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-22 Thread Gilles Gouaillardet
nable=no" and "export enable_opencl=no"? What effects do these declarations have on the normal functioning of mpi? Many thanks. On 22 September 2017 at 15:55, Gilles Gouaillardet <gilles.gouaillar...@gmail.com <mailto:gilles.gouaillar...@gmail.com>> wrote: Was

Re: [OMPI users] Strange benchmarks at large message sizes

2017-09-21 Thread Gilles Gouaillardet
Unless you are using mxm, you can disable tcp with mpirun --mca pml ob1 --mca btl ^tcp ... coll/tuned selects an algorithm based on communicator size and message size. The spike could occur because a suboptimal (on your cluster and with your job topology) algo is selected. Note you can force
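A hedged sketch; the coll/tuned parameter names below are recalled from memory and should be verified with ompi_info --all, and the algorithm number is purely illustrative:
    mpirun --mca pml ob1 --mca btl ^tcp ./benchmark
    mpirun --mca coll_tuned_use_dynamic_rules 1 --mca coll_tuned_allreduce_algorithm 3 ./benchmark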

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-20 Thread Gilles Gouaillardet
Thanks for the report, is this related to https://github.com/open-mpi/ompi/issues/4211 ? there is a known issue when libnl-3 is installed but libnl-route-3 is not Cheers, Gilles On 9/21/2017 8:53 AM, Stephen Guzik wrote: When compiling (on Debian stretch), I see: In file included from

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-20 Thread Gilles Gouaillardet
ent, and the upcoming real fix will be able to build this component. Cheers, Gilles On 9/21/2017 9:22 AM, Gilles Gouaillardet wrote: Thanks for the report, is this related to https://github.com/open-mpi/ompi/issues/4211 ? there is a known issue when libnl-3 is installed but libnl-rout

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-21 Thread Gilles Gouaillardet
Tim, i am not familiar with CUDA, but that might help. can you please export nvml_enable=no and then re-configure and rebuild Open MPI ? i hope this will help you Cheers, Gilles On 9/21/2017 3:04 PM, Tim Jim wrote: Hello, Apologies to bring up this old thread - I finally had a

Re: [OMPI users] Libnl bug in openmpi v3.0.0?

2017-09-21 Thread Gilles Gouaillardet
Stephen, a simpler option is to install the libnl-route-3-dev package. note you will not be able to build the reachable/netlink component without this package. Cheers, Gilles On 9/21/2017 1:04 PM, Gilles Gouaillardet wrote: Stephen, this is very likely related to the issue already reported
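On Debian (the report above mentions Debian stretch), that would be something like:
    sudo apt-get install libnl-route-3-dev
    # then re-run configure so the reachable/netlink component can be built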

Re: [OMPI users] OpenMPI installation issue or mpi4py compatibility problem

2017-09-21 Thread Gilles Gouaillardet
- where should I set export nvml_enable=no? Should I reconfigure with default cuda support or keep the --without-cuda flag? Kind regards, Tim On 21 September 2017 at 15:22, Gilles Gouaillardet <gil...@rist.or.jp <mailto:gil...@rist.or.jp>> wrote: Tim, i am no

Re: [OMPI users] Question concerning compatibility of languages used with building OpenMPI and languages OpenMPI uses to build MPI binaries.

2017-09-20 Thread Gilles Gouaillardet
On Tue, Sep 19, 2017 at 11:58 AM, Jeff Hammond wrote: > Fortran is a legit problem, although if somebody builds a standalone Fortran > 2015 implementation of the MPI interface, it would be decoupled from the MPI > library compilation. Is this even doable without making

Re: [OMPI users] OpenMPI with-tm is not obeying torque

2017-10-02 Thread Gilles Gouaillardet
Anthony, in your script, can you set -x env pbsdsh hostname mpirun --display-map --display-allocation --mca ess_base_verbose 10 --mca plm_base_verbose 10 --mca ras_base_verbose 10 hostname and then compress and send the output ? Cheers, Gilles On 10/3/2017 1:19 PM, Anthony Thyssen
