Re: [OMPI users] OpenMPI + InfiniBand

2016-12-23 Thread gilles
 Serguei,

this looks like a very different issue, orted cannot be remotely started.

that typically occurs if orted cannot find some dependencies

(the Open MPI libs and/or the compiler runtime)

for example, from a node, `ssh <other node> orted` should not fail because 
of unresolved dependencies.

a simple trick is to replace

mpirun ...

with

`which mpirun` ...

a better option (as long as you do not plan to relocate Open MPI install 
dir) is to configure with

--enable-mpirun-prefix-by-default
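
for example, assuming Open MPI is installed under /opt/openmpi and node2 is 
one of your compute nodes (both are just placeholders, adjust them to your 
setup):

# orted must start on a remote node without "error while loading shared
# libraries"; it may complain about missing ORTE arguments, but not about
# unresolved dependencies
ssh node2 /opt/openmpi/bin/orted

# quick workaround: expand the full path to mpirun so the prefix is
# propagated to the remote orted
`which mpirun` -np 4 ./my_app

# or rebuild so the prefix is always propagated
./configure --prefix=/opt/openmpi --enable-mpirun-prefix-by-default
make && make install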

Cheers,

Gilles

- Original Message -

Hi All !

As there have been no positive changes with the "UDSM + IPoIB" problem 
since my previous post, we installed IPoIB on the cluster and the 
"No OpenFabrics connection..." error no longer appears.
But now OpenMPI reports another problem:

In app ERROR OUTPUT stream:

[node2:14142] [[37935,0],0] ORTE_ERROR_LOG: Data unpack had 
inadequate space in file base/plm_base_launch_support.c at line 1035

In app OUTPUT stream:


--
ORTE was unable to reliably start one or more daemons.
This usually is caused by:

* not finding the required libraries and/or binaries on
  one or more nodes. Please check your PATH and LD_LIBRARY_PATH
  settings, or configure OMPI with --enable-orterun-prefix-by-default

* lack of authority to execute on one or more specified nodes.
  Please verify your allocation and authorities.

* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
  Please check with your sys admin to determine the correct location to use.

* compilation of the orted with dynamic libraries when static are required
  (e.g., on Cray). Please check your configure cmd line and consider using
  one of the contrib/platform definitions for your system type.

* an inability to create a connection back to mpirun due to a
  lack of common network interfaces and/or no route found between
  them. Please check network connectivity (including firewalls
  and network routing requirements).

--

When I try to run the task on a single node, everything works properly.
But when I specify "run on 2 nodes", the problem appears.

I tried running ping using the IPoIB addresses: all hosts resolve properly,
and ping requests and replies go over IB without any problems.
So all nodes (including the head node) see each other via IPoIB,
but the MPI app fails.

The same test task works perfectly on all nodes when run with the Ethernet 
transport instead of InfiniBand.

P.S. We use Torque resource manager to enqueue MPI tasks.

Best regards,
Sergei.




Re: [OMPI users] OpenMPI + InfiniBand

2016-12-26 Thread gilles
 Sergei,

thanks for confirming you are now able to use Open MPI

fwiw, orted is remotely started by the selected plm component.

it can be ssh if you run without a batch manager, the tm interface if 
PBS/torque, srun if slurm, etc ...

that should explain why exporting PATH and LD_LIBRARY_PATH is not enough 
in your environment, not to mention your .bashrc or equivalent might 
reset/unset them.
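
a quick way to see what a non-interactive shell on a remote node really 
gets (node2 is just a placeholder):

ssh node2 env | grep -E 'PATH|LD_LIBRARY_PATH'

and keep in mind that with Torque the daemons are spawned through the tm 
API by pbs_mom, so they inherit the pbs_mom environment rather than the one 
of your interactive shell.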

Cheers,

Gilles

- Original Message -

Hi Gilles!
 

this looks like a very different issue, orted cannot be remotely 
started.
...

a better option (as long as you do not plan to relocate Open MPI 
install dir) is to configure with

--enable-mpirun-prefix-by-default


Yes, that was the problem with orted.
I checked the PATH and LD_LIBRARY_PATH variables and both are set, 
but it was not enough!

So I added --enable-mpirun-prefix-by-default to configure, and even 
when --prefix isn't specified the recompiled version works properly.

When the Ethernet transport is used, everything works both with and 
without --enable-mpirun-prefix-by-default.

Thank you!

Best regards,
Sergei.




Re: [OMPI users] More confusion about --map-by!

2017-02-23 Thread gilles
Mark,

what about
mpirun -np 6 -map-by slot:PE=4 --bind-to core --report-bindings ./prog

it is a fit for 1) and 2) but not 3)

if you use OpenMP and want 2 threads per task, then you can
export OMP_NUM_THREADS=2
so that the runtime does not default to 4 threads (one per bound core, 
which is what most OpenMP runtimes would do)
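
for example (a sketch for your 2 x 12 cores box, ./prog being your binary):

export OMP_NUM_THREADS=2
mpirun -np 6 -map-by slot:PE=4 --bind-to core --report-bindings ./prog

or, to have it forwarded to all ranks from the command line:

mpirun -x OMP_NUM_THREADS=2 -np 6 -map-by slot:PE=4 --bind-to core --report-bindings ./prog

each rank is then bound to 4 consecutive cores (so 3 ranks per socket), but 
only spawns 2 OpenMP threads within that binding.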

Cheers,

Gilles
- Original Message -
> Hi,
> 
> I'm still trying to figure out how to express the core binding I want to
> openmpi 2.x via the --map-by option. Can anyone help, please?
> 
> I bet I'm being dumb, but it's proving tricky to achieve the following 
> aims (most important first):
> 
> 1) Maximise memory bandwidth usage (e.g. load balance ranks across
> processor sockets)
> 2) Optimise for nearest-neighbour comms (in MPI_COMM_WORLD) (e.g. put
> neighbouring ranks on the same socket)
> 3) Have an incantation that's simple to change based on number of ranks
> and processes per rank I want.
> 
> Example:
> 
> Considering a 2 socket, 12 cores/socket box and a program with 2 threads
> per rank...
> 
> ... this is great if I fully-populate the node:
> 
> $ mpirun -np 12 -map-by slot:PE=2 --bind-to core --report-bindings ./prog
> [somehost:101235] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
> [somehost:101235] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
> [somehost:101235] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
> [somehost:101235] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././././B/B/./././.][./././././././././././.]
> [somehost:101235] MCW rank 4 bound to socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]]: [././././././././B/B/./.][./././././././././././.]
> [somehost:101235] MCW rank 5 bound to socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]: [././././././././././B/B][./././././././././././.]
> [somehost:101235] MCW rank 6 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]]: [./././././././././././.][B/B/./././././././././.]
> [somehost:101235] MCW rank 7 bound to socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././././././.][././B/B/./././././././.]
> [somehost:101235] MCW rank 8 bound to socket 1[core 16[hwt 0]], socket 1[core 17[hwt 0]]: [./././././././././././.][././././B/B/./././././.]
> [somehost:101235] MCW rank 9 bound to socket 1[core 18[hwt 0]], socket 1[core 19[hwt 0]]: [./././././././././././.][././././././B/B/./././.]
> [somehost:101235] MCW rank 10 bound to socket 1[core 20[hwt 0]], socket 1[core 21[hwt 0]]: [./././././././././././.][././././././././B/B/./.]
> [somehost:101235] MCW rank 11 bound to socket 1[core 22[hwt 0]], socket 1[core 23[hwt 0]]: [./././././././././././.][././././././././././B/B]
> 
> 
> ... but not if I don't [fails aim (1)]:
> 
> $ mpirun -np 6 -map-by slot:PE=2 --bind-to core --report-bindings ./prog
> [somehost:102035] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
> [somehost:102035] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
> [somehost:102035] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
> [somehost:102035] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././././B/B/./././.][./././././././././././.]
> [somehost:102035] MCW rank 4 bound to socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]]: [././././././././B/B/./.][./././././././././././.]
> [somehost:102035] MCW rank 5 bound to socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]: [././././././././././B/B][./././././././././././.]
> 
> 
> ... whereas if I map by socket instead of slot, I achieve aim (1) but 
> fail on aim (2):
> 
> $ mpirun -np 6 -map-by socket:PE=2 --bind-to core --report-bindings ./prog
> [somehost:105601] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
> [somehost:105601] MCW rank 1 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]]: [./././././././././././.][B/B/./././././././././.]
> [somehost:105601] MCW rank 2 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
> [somehost:105601] MCW rank 3 bound to socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././././././.][././B/B/./././././././.]
> [somehost:105601] MCW rank 4 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
> [somehost:105601] MCW rank 5 bound to socket 1[cor

Re: [OMPI users] "No objects of the specified type were found on at least one node"

2017-03-09 Thread gilles
Yes, lstopo is part of hwloc

by default, Open MPI uses an embedded version of hwloc 1.11.2,
so i suggest you install the full hwloc with the same version
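
if your distro does not package that exact version, building it from the 
hwloc 1.11.2 tarball is straightforward (the prefix below is just an 
example):

tar xzf hwloc-1.11.2.tar.gz
cd hwloc-1.11.2
./configure --prefix=$HOME/hwloc-1.11.2
make && make install
$HOME/hwloc-1.11.2/bin/lstopo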

Cheers,

Gilles

- Original Message -
> Hi,
> 
> Gilles Gouaillardet  writes:
> > Can you run
> > lstopo
> > in your machine, and post the output ?
> 
> no lstopo in my machine. This is part of hwloc, right?
> 
> > can you also try
> > mpirun --map-by socket --bind-to socket ...
> > and see if it helps ?
> 
> same issue.
> 
> 
> Perhaps I need to compile hwloc as well??
> -- 
> Ángel de Vicente
> http://www.iac.es/galeria/angelv/  


Re: [OMPI users] Mellanox EDR performance

2017-03-15 Thread gilles
 Thanks for sharing your findings.

just to be clear, your application is running at full speed.

only MPI_Wtime() is busted, so timers used internally in your app might
mislead you and suggest performance is worse than it really is
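
if you want to double check that, here is a minimal sketch (not from your 
app) that compares MPI_Wtime() with the system monotonic clock; if the two 
deltas diverge, your internal timings are the problem, not the transfers:

#include <mpi.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    struct timespec t0, t1;
    MPI_Init(&argc, &argv);

    double w0 = MPI_Wtime();
    clock_gettime(CLOCK_MONOTONIC, &t0);

    sleep(1);   /* both timers should report roughly one second here */

    double w1 = MPI_Wtime();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double sys = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
    printf("MPI_Wtime delta: %g s   clock_gettime delta: %g s\n", w1 - w0, sys);

    MPI_Finalize();
    return 0;
}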

Cheers,

Gilles

- Original Message -

I'm not sure why nobody has encountered this issue on the mailing 
list. After some fiddling I was finally able to isolate it to a 
performance regression introduced between 2.0.1 and 2.0.2. While trying 
to bisect the exact commit causing the performance regression my 
colleague brought this to my attention,

https://github.com/open-mpi/ompi/issues/3003

Yes, this is exactly the issue!

So just in case anybody runs into the same issue again ...

On Tue, Mar 7, 2017 at 12:59 PM, Yong Qin  wrote:

OK, did some testing with MVAPICH and everything is normal so 
this is clearly with OMPI. Is there anything that I should try?

Thanks,

Yong Qin

On Mon, Mar 6, 2017 at 11:46 AM, Yong Qin  
wrote:

Hi,

I'm wondering if anybody who has done perf testing on 
Mellanox EDR with OMPI can shed some light here?

We have a pair of EDR HCAs connected back to back. We are 
testing with two dual-socket Intel Xeon E5-2670v3 (Haswell) nodes @2.30GHz, 
64GB memory. OS is Scientific Linux 6.7 with kernel 
2.6.32-642.6.2.el6.x86_64, vanilla OFED 3.18-2. HCAs are 
running the latest FW. OMPI 2.0.2.

OSU bandwidth test only delivers ~5.5 GB/s at 4MB message 
size, latency is ~2.7 us at 0B message size. Both are far behind the 
claimed values. RDMA perf on the same setup was not too shabby - 
bandwidth ~10.6 GB/s, latency ~1.0 us.

So I'm wondering if I'm missing anything in the OMPI setup 
that causes such a huge delta? The OMPI command was simply: mpirun -np 2 
-H host1,host2 -mca btl openib,sm,self osu_bw

Thanks,

Yong Qin






Re: [OMPI users] Build Failed - OpenMPI 1.10.6 / Ubuntu 16.04 / Oracle Studio 12.5

2017-04-08 Thread gilles
Travis,

per the logs, the issue is the compiler does not find the definition of 
CLOCK_MONOTONIC nor clock_gettime(),

and this looks like a missing include header file.

on my box, these are defined in <time.h>, but ubuntu could be different.

can you please confirm <time.h> should be used

(man clock_gettime should be enough for that)

<time.h> might not be (indirectly) pulled in on ubuntu, so what if you 
manually

#include <time.h>

at the beginning of opal/mca/timer/linux/timer_linux_component.c ?
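
to make sure <time.h> is indeed the right header on your box, you can also 
compile this tiny standalone test with suncc (a sketch, not part of the 
Open MPI tree; add -lrt if the link fails):

#include <time.h>
#include <stdio.h>

int main(void)
{
    struct timespec ts;
    /* CLOCK_MONOTONIC and clock_gettime() must both come from <time.h> */
    if (clock_gettime(CLOCK_MONOTONIC, &ts) == 0)
        printf("CLOCK_MONOTONIC: %lld.%09ld\n", (long long)ts.tv_sec, ts.tv_nsec);
    return 0;
}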

Cheers,

Gilles

- Original Message -

Hello,

I am attempting to build OpenMPI 1.10.6 (and also 2.0.x) on several 
different Ubuntu 16.04 workstations using the Oracle Developer 12.5 
compilers. I have verified that the suncc, sunCC, and sunf95 compilers 
work by compiling non-MPI codes.

The configure command is shown here:

../configure --enable-shared CC=suncc CXX=sunCC FC=sunf95 CXXFLAGS=-L/usr/lib/x86_64-linux-gnu --prefix=/home/SOFTWARE/openmpi/1.10.6/sun/12.5


The build fails while trying to compile 'timer_linux_component.c':

"../../../../../opal/include/opal/sys/amd64/atomic.h", line 136: warning: parameter in inline asm statement unused: %3
"../../../../../opal/include/opal/sys/amd64/atomic.h", line 182: warning: parameter in inline asm statement unused: %2
"../../../../../opal/include/opal/sys/amd64/atomic.h", line 203: warning: parameter in inline asm statement unused: %2
"../../../../../opal/include/opal/sys/amd64/atomic.h", line 224: warning: parameter in inline asm statement unused: %2
"../../../../../opal/include/opal/sys/amd64/atomic.h", line 245: warning: parameter in inline asm statement unused: %2
"../../../../../opal/include/opal/sys/amd64/timer.h", line 61: warning: initializer does not fit or is out of range: 0x8007
"../../../../../opal/mca/timer/linux/timer_linux_component.c", line 166: warning: implicit function declaration: clock_getres
"../../../../../opal/mca/timer/linux/timer_linux_component.c", line 166: undefined symbol: CLOCK_MONOTONIC
"../../../../../opal/mca/timer/linux/timer_linux_component.c", line 188: warning: implicit function declaration: clock_gettime
"../../../../../opal/mca/timer/linux/timer_linux_component.c", line 188: undefined symbol: CLOCK_MONOTONIC
"../../../../../opal/mca/timer/linux/timer_linux_component.c", line 197: undefined symbol: CLOCK_MONOTONIC
cc: acomp failed for ../../../../../opal/mca/timer/linux/timer_linux_component.c
Makefile:1691: recipe for target 'timer_linux_component.lo' failed
make[2]: *** [timer_linux_component.lo] Error 1
make[2]: Leaving directory '/home/teb-admin/INSTALL/openmpi-1.10.6/BUILD_SUN/opal/mca/timer/linux'
Makefile:2234: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/teb-admin/INSTALL/openmpi-1.10.6/BUILD_SUN/opal'
Makefile:1777: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1


The compiler seems unable to find 'librt.so' which can be found in 
the following locations on the system:

/lib/i386-linux-gnu/librt.so.1
/lib/x86_64-linux-gnu/librt.so.1
/lib32/librt.so.1
/libx32/librt.so.1
/usr/lib/x86_64-linux-gnu/librt.so
/usr/lib32/librt.so
/usr/libx32/librt.so


I've tried various combinations of CFLAGS, LDFLAGS, and LIBS on the 
configure command with no luck. Any help is greatly appreciated.






Re: [OMPI users] Help

2017-04-27 Thread gilles
 Hi,

that looks like a typo, the command is

mpi-selector --list

Cheers,

Gilles

- Original Message -

Hello,

 

I am trying to install Open MPI on Centos and I got stuck. I have 
installed an GNU compiler and after that I run the command: yum install 
openmpi-devel.x86_64. But when I run command mpi selector –- list I 
receive this error “mpi: command not found”

I am following the instruction from here: 
https://na-inet.jp/na/pccluster/centos_x86_64-en.html


Any help is much appreciated. J

 

Corina




Re: [OMPI users] Help

2017-04-27 Thread gilles
 Well, i cannot make sense of this error message.

if the command is mpi-selector, the error message could be

mpi-selector: command not found

but this is not the error message you reported

what does

rpm -ql mpi-selector

report ?

Cheers,

Gilles

- Original Message -

Yes, I write it wrong the previous e-mail, but actually it does not 
work. Gives the error message: mpi: command not found

 

Corina

 

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of 
gil...@rist.or.jp
Sent: Thursday, April 27, 2017 11:34 AM
To: Open MPI Users 
Subject: Re: [OMPI users] Help

 

 Hi,

 

that looks like a typo, the command is

mpi-selector --list

 

Cheers,

 

Gilles

- Original Message -

Hello,

 

 

I am trying to install Open MPI on Centos and I got stuck. I 
have installed an GNU compiler and after that I run the command: yum 
install openmpi-devel.x86_64. But when I run command mpi selector –- 
list I receive this error “mpi: command not found”

I am following the instruction from here: 
https://na-inet.jp/na/pccluster/centos_x86_64-en.html


Any help is much appreciated. J

 

 

Corina




Re: [OMPI users] Help

2017-04-27 Thread gilles
 by the way, are you running CentOS 5 ?

it seems mpi-selector is no longer available starting with CentOS 6

Cheers,

Gilles

- Original Message -

Yes, I write it wrong the previous e-mail, but actually it does not 
work. Gives the error message: mpi: command not found

 

Corina

 

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of 
gil...@rist.or.jp
Sent: Thursday, April 27, 2017 11:34 AM
To: Open MPI Users 
Subject: Re: [OMPI users] Help

 

 Hi,

 

that looks like a typo, the command is

mpi-selector --list

 

Cheers,

 

Gilles

- Original Message -

Hello,

 

 

I am trying to install Open MPI on Centos and I got stuck. I 
have installed an GNU compiler and after that I run the command: yum 
install openmpi-devel.x86_64. But when I run command mpi selector –- 
list I receive this error “mpi: command not found”

I am following the instruction from here: 
https://na-inet.jp/na/pccluster/centos_x86_64-en.html


Any help is much appreciated. J

 

 

Corina




Re: [OMPI users] Help

2017-04-27 Thread gilles
 Or you can replace the mpi-selector thing with

module load mpi/openmpi-x86_64

if that does not work, run

module avail

and then

module load <the relevant Open MPI module>

note this is per session, so you should do that each time you start a 
new terminal or submit a job
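
a typical session then looks like this (the exact module name is whatever 
module avail lists on your box):

module avail
module load mpi/openmpi-x86_64
which mpirun
mpirun --version

if you want it loaded automatically, append the module load line to your 
~/.bashrc or to your job scripts.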

Cheers,

Gilles

- Original Message -

When I run command rpm --query centos-release, it shows the 
following: centos-release-7-3.1611.el7.centos.x86_64. So maybe I should 
install CentOS 5?

 

C.

 

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of 
gil...@rist.or.jp
Sent: Thursday, April 27, 2017 12:36 PM
To: Open MPI Users 
Subject: Re: [OMPI users] Help

 

 by the way, are you running CentOS 5 ?

it seems mpi-selector is no more available from CentOS 6

 

Cheers,

 

Gilles

- Original Message -

Yes, I write it wrong the previous e-mail, but actually it does 
not work. Gives the error message: mpi: command not found

 

 

Corina

 

 

 

From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf 
Of gil...@rist.or.jp
Sent: Thursday, April 27, 2017 11:34 AM
To: Open MPI Users 
Subject: Re: [OMPI users] Help

 

 

 Hi,

 

that looks like a typo, the command is

mpi-selector --list

 

Cheers,

 

Gilles

- Original Message -

Hello,

 

 

I am trying to install Open MPI on Centos and I got stuck. I 
have installed an GNU compiler and after that I run the command: yum 
install openmpi-devel.x86_64. But when I run command mpi selector –- 
list I receive this error “mpi: command not found”

I am following the instruction from here: 
https://na-inet.jp/na/pccluster/centos_x86_64-en.html

Any help is much appreciated. J

 

 

Corina




Re: [OMPI users] MPI I/O gives undefined behavior if the amount of bytes described by a filetype reaches 2^32

2017-04-28 Thread gilles
Before v1.10, the default is ROMIO, and you can force OMPIO with
mpirun --mca io ompio ...

From v2, the default is OMPIO (unless you are running on lustre iirc), 
and you can force ROMIO with
mpirun --mca io ^ompio ...

maybe that can help for the time being
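
you can check which io components your build actually provides with

ompi_info | grep "MCA io"

and then run your reproducer with one or the other, for example (assuming 
the test program from your mail was compiled as mpi_io_test):

mpirun -np 2 --mca io ompio ./mpi_io_test
mpirun -np 2 --mca io ^ompio ./mpi_io_test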

Cheers,

Gilles

- Original Message -
> Hello,
> 
> Which MPI version are you using?
> This looks to me like it triggers
> https://github.com/open-mpi/ompi/issues/2399
> 
> You can check if you are running into this problem by playing around
> with the mca_io_ompio_cycle_buffer_size parameter.
> 
> Best
> Christoph Niethammer
> 
> --
> 
> Christoph Niethammer
> High Performance Computing Center Stuttgart (HLRS)
> Nobelstrasse 19
> 70569 Stuttgart
> 
> Tel: ++49(0)711-685-87203
> email: nietham...@hlrs.de
> http://www.hlrs.de/people/niethammer
> 
> 
> 
> - Original Message -
> From: "Nils Moschuering" 
> To: "Open MPI Users" 
> Sent: Friday, April 28, 2017 12:51:50 PM
> Subject: [OMPI users] MPI I/O gives undefined behavior if the amount 
of bytes described by a filetype reaches 2^32
> 
> Dear OpenMPI Mailing List, 
> 
> I have a problem with MPI I/O running on more than 1 rank using very 
large filetypes. In order to reproduce the problem please take advantage 
of the attached program "mpi_io_test.c". After compilation it should be 
run on 2 nodes. 
> 
> The program will do the following for a variety of different 
parameters: 
> 1. Create an elementary datatype (commonly refered to as etype in the 
MPI Standard) of a specific size given by the parameter bsize (in 
multiple of bytes). This datatype is called blk_filetype . 
> 2. Create a complex filetype, which is different for each rank. This 
filetype divides the file into a number of blocks given by parameter nr_
blocks of size bsize . Each rank only gets access to a subarray 
containing 
> nr_blocks_per_rank = nr_blocks / size 
> blocks (where size is the number of participating ranks). The 
respective subarray of each rank starts at 
> rank * nr_blocks_per_rank 
> This guarantees that the regions of the different ranks don't overlap. 
> The resulting datatype is called full_filetype . 
> 3. Allocate enough memory on each rank, in order to be able to write a 
whole block. 
> 4. Fill the allocated memory with the rank number to be able to check 
the resulting file for correctness. 
> 5. Open a file named fname and set the view using the previously 
generated blk_filetype and full_filetype . 
> 6. Write one block on each rank, using the collective routine. 
> 7. Clean up. 
> 
> The above will be repeated for different values of bsize and nr_blocks 
. Please note, that there is no overflow of the used basic dataype int . 
> The output is verified using 
> hexdump fname 
> which performs a hexdump of the file. This tool collects consecutive 
equal lines in a file into one output line. The resulting output of a 
call to hexdump is given by a structure comparable to the following 
>  01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 01 |.
...| 
> * 
> 1f40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |.
...| 
> * 
> 3e80 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 02 |.
...| 
> * 
> 5dc0 
> This example is to be read in the following manner: 
> -From byte  to 1f40 (which is equal to 500 Mib) the file 
contains the value 01 in each byte. 
> -From byte 1f40 to 3e80 (which is equal to 1000 Mib) the file 
contains the value 00 in each byte. 
> -From byte 3e80 to 5dc0 (which is equal to 1500 Mib) the file 
contains the value 02 in each byte. 
> -The file ends here. 
> This is the correct output of the above outlined program with 
parameters 
> bsize=500*1023*1024 
> nr_blocks=4 
> running on 2 ranks. The attached file contains a lot of tests for 
different cases. These were made to pinpoint the source of the problem 
and to exclude different other, potentially important, factors. 
> I deem an output wrong if it doesn't follow from the parameters or if 
the program crashes on execution. 
> The only difference between OpenMPI and Intel MPI, according to my 
tests, is in the different behavior on error: OpenMPI will mostly write 
wrong data but won't crash, whereas Intel MPI mostly crashes. 
> 
> The tests and their results are defined in comments in the source. 
> The final conclusions, I derive from the tests, are the following: 
> 
> 1. If the filetype used in the view is set in a way that it describes 
an amount of bytes equaling or exceeding 2^32 = 4Gib the code produces 
wrong output. For values slightly smaller (the second example with fname
="test_8_blocks" uses a total filetype size of 4000 MiB which is smaller 
than 4Gi

Re: [OMPI users] MPI I/O gives undefined behavior if the amount of bytes described by a filetype reaches 2^32

2017-05-02 Thread gilles
Jeff and all,

i already reported the issue and posted a patch for ad_nfs at 
https://github.com/pmodels/mpich/pull/2617

a bug was also identified in Open MPI (related to datatype handling) and 
a first draft is available at https://github.com/open-mpi/ompi/pull/3439

Cheers,

Gilles

- Original Message -



On Fri, Apr 28, 2017 at 3:51 AM, Nils Moschuering  
wrote:

Dear OpenMPI Mailing List,

I have a problem with MPI I/O running on more than 1 rank using 
very large filetypes. In order to reproduce the problem please take 
advantage of the attached program "mpi_io_test.c". After compilation it 
should be run on 2 nodes.

The program will do the following for a variety of different 
parameters:
1. Create an elementary datatype (commonly refered to as etype 
in the MPI Standard) of a specific size given by the parameter bsize (in 
multiple of bytes). This datatype is called blk_filetype.
2. Create a complex filetype, which is different for each rank. 
This filetype divides the file into a number of blocks given by 
parameter nr_blocks of size bsize. Each rank only gets access to a 
subarray containing
nr_blocks_per_rank = nr_blocks / size
blocks (where size is the number of participating ranks). The 
respective subarray of each rank starts at
rank * nr_blocks_per_rank
This guarantees that the regions of the different ranks don't 
overlap.
The resulting datatype is called full_filetype.
3. Allocate enough memory on each rank, in order to be able to 
write a whole block.
4. Fill the allocated memory with the rank number to be able to 
check the resulting file for correctness.
5. Open a file named fname and set the view using the previously 
generated blk_filetype and full_filetype.
6. Write one block on each rank, using the collective routine.
7. Clean up.

The above will be repeated for different values of bsize and nr_
blocks. Please note, that there is no overflow of the used basic dataype 
int.
The output is verified using
hexdump fname
which performs a hexdump of the file. This tool collects 
consecutive equal lines in a file into one output line. The resulting 
output of a call to hexdump is given by a structure comparable to the 
following
00000000  01 01 01 01 01 01 01 01  01 01 01 01 01 01 01 01  |................|
*
1f400000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
3e800000  02 02 02 02 02 02 02 02  02 02 02 02 02 02 02 02  |................|
*
5dc00000
This example is to be read in the following manner:
-From byte  to 1f40 (which is equal to 500 Mib) the 
file contains the value 01 in each byte.
-From byte 1f40 to 3e80 (which is equal to 1000 Mib) the 
file contains the value 00 in each byte.
-From byte 3e80 to 5dc0 (which is equal to 1500 Mib) the 
file contains the value 02 in each byte.
-The file ends here.
This is the correct output of the above outlined program with 
parameters
bsize=500*1023*1024
nr_blocks=4
running on 2 ranks. The attached file contains a lot of tests 
for different cases. These were made to pinpoint the source of the 
problem and to exclude different other, potentially important, factors.
I deem an output wrong if it doesn't follow from the parameters 
or if the program crashes on execution.
The only difference between OpenMPI and Intel MPI, according to 
my tests, is in the different behavior on error: OpenMPI will mostly 
write wrong data but won't crash, whereas Intel MPI mostly crashes.


Intel MPI is based on MPICH so you should verify that this bug 
appears in MPICH and then report it here: 
https://github.com/pmodels/mpich/issues.  This is particularly useful 
because the person most responsible for MPI-IO in MPICH (Rob Latham) 
also happens to be interested in integer-overflow issues.
 

The tests and their results are defined in comments in the 
source.
The final conclusions, I derive from the tests, are the 
following:

1. If the filetype used in the view is set in a way that it 
describes an amount of bytes equaling or exceeding 2^32 = 4Gib the code 
produces wrong output. For values slightly smaller (the second example 
with fname="test_8_blocks" uses a total filetype size of 4000 MiB which 
is smaller than 4Gib) the code works as expected.
2. The act of actually writing the described regions is not 
important. When the filetype describes an area >= 4Gib but only writes 
to regions much smaller than that, the code still produces undefined 
behavior (please refer to the 6th example with fname="test_too_large_
blocks" in order to see an example).
3. It doesn't matter if the block size or the amount

Re: [OMPI users] Strange OpenMPI errors showing up in Caffe rc5 build

2017-05-04 Thread gilles
William,

the link error clearly shows libcaffe.so does require C++ bindings.

did you build caffe from a fresh tree ?

what if you

ldd libcaffe.so

nm libcaffe.so | grep -i ompi

if libcaffe.so does require the mpi c++ bindings, it should depend on them

(otherwise the way it was built is questionable)

you might want to link with mpic++ instead of g++

note the mpi C++ bindings are no longer built by default since v2.0, so you 
likely have to

configure --enable-mpi-cxx

last but not least, make sure caffe and openmpi were built with the same 
c++ compiler
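
for example (paths taken from your build log, /opt/openmpi is just a 
placeholder prefix):

ldd .build_release/lib/libcaffe.so | grep -i mpi
nm -C -D .build_release/lib/libcaffe.so | grep -i ompi

# if the C++ bindings are really needed, rebuild Open MPI with them enabled
./configure --enable-mpi-cxx --prefix=/opt/openmpi
make && make install

# then relink caffe with mpic++ instead of g++, keeping the rest of the
# link line unchanged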

Cheers,

Gilles
- Original Message -

I know this could possibly be off-topic, but the errors are OpenMPI 
errors and if anyone could shed light on the nature of these errors I 
figure it would be this group:


CXX/LD -o .build_release/tools/upgrade_solver_proto_text.bin
g++ .build_release/tools/upgrade_solver_proto_text.o -o .build_
release/tools/upgrade_solver_proto_text.bin -pthread -fPIC -DCAFFE_
VERSION=1.0.0-rc5 -DNDEBUG -O2 -DUSE_OPENCV -DUSE_LEVELDB -DUSE_LMDB -
DCPU_ONLY -DWITH_PYTHON_LAYER -I/hpc/apps/python27/include/python2.7 -I/
hpc/apps/python27/externals/numpy/1.9.2/lib/python2.7/site-packages/
numpy/core/include -I/usr/local/include -I/hpc/apps/hdf5/1.8.17/include 
-I.build_release/src -I./src -I./include -I/hpc/apps/atlas/3.10.2/
include -Wall -Wno-sign-compare -lcaffe -L/hpc/apps/gflags/lib -L/hpc/
apps/python27/lib -L/hpc/apps/python27/lib/python2.7 -L/hpc/apps/atlas/3.
10.2/lib -L.build_release/lib  -lglog -lgflags -lprotobuf -lboost_system 
-lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -
lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -
lboost_python -lpython2.7 -lcblas -latlas \
-Wl,-rpath,\$ORIGIN/../lib
.build_release/lib/libcaffe.so: undefined reference to `ompi_mpi_cxx_op_intercept'
.build_release/lib/libcaffe.so: undefined reference to `MPI::Datatype::Free()'
.build_release/lib/libcaffe.so: undefined reference to `MPI::Comm::Comm()'
.build_release/lib/libcaffe.so: undefined reference to `MPI::Win::Free()'
collect2: error: ld returned 1 exit status


I've read this may be due to a dependency of Caffe that uses OpenMPI 
(since I've been told Caffe itself doesn't use OpenMPI).


Would adding -l directives to LIBRARIES line in the Makefile for 
Caffe that reference all OpenMPI libraries fix this problem?

For example, -l mpi.


Thank you in advance. Hopefully this isn't entirely OT.


William L.




Re: [OMPI users] (no subject)

2017-05-15 Thread gilles
Ioannis,

### What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git 
branch name and hash, etc.)



### Describe how Open MPI was installed (e.g., from a source/
distribution tarball, from a git clone, from an operating system 
distribution package, etc.)



### Please describe the system on which you are running

* Operating system/version: 
* Computer hardware: 
* Network type: 

also, what if you

mpirun --mca shmem_base_verbose 100 ...


Cheers,

Gilles
- Original Message -
> Hi
> 
> I am trying to run the following simple demo to a cluster of two nodes
> 
> --

> #include <mpi.h>
> #include <stdio.h>
> 
> int main(int argc, char** argv) {
>  MPI_Init(NULL, NULL);
> 
>  int world_size;
>  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
> 
>  int world_rank;
>  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
> 
>  char processor_name[MPI_MAX_PROCESSOR_NAME];
>  int name_len;
>  MPI_Get_processor_name(processor_name, &name_len);
> 
>  printf("Hello world from processor %s, rank %d"   " out of %d 
> processors\n",  processor_name, world_rank, world_size);
> 
>  MPI_Finalize();
> }
> --
---
> 
> i get always the message
> 
> --
--
> It looks like opal_init failed for some reason; your parallel process 
is
> likely to abort.  There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>opal_shmem_base_select failed
>--> Returned value -1 instead of OPAL_SUCCESS
> --

> 
> any hint?
> 
> Ioannis Botsis
> 
> 
> 


Re: [OMPI users] All processes waiting on MPI_Bcast

2017-05-24 Thread gilles
Hi,

your program hangs because rank 0 does not call MPI_Bcast()

generally speaking, when using collective operations (such as MPI_Bcast),
all tasks of the communicator must invoke the collective operation, and 
with "matching" arguments.
in the case of MPI_Bcast(), the root value must be the same on all tasks, 
and all tasks must transfer the same amount of data:
if all tasks use the same datatype, then the count must be the same on all 
tasks, otherwise the datatype size * count must be the same on all tasks
/* for the sake of completeness, there are known issues specific to Open MPI 
   when you mix large and small datatypes */
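
here is a minimal sketch (not your program) of a matching broadcast: the 
buffer is allocated on every rank, and every rank, including the root, 
makes the exact same MPI_Bcast() call:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, i;
    const int num = 100;
    double *buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    buf = malloc(num * sizeof(double));          /* allocated on every rank */
    if (rank == 0)
        for (i = 0; i < num; i++)
            buf[i] = (double)i / num;            /* only the root fills it */

    /* same root, count and datatype on all ranks */
    MPI_Bcast(buf, num, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    printf("rank %d: buf[0]=%g buf[%d]=%g\n", rank, buf[0], num - 1, buf[num - 1]);

    free(buf);
    MPI_Finalize();
    return 0;
}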

Cheers,

Gilles

- Original Message -
> Greetings!
> 
> I include a static header file utils.h with a function linspace. My 
main.cpp file is as follows:
> 
> #include 
> #include 
> #include 
> 
> using namespace std;
> 
> int main(int argc, const char * argv[]) {
> 
> float start = 0., end = 1.;
> unsigned long int num = 100;
> 
> double *linspaced;
> 
> float delta = (end - start) / num;
> int size, rank;
> 
> 
> MPI_Init(NULL, NULL);
> 
> MPI_Comm_size(MPI_COMM_WORLD, &size);
> MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> MPI_Status status;
> 
> // These have to be converted into unsigned long ints
> int casesPerNode = num / size;
> int remainderCases = num % size;
> 
> 
> if(rank==0){
> linspaced =  new double[num];
> 
> if(remainderCases!=0){
> linspace(&linspaced[(size-1)*casesPerNode], end - delta*
remainderCases, end, remainderCases);
> 
> } else {
> linspace(&linspaced[(size-1)*casesPerNode], end - delta*
casesPerNode, end, casesPerNode);
> 
> }
> 
> } else {
> 
> MPI_Bcast(&linspaced, num, MPI_DOUBLE, 0, MPI_COMM_WORLD);
> 
> 
> // Sending job to master node.
> // The node is already overloaded with coordinating.
> // Additional task now is also to take on remainder cases.
> 
> 
> // cout << "Rank " << rank << endl;
> float start_in = start + casesPerNode*delta*(rank-1);
> float end_in = start + casesPerNode*delta*(rank) - delta;
> 
> linspace(&linspaced[(rank-1)*casesPerNode], start_in, end_in, 
casesPerNode);
> 
> 
> }
> 
> MPI_Barrier(MPI_COMM_WORLD);
> // cout << "Print Here Rank " << rank << endl ;
> 
> 
> MPI_Finalize();
> 
> /*
> for(int i=0; i< num; i++){
> cout << *(linspaced + i) << endl;
> }
>  */
> 
> return 0;
> 
> }
> and my utils.h file is:
> 
> void linspace(double *ret, double start_in, double end_in, unsigned 
long int num_in)
> {
> /* This function generates equally spaced elements and returns
>  an array with the results */
> 
> 
> assert(num_in!=0);
> 
> 
> cout <<  "\tReceived start :" << start_in << "\tEnd :" << end_in <
< "\tNum_in :" << num_in << endl;
> 
> double delta_in = (end_in - start_in) / (num_in - 1);
> 
> if(num_in == 1){
> *(ret) = start_in;
> }
> 
> *(ret) = start_in;
> for(int i=1; i < num_in-1; i++) {
> *(ret + i) = *(ret + i - 1) + delta_in;
> }
> *(ret + (num_in - 1)) = end_in;
> 
> /*
> cout << "Finished executing linspace " << endl;
>  for(int i = 0; i  cout << "Address : " << &ret << "\tElement " << i << " : " << *(
ret + i) << endl;
>  }
>  */
> }
> I am unable to diagnose why my code gets stuck at MPI_Bcast. What 
could I do to fix it?
> 
> Thanks
> 
> 
> PS: I’m new to OpenMPI and may have a lot of these doubts initially. 
Thanks for patience and support. 

Re: [OMPI users] All processes waiting on MPI_Bcast

2017-05-24 Thread gilles
At first, try to allocate linspaced on all ranks
 linspaced =  new double[num];
then use the pointer itself as the first parameter of MPI_Bcast
MPI_Bcast(linspaced, num, MPI_DOUBLE, 0, MPI_COMM_WORLD);

this mailing list is to discuss MPI stuff specific to Open MPI.
if you have a doubt whether your problem is specific to Open MPI,
you can try to use another MPI library such as mpich or one of its 
derivatives (mvapich, Intel MPI, ...)
if both MPI implementations fail, then the odds are the issue is in your 
app, and forums such as Stack Overflow are more appropriate to look for help

Cheers,

Gilles

- Original Message -
> @Siva, Thanks for your inputs. I changed it and the process no longer 
hangs.
> 
> I now modified my main file to:
> #include 
> #include 
> #include 
> 
> using namespace std;
> 
> int main(int argc, const char * argv[]) {
> 
> float start = 0., end = 1.;
> unsigned long int num = 100;
> 
> double *linspaced;
> 
> float delta = (end - start) / num;
> int size, rank;
> 
> 
> MPI_Init(NULL, NULL);
> 
> MPI_Comm_size(MPI_COMM_WORLD, &size);
> MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> MPI_Status status;
> 
> // These have to be converted into unsigned long ints
> int casesPerNode = num / size;
> int remainderCases = num % size;
> 
> 
> if(rank==0){
> linspaced =  new double[num];
> 
> if(remainderCases!=0){
> linspace(&linspaced[(size-1)*casesPerNode], end - delta*
remainderCases, end, remainderCases);
> 
> } else {
> linspace(&linspaced[(size-1)*casesPerNode], end - delta*
casesPerNode, end, casesPerNode);
> 
> }
> 
> }
> 
> MPI_Bcast(&linspaced, num, MPI_DOUBLE, 0, MPI_COMM_WORLD);
> 
> if(rank != 0) {
> 
> 
> 
> 
> // Sending job to master node.
> // The node is already overloaded with coordinating.
> // Additional task now is also to take on remainder cases.
> 
> 
> // cout << "Rank " << rank << endl;
> float start_in = start + casesPerNode*delta*(rank-1);
> float end_in = start + casesPerNode*delta*(rank) - delta;
> 
> linspace(&linspaced[(rank-1)*casesPerNode], start_in, end_in, 
casesPerNode);
> 
> 
> }
> 
> 
> MPI_Finalize();
> 
> 
> for(int i=0; i< num; i++){
> cout << *(linspaced + i) << endl;
> }
> 
> 
> return 0;
> 
> }
> 
> 
> On execution the error generated is:
> 
> [wlan-145-94-163-183:09801] *** Process received signal ***
> [wlan-145-94-163-183:09801] Signal: Segmentation fault: 11 (11)
> [wlan-145-94-163-183:09801] Signal code: Address not mapped (1)
> [wlan-145-94-163-183:09801] Failing at address: 0x7fed2d314220
> [wlan-145-94-163-183:09802] *** Process received signal ***
> [wlan-145-94-163-183:09802] Signal: Segmentation fault: 11 (11)
> [wlan-145-94-163-183:09802] Signal code: Address not mapped (1)
> [wlan-145-94-163-183:09802] Failing at address: 0x7fed2d3142e8
>   Received start :0.5 End :0.74   Num_in :25
> [wlan-145-94-163-183:09803] *** Process received signal ***
> [wlan-145-94-163-183:09803] Signal: Segmentation fault: 11 (11)
> [wlan-145-94-163-183:09803] Signal code: Address not mapped (1)
> [wlan-145-94-163-183:09803] Failing at address: 0x7fed2d3143b0
> [wlan-145-94-163-183:09801] [ 0] 0   libsystem_platform.dylib  
  0x7fffd6902b3a _sigtramp + 26
> [wlan-145-94-163-183:09801] [ 1] 0   ???   
  0x 0x0 + 0
> [wlan-145-94-163-183:09801] [ 2] 0   test  
  0x000108afafda main + 602
> [wlan-145-94-163-183:09801] [ 3] 0   libdyld.dylib 
  0x7fffd66f3235 start + 1
> [wlan-145-94-163-183:09801] *** End of error message ***
> [wlan-145-94-163-183:09802] [ 0] 0   libsystem_platform.dylib  
  0x7fffd6902b3a _sigtramp + 26
> [wlan-145-94-163-183:09802] [ 1] 0   ???   
  0x 0x0 + 0
> [wlan-145-94-163-183:09802] [ 2] 0   test  
  0x000107ed5fda main + 602
> [wlan-145-94-163-183:09802] [ 3] 0   libdyld.dylib 
  0x7fffd66f3235 start + 1
> [wlan-145-94-163-183:09802] *** End of error message ***
> [wlan-145-94-163-183:09803] [ 0] 0   libsystem_platform.dylib  
  0x7fffd6902b3a _

Re: [OMPI users] Problems with IPoIB and Openib

2017-05-27 Thread gilles
Allan,

about IPoIB, the error message (no route to host) is very puzzling.
did you double check IPoIB is ok between all nodes ?
this error message suggests IPoIB is not working between sm3 and sm4,
this could be caused by the subnet manager, or a firewall.
ping is the first tool you should use to test that, then you can use nc 
(netcat).
for example, on sm4
nc -l 1234
on sm3
echo hello | nc 10.1.0.5 1234
(expected result: "hello" should be displayed on sm4)

about openib, you first need to double check the btl/openib was built.
assuming you did not configure with --disable-dlopen, you should have a 
mca_btl_openib.so
file in /.../lib/openmpi. it should be accessible by the user, and
ldd /.../lib/openmpi/mca_btl_openib.so
should not have any unresolved dependencies on *all* your nodes

Cheers,

Gilles

- Original Message -
> I have been having some issues with using openmpi with tcp over IPoIB 
> and openib. The problems arise when I run a program that uses basic 
> collective communication. The two programs that I have been using are 
> attached.
> 
> *** IPoIB ***
> 
> The mpirun command I am using to run mpi over IPoIB is,
> mpirun --mca oob_tcp_if_include 192.168.1.0/24 --mca btl_tcp_include 
> 10.1.0.0/24 --mca pml ob1 --mca btl tcp,sm,vader,self -hostfile nodes 
> -np 8 ./avg 8000
> 
> This program will appear to run on the nodes, but will sit at 100% CPU 
> and use no memory. On the host node an error will be printed,
> 
> [sm1][[58411,1],0][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.3 failed: No route to host (113)
> 
> Using another program,
> 
> mpirun --mca oob_tcp_if_include 192.168.1.0/24 --mca btl_tcp_if_
include 
> 10.1.0.0/24 --mca pml ob1 --mca btl tcp,sm,vader,self -hostfile nodes 
> -np 8 ./congrad 800
> Produces the following result. This program will also run on the nodes 
> sm1, sm2, sm3, and sm4 at 100% and use no memory.
> [sm3][[61383,1],4][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.5 failed: No route to host (113)
> [sm4][[61383,1],6][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.4 failed: No route to host (113)
> [sm2][[61383,1],3][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.2 failed: No route to host (113)
> [sm3][[61383,1],5][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.5 failed: No route to host (113)
> [sm4][[61383,1],7][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.4 failed: No route to host (113)
> [sm2][[61383,1],2][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.2 failed: No route to host (113)
> [sm1][[61383,1],0][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.3 failed: No route to host (113)
> [sm1][[61383,1],1][btl_tcp_endpoint.c:803:mca_btl_tcp_endpoint_
complete_connect] 
> connect() to 10.1.0.3 failed: No route to host (113)
> 
> *** openib ***
> 
> Running the congrad program over openib will produce the result,
> mpirun --mca btl self,sm,openib --mca mtl ^psm --mca btl_tcp_if_
include 
> 10.1.0.0/24 -hostfile nodes -np 8 ./avg 800
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
abort,
> ***and potentially your MPI job)
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now 
abort,
> ***and potentially your MPI job)
> --

> A requested component was not found, or was unable to be opened. This
> means that this component is either not installed or is unable to be
> used on your system (e.g., sometimes this means that shared libraries
> that the component requires are unable to be found/loaded).  Note that
> Open MPI stopped checking at the first component that it did not find.
> Host:  sm2.overst.local
> Framework: btl
> Component: openib
> --

> --

> It looks like MPI_INIT failed for some reason; your parallel process 
is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or 
environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>mca_bml_base_open() failed
>--> Returned "Not found" (-13) instead of "S

Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread gilles
Hi Siegmar,

what if you ?
mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi

are loki and exin different ? (os, sockets, core) 

Cheers,

Gilles

- Original Message -
> Hi,
> 
> I have installed openmpi-v3.x-201705250239-d5200ea on my "SUSE Linux
> Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-7.1.0.
> Depending on the machine that I use to start my processes, I have
> a problem with "--host" for versions "v3.x" and "master", while
> everything works as expected with earlier versions.
> 
> 
> loki hello_1 111 mpiexec -np 3 --host loki:2,exin hello_1_mpi
> --

> There are not enough slots available in the system to satisfy the 3 
slots
> that were requested by the application:
>hello_1_mpi
> 
> Either request fewer slots for your application, or make more slots 
available
> for use.
> --

> 
> 
> 
> Everything is ok if I use the same command on "exin".
> 
> exin fd1026 107 mpiexec -np 3 --host loki:2,exin hello_1_mpi
> Process 0 of 3 running on loki
> Process 1 of 3 running on loki
> Process 2 of 3 running on exin
> ...
> 
> 
> 
> Everything is also ok if I use openmpi-v2.x-201705260340-58c6b3c on "
loki".
> 
> loki hello_1 114 which mpiexec
> /usr/local/openmpi-2.1.2_64_cc/bin/mpiexec
> loki hello_1 115 mpiexec -np 3 --host loki:2,exin hello_1_mpi
> Process 0 of 3 running on loki
> Process 1 of 3 running on loki
> Process 2 of 3 running on exin
> ...
> 
> 
> "exin" is a virtual machine on QEMU so that it uses a slightly 
different 
> processor architecture, e.g., it has no L3 cache but larger L2 caches.
> 
> loki fd1026 117 cat /proc/cpuinfo | grep -e "model name" -e "physical 
id" -e 
> "cpu cores" -e "cache size" | sort | uniq
> cache size: 15360 KB
> cpu cores: 6
> model name: Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
> physical id: 0
> physical id: 1
> 
> 
> loki fd1026 118 ssh exin cat /proc/cpuinfo | grep -e "model name" -e "
physical 
> id" -e "cpu cores" -e "cache size" | sort | uniq
> cache size: 4096 KB
> cpu cores: 6
> model name: Intel Core Processor (Haswell, no TSX)
> physical id: 0
> physical id: 1
> 
> 
> Any ideas what's different in the newer versions of Open MPI? Is the 
new
> behavior intended? I would be grateful, if somebody can fix the 
problem,
> if "mpiexec -np 3 --host loki:2,exin hello_1_mpi" should print my 
messages
> in versions "3.x" and "master" as well, if the programs are started on 
any
> machine. Do you need anything else? Thank you very much for any help 
in
> advance.
> 
> 
> Kind regards
> 
> Siegmar


Re: [OMPI users] problem with "--host" with openmpi-v3.x-201705250239-d5200ea

2017-05-30 Thread gilles
Hi Siegmar,

my bad, there was a typo in my reply.
i really meant
> > what if you ?
> > mpiexec --host loki:2,exin:1 -np 3 hello_1_mpi


but you also tried that and it did not help.

i could not find anything in your logs that suggest mpiexec tries to 
start 5 MPI tasks,
did i miss something ?

i will try to reproduce the issue by myself

Cheers,

Gilles

- Original Message -
> Hi Gilles,
> 
> > what if you ?
> > mpiexec --host loki:1,exin:1 -np 3 hello_1_mpi
> 
> I need as many slots as processes so that I use "-np 2".
> "mpiexec --host loki,exin -np 2 hello_1_mpi" works as well. The 
command
> breaks, if I use at least "-np 3" and distribute the processes across 
at
> least two machines.
> 
> loki hello_1 118 mpiexec --host loki:1,exin:1 -np 2 hello_1_mpi
> Process 0 of 2 running on loki
> Process 1 of 2 running on exin
> Now 1 slave tasks are sending greetings.
> Greetings from task 1:
>message type:3
>msg length:  131 characters
>message:
>  hostname:  exin
>  operating system:  Linux
>  release:   4.4.49-92.11-default
>  processor: x86_64
> loki hello_1 119
> 
> 
> 
> > are loki and exin different ? (os, sockets, core)
> 
> Yes, loki is a real machine and exin is a virtual one. "exin" uses a 
newer
> kernel.
> 
> loki fd1026 108 uname -a
> Linux loki 4.4.38-93-default #1 SMP Wed Dec 14 12:59:43 UTC 2016 (
2d3e9d4) 
> x86_64 x86_64 x86_64 GNU/Linux
> 
> loki fd1026 109 ssh exin uname -a
> Linux exin 4.4.49-92.11-default #1 SMP Fri Feb 17 08:29:30 UTC 2017 (
8f9478a) 
> x86_64 x86_64 x86_64 GNU/Linux
> loki fd1026 110
> 
> The number of sockets and cores is identical, but the processor types 
are
> different as you can see at the end of my previous email. "loki" uses 
two
> "Intel(R) Xeon(R) CPU E5-2620 v3" processors and "exin" two "Intel 
Core
> Processor (Haswell, no TSX)" from QEMU. I can provide a pdf file with 
both
> topologies (89 K) if you are interested in the output from lstopo. I'
ve
> added some runs. Most interesting in my opinion are the last two
> "mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi" and
> "mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi".
> Why does mpiexec create five processes although I've asked for only three
> processes? Why do I have to break the program with <Ctrl-C> for the first
> of the above commands?
> 
> 
> 
> loki hello_1 110 mpiexec --host loki:2,exin:1 -np 3 hello_1_mpi
> --

> There are not enough slots available in the system to satisfy the 3 
slots
> that were requested by the application:
>hello_1_mpi
> 
> Either request fewer slots for your application, or make more slots 
available
> for use.
> --

> 
> 
> 
> loki hello_1 111 mpiexec --host exin:3 -np 3 hello_1_mpi
> Process 0 of 3 running on exin
> Process 1 of 3 running on exin
> Process 2 of 3 running on exin
> ...
> 
> 
> 
> loki hello_1 115 mpiexec --host exin:2,loki:3 -np 3 hello_1_mpi
> Process 1 of 3 running on loki
> Process 0 of 3 running on loki
> Process 2 of 3 running on loki
> ...
> 
> Process 0 of 3 running on exin
> Process 1 of 3 running on exin
> [exin][[52173,1],1][../../../../../openmpi-v3.x-201705250239-d5200ea/
opal/mca/btl/tcp/btl_tcp_endpoint.c:794:mca_btl_tcp_endpoint_complete_
connect] 
> connect() to 193.xxx.xxx.xxx failed: Connection refused (111)
> 
> ^Cloki hello_1 116
> 
> 
> 
> 
> loki hello_1 116 mpiexec -np 3 --host exin:2,loki:3 hello_1_mpi
> Process 0 of 3 running on loki
> Process 2 of 3 running on loki
> Process 1 of 3 running on loki
> ...
> Process 1 of 3 running on exin
> Process 0 of 3 running on exin
> [exin][[51638,1],1][../../../../../openmpi-v3.x-201705250239-d5200ea/
opal/mca/btl/tcp/btl_tcp_endpoint.c:590:mca_btl_tcp_endpoint_recv_
blocking] 
> recv(16, 0/8) failed: Connection reset by peer (104)
> [exin:31909] 
> ../../../../../openmpi-v3.x-201705250239-d5200ea/ompi/mca/pml/ob1/pml_
ob1_sendreq.c:191 
> FATAL
> loki hello_1 117
> 
> 
> Do you need anything else?
> 
> 
> Kind regards and thank you very much for your help
> 
> Siegmar
> 
> 
> 
> > 
> > Cheers,
> > 
> > Gilles
> > 
> > - Original Message -
> >> Hi,
> >>
> >> I have installed openmpi-v3.x-201705250239-d5200ea on my "SUSE 
Linux
> >> Enterprise Server 12.2 (x86_64)" with Sun C 5.14 and gcc-7.1.0.
> >> Depending on the 

Re: [OMPI users] Double free or corruption with OpenMPI 2.0

2017-06-14 Thread gilles
 Hi,

at first, i suggest you decide which Open MPI version you want to use.

the most up to date versions are 2.0.3 and 2.1.1

then please provide all the info Jeff previously requested.

ideally, you would write a simple and standalone program that exhibits 
the issue, so we can reproduce and investigate it.

if not, i suggest you use an other MPI library (mvapich, Intel MPI or 
any mpich-based MPI) and see if the issue is still there.

if the double free error still occurs, it is very likely the issue comes 
from your application and not the MPI library.

if you have a parallel debugger such as allinea ddt, then you can run 
your program under the debugger with thorough memory debugging. the 
program will halt when the memory corruption occurs, and this will be a 
hint

(app issue vs mpi issue).

if you did not configure Open MPI with --enable-debug, then please do so 
and try again,

you will increase the likelihood of trapping such a memory corruption 
error earlier, and you will get a clean Open MPI stack trace if a crash 
occurs.

you might also want to try to

mpirun --mca btl tcp,self ...

and see if you get a different behavior.

this will only use TCP for inter process communication, and this is way 
easier to debug than shared memory or rdma
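
for example (a rough recipe, the install prefix is just a placeholder):

./configure --prefix=$HOME/ompi-debug --enable-debug
make -j 4 && make install

# run over TCP only and capture mpirun's standard error stream in a file
$HOME/ompi-debug/bin/mpirun --mca btl tcp,self -np 4 ./cfd_software 2> mpirun.err

the "2> mpirun.err" redirection also answers your first question about 
capturing the standard error of mpirun.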

Cheers,

Gilles

- Original Message -

Hello,
  I found a thread with Intel MPI (although I am using 
gfortran 4.8.5 and OpenMPI 2.1.1) - 
https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/564266
 but the error the OP gets is the same as mine

*** glibc detected *** ./a.out: double free or corruption (!prev): 
0x7fc6dc80 ***
04  === Backtrace: =
05  /lib64/libc.so.6[0x3411e75e66]
06 /lib64/libc.so.6[0x3411e789b3]

So the explanation given in that post is this -
"From their examination our Development team concluded the 
underlying problem with openmpi 1.8.6 resulted from mixing out-of-date/
incompatible Fortran RTLs. In short, there were older static Fortran RTL 
bodies incorporated in the openmpi library that when mixed with newer 
Fortran RTL led to the failure. They found the issue is resolved in the 
newer openmpi-1.10.1rc2 and recommend resolving requires using a newer 
openmpi release with our 15.0 (or newer) release." Could this be 
possible with my version as well ?


I am willing to debug this, provided I am given some clue on how to 
approach my problem. At the moment I am unable to proceed further, and 
the only thing I can add is that I ran tests with the sequential form of my 
application and it is much slower, although I am using shared memory and 
all the cores are on the same machine.

Best regards,
Ashwin.





On Tue, Jun 13, 2017 at 5:52 PM, ashwin .D  
wrote:

Also, when I try to build and run a make check, I get these errors. 
Am I clear to proceed, or is my installation broken? This is on Ubuntu 
16.04 LTS.

==
   Open MPI 2.1.1: test/datatype/test-suite.log
==

# TOTAL: 9
# PASS:  8
# SKIP:  0
# XFAIL: 0
# FAIL:  1
# XPASS: 0
# ERROR: 0

.. contents:: :depth: 2

FAIL: external32


/home/t/openmpi-2.1.1/test/datatype/.libs/lt-external32: symbol 
lookup error: /home/openmpi-2.1.1/test/datatype/.libs/lt-external32: 
undefined symbol: ompi_datatype_pack_external_size
FAIL external32 (exit status:

On Tue, Jun 13, 2017 at 5:24 PM, ashwin .D  
wrote:

Hello,
  I am using OpenMPI 2.0.0 with computational 
fluid dynamics software and I am encountering a series of errors when 
running it with mpirun. This is my lscpu output:

CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1

and I am running OpenMPI's mpirun in the following way:

mpirun -np 4 cfd_software



and I get double free or corruption every single time.



I have two questions -



1) I am unable to capture the standard error that mpirun 
throws in a file.

How can I go about capturing the standard error of mpirun?

2) Has this error, i.e. double free or corruption, been 
reported by others? Is there a bug fix available?



Regards,

Ashwin.





___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-16 Thread gilles
Ted,

if you

mpirun --mca odls_base_verbose 10 ...

you will see which processes get killed and how

Best regards,


Gilles

- Original Message -
> Hello Jeff,
> 
> Thanks for your comments.
> 
> I am not seeing behavior #4, on the two computers that I have tested 
on, using Open MPI 
> 2.1.1.
> 
> I wonder if you can duplicate my results with the files that I have 
uploaded.
> 
> Regarding what is the "correct" behavior, I am willing to modify my 
application to correspond 
> to Open MPI's behavior (whatever behavior the Open MPI developers 
decide is best) -- 
> provided that Open MPI does in fact kill off both shells.
> 
> So my highest priority now is to find out why Open MPI 2.1.1 does not 
kill off both shells on 
> my computer.
> 
> Sincerely,
> 
> Ted Sussman
> 
>  On 16 Jun 2017 at 16:35, Jeff Squyres (jsquyres) wrote:
> 
> > Ted --
> > 
> > Sorry for jumping in late.  Here's my $0.02...
> > 
> > In the runtime, we can do 4 things:
> > 
> > 1. Kill just the process that we forked.
> > 2. Kill just the process(es) that call back and identify themselves 
as MPI processes (we don't track this right now, but we could add that 
functionality).
> > 3. Union of #1 and #2.
> > 4. Kill all processes (to include any intermediate processes that 
are not included in #1 and #2).
> > 
> > In Open MPI 2.x, #4 is the intended behavior.  There may be a bug or 
two that needs to get fixed (e.g., in your last mail, I don't see 
offhand why it waits until the MPI process finishes sleeping), but we 
should be killing the process group, which -- unless any of the 
descendant processes have explicitly left the process group -- should 
hit the entire process tree.  
> > 
> > Sidenote: there's actually a way to be a bit more aggressive and do 
a better job of ensuring that we kill *all* processes (via creative use 
of PR_SET_CHILD_SUBREAPER), but that's basically a future enhancement / 
optimization.
> > 
> > I think Gilles and Ralph proposed a good point to you: if you want 
to be sure to be able to do cleanup after an MPI process terminates (
normally or abnormally), you should trap signals in your intermediate 
processes to catch what Open MPI's runtime throws and therefore know 
that it is time to cleanup.  
> > 
> > Hypothetically, this should work in all versions of Open MPI...?
> > 
> > I think Ralph made a pull request that adds an MCA param to change 
the default behavior from #4 to #1.
> > 
> > Note, however, that there's a little time between when Open MPI 
sends the SIGTERM and the SIGKILL, so this solution could be racy.  If 
you find that you're running out of time to cleanup, we might be able to 
make the delay between the SIGTERM and SIGKILL be configurable (e.g., 
via MCA param).
> > 
> > 
> > 
> > 
> > > On Jun 16, 2017, at 10:08 AM, Ted Sussman  
wrote:
> > > 
> > > Hello Gilles and Ralph,
> > > 
> > > Thank you for your advice so far.  I appreciate the time that you 
have spent to educate me about the details of Open MPI.
> > > 
> > > But I think that there is something fundamental that I don't 
understand.  Consider Example 2 run with Open MPI 2.1.1. 
> > > 
> > > mpirun --> shell for process 0 -->  executable for process 0 --> 
MPI calls, MPI_Abort
> > >--> shell for process 1 -->  executable for process 1 --> 
MPI calls
> > > 
> > > After the MPI_Abort is called, ps shows that both shells are 
running, and that the executable for process 1 is running (in this case, 
process 1 is sleeping).  And mpirun does not exit until process 1 is 
finished sleeping.
> > > 
> > > I cannot reconcile this observed behavior with the statement
> > > 
> > > > > 2.x: each process is put into its own process group 
upon launch. When we issue a
> > > > > "kill", we issue it to the process group. Thus, every 
child proc of that child proc will
> > > > > receive it. IIRC, this was the intended behavior.
> > > 
> > > I assume that, for my example, there are two process groups.  The 
process group for process 0 contains the shell for process 0 and the 
executable for process 0; and the process group for process 1 contains 
the shell for process 1 and the executable for process 1.  So what does 
MPI_ABORT do?  MPI_ABORT does not kill the process group for process 0, 
since the shell for process 0 continues.  And MPI_ABORT does not kill 
the process group for process 1, since both the shell and executable for 
process 1 continue.
> > > 
> > > If I hit Ctrl-C after MPI_

Re: [OMPI users] MPI_ABORT, indirect execution of executables by mpirun, Open MPI 2.1.1

2017-06-17 Thread gilles
Ted,

I do not observe the same behavior you describe with Open MPI 2.1.1.

# mpirun -np 2 -mca btl tcp,self --mca odls_base_verbose 5 ./abort.sh

abort.sh 31361 launching abort
abort.sh 31362 launching abort
I am rank 0 with pid 31363
I am rank 1 with pid 31364

--
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.

--
[linux:31356] [[18199,0],0] odls:kill_local_proc working on WILDCARD
[linux:31356] [[18199,0],0] odls:kill_local_proc checking child process 
[[18199,1],0]
[linux:31356] [[18199,0],0] SENDING SIGCONT TO [[18199,1],0]
[linux:31356] [[18199,0],0] odls:default:SENT KILL 18 TO PID 31361 
SUCCESS
[linux:31356] [[18199,0],0] odls:kill_local_proc checking child process 
[[18199,1],1]
[linux:31356] [[18199,0],0] SENDING SIGCONT TO [[18199,1],1]
[linux:31356] [[18199,0],0] odls:default:SENT KILL 18 TO PID 31362 
SUCCESS
[linux:31356] [[18199,0],0] SENDING SIGTERM TO [[18199,1],0]
[linux:31356] [[18199,0],0] odls:default:SENT KILL 15 TO PID 31361 
SUCCESS
[linux:31356] [[18199,0],0] SENDING SIGTERM TO [[18199,1],1]
[linux:31356] [[18199,0],0] odls:default:SENT KILL 15 TO PID 31362 
SUCCESS
[linux:31356] [[18199,0],0] SENDING SIGKILL TO [[18199,1],0]
[linux:31356] [[18199,0],0] odls:default:SENT KILL 9 TO PID 31361 
SUCCESS
[linux:31356] [[18199,0],0] SENDING SIGKILL TO [[18199,1],1]
[linux:31356] [[18199,0],0] odls:default:SENT KILL 9 TO PID 31362 
SUCCESS
[linux:31356] [[18199,0],0] odls:kill_local_proc working on WILDCARD
[linux:31356] [[18199,0],0] odls:kill_local_proc checking child process 
[[18199,1],0]
[linux:31356] [[18199,0],0] odls:kill_local_proc child [[18199,1],0] is 
not alive
[linux:31356] [[18199,0],0] odls:kill_local_proc checking child process 
[[18199,1],1]
[linux:31356] [[18199,0],0] odls:kill_local_proc child [[18199,1],1] is 
not alive


Open MPI did kill both shells, and they were indeed killed as evidenced 
by ps

#ps -fu gilles --forest
UIDPID  PPID  C STIME TTY  TIME CMD
gilles1564  1561  0 15:39 ?00:00:01 sshd: gilles@pts/1
gilles1565  1564  0 15:39 pts/100:00:00  \_ -bash
gilles   31356  1565  3 15:57 pts/100:00:00  \_ /home/gilles/
local/ompi-v2.x/bin/mpirun -np 2 -mca btl tcp,self --mca odls_base
gilles   31364 1  1 15:57 pts/100:00:00 ./abort


So trapping SIGTERM in your shell and manually killing the MPI task 
should work
(as Jeff explained, as long as the shell script is fast enough to do 
that between the SIGTERM and the SIGKILL).
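
A minimal sketch of such a wrapper (the executable name "abort" is taken 
from this example; the cleanup command is only a placeholder):

#!/bin/sh
# start the real MPI executable in the background and remember its pid
./abort "$@" &
child=$!
# when orted delivers SIGTERM, run the cleanup and forward the signal to the MPI task
trap 'echo cleaning up; kill -TERM $child' TERM
# wait returns once the child exits or a trapped signal arrives
wait $child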


If you observe a different behavior, please double-check your Open MPI 
version and post the outputs of the same commands.

By the way, are you running from a batch manager? If yes, which one?

Cheers,

Gilles

- Original Message -
> Ted,
> 
> if you
> 
> mpirun --mca odls_base_verbose 10 ...
> 
> you will see which processes get killed and how
> 
> Best regards,
> 
> 
> Gilles
> 
> - Original Message -
> > Hello Jeff,
> > 
> > Thanks for your comments.
> > 
> > I am not seeing behavior #4, on the two computers that I have tested 
> on, using Open MPI 
> > 2.1.1.
> > 
> > I wonder if you can duplicate my results with the files that I have 
> uploaded.
> > 
> > Regarding what is the "correct" behavior, I am willing to modify my 
> application to correspond 
> > to Open MPI's behavior (whatever behavior the Open MPI developers 
> decide is best) -- 
> > provided that Open MPI does in fact kill off both shells.
> > 
> > So my highest priority now is to find out why Open MPI 2.1.1 does 
not 
> kill off both shells on 
> > my computer.
> > 
> > Sincerely,
> > 
> > Ted Sussman
> > 
> >  On 16 Jun 2017 at 16:35, Jeff Squyres (jsquyres) wrote:
> > 
> > > Ted --
> > > 
> > > Sorry for jumping in late.  Here's my $0.02...
> > > 
> > > In the runtime, we can do 4 things:
> > > 
> > > 1. Kill just the process that we forked.
> > > 2. Kill just the process(es) that call back and identify 
themselves 
> as MPI processes (we don't track this right now, but we could add that 
> functionality).
> > > 3. Union of #1 and #2.
> > > 4. Kill all processes (to include any intermediate processes that 
> are not included in #1 and #2).
> > > 
> > > In Open MPI 2.x, #4 is the intended behavior.  There may be a bug 
or 
> two that needs to get fixed (e.g., in your last mail, I don't see 
> offhand why it waits until the MPI process fi

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-13 Thread gilles
 Mahmood,

Since you are building a static binary, only a static library (e.g. 
libibverbs.a) can be used.

On your system, only the dynamic libibverbs.so is available.

Simply install libibverbs.a and you should be fine.
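
For example, on RHEL/CentOS the static archive usually ships in a separate 
-static devel package (the package name below is an assumption and may 
differ on your distribution):

yum install libibverbs-devel-static
find /usr -name 'libibverbs.a'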

Best regards,

Gilles

- Original Message -

Hi,
I am trying to build an application with static linking that uses 
OpenMPI. In the middle of the build, I get this:

mpif90 -g -pthread -static -o iotk_print_kinds.x iotk_print_kinds.o 
libiotk.a
/usr/bin/ld: cannot find -libverbs
collect2: ld returned 1 exit status


However, such library exists on the system.

[root@cluster source]# find /usr/ -name *ibverb*
/usr/lib64/libibverbs.so
/usr/lib64/libibverbs.so.1.0.0
/usr/lib64/libibverbs.so.1
/usr/share/doc/libibverbs-1.1.8
[root@cluster source]# mpif90 -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man 
--infodir=/usr/share/info --with-bugurl=
http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared -
-enable-threads=posix --enable-checking=release --with-system-zlib --
enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-
object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-
java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj
-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --
with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib 
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=
x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)




Any idea for that?
Regards,
Mahmood




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] mpif90 unable to find ibverbs

2017-09-13 Thread gilles
This is something related to DAPL.

/* just google "libdat" */

IIRC, Intel MPI uses that, but I do not recall Open MPI using it (!)

Are you sure you are using Open MPI?

Which interconnect do you have?

Cheers,

Gilles

- Original Message -

    Thanks Gilles... That has been solved. Another issue is

mpif90 -g -pthread -static -o iotk_print_kinds.x iotk_print_kinds.o 
libiotk.a
/usr/bin/ld: cannot find -ldat


The name is actually hard to google! I cannot find the library name 
for "dat". Have you heard of that? There is no "libdat" package as far 
as I searched.


Regards,
Mahmood



On Wed, Sep 13, 2017 at 2:54 PM,  wrote:

 Mahmood,

 

since you are building a static binary, only static library (e.g.
 libibverbs.a) can be used.

on your system, only dynamic libibverbs.so is available.

 

simply install libibverbs.a and you should be fine.

 

Best regards,

 

Gilles

- Original Message -

Hi,
I am trying to build an application with static linking that 
uses openmpi. in the middle of the build, I get this

mpif90 -g -pthread -static -o iotk_print_kinds.x iotk_print_
kinds.o libiotk.a
/usr/bin/ld: cannot find -libverbs
collect2: ld returned 1 exit status

However, such library exists on the system.

[root@cluster source]# find /usr/ -name *ibverb*
/usr/lib64/libibverbs.so
/usr/lib64/libibverbs.so.1.0.0
/usr/lib64/libibverbs.so.1
/usr/share/doc/libibverbs-1.1.8
[root@cluster source]# mpif90 -v
Using built-in specs.
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/
share/man --infodir=/usr/share/info --with-bugurl=
http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared -
-enable-threads=posix --enable-checking=release --with-system-zlib --
enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-
object --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-
java-awt=gtk --disable-dssi --with-java-home=/usr/lib/jvm/java-1.5.0-gcj
-1.5.0.0/jre --enable-libgcj-multifile --enable-java-maintainer-mode --
with-ecj-jar=/usr/share/java/eclipse-ecj.jar --disable-libjava-multilib 
--with-ppl --with-cloog --with-tune=generic --with-arch_32=i686 --build=
x86_64-redhat-linux
Thread model: posix
gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC)



Any idea for that?
Regards,
Mahmood


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users




___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Bug in 2.1.2 configure script

2017-11-24 Thread gilles
Thanks Fabrizio!

This has been fixed in v3.0.x, but has never been back-ported into the 
v2.x branch.

I will issue a PR to fix this.
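
In the meantime, a local workaround along the lines Fabrizio described 
could look like this (a sketch; run from the top of the extracted 2.1.2 
source tree before configuring):

sed -i 's/--tag=FC--config/--tag=FC --config/' config/opal_setup_wrappers.m4 configure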


Cheers,

Gilles

- Original Message -
> 
> Dear All
> 
> I have already posted to the Developers list but thought it best to 
> mail here as well.
> 
> There appears to be a bug in the following files bundled with version 
> 2.1.2.
> 
> config/opal_setup_wrappers.m4
> configure
> 
> This relates specifically to the following line
> 
> $OPAL_TOP_BUILDDIR/libtool --tag=FC--config > $rpath_outfile
> 
> I believe this should read
> 
> $OPAL_TOP_BUILDDIR/libtool --tag=FC --config > $rpath_outfile
> 
> [note the ' ' between FC and --config]
> 
> Currently the following error results when building OpenMPI 2.1.2
> 
> configure.err:libtool:   error: ignoring unknown tag FC--config
> 
> I have fixed this in the version I have downloaded but wondered if 
this 
> could be rectified in the versions available for download.
> 
> Many thanks,
> Fab
> 
> --
> Fabrizio Sidoli
> System Administrator, LCN
> Tel: 020 7679 2869 (32869)
> f.sidoli (a) ucl.ac.uk
> 
> London Centre for Nanotechnology
> University College London
> 17-19 Gordon Street
> London
> WC1H 0AH
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
> 


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Compiling Open MPI for Cross-Compilation

2017-12-15 Thread gilles
 Benjamin,

Try removing the --target option.

If it still does not work, then try replacing --target with --build.

You can refer to 
http://jingfenghanmax.blogspot.jp/2010/09/configure-with-host-target-and-build.html
for the details.
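
Put together, a configure line along these lines might work (a sketch only; 
the riscv64-unknown-linux-gnu- toolchain prefix is an assumption, use 
whatever cross compilers your toolchain actually provides):

./configure --build=x86_64-linux-gnu --host=riscv64-unknown-linux \
    CC=riscv64-unknown-linux-gnu-gcc CXX=riscv64-unknown-linux-gnu-g++ \
    --enable-static --disable-shared --prefix=/home/ubuntu/src/ben-build/openmpi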

As far as Open MPI is concerned, note that you cannot cross-build Fortran 
support from scratch.

Cheers,

Gilles

- Original Message -

I'd like to run Open MPI on a cluster of RISC-V machines.  These 
machines have pretty weak cores, so I need to cross-compile.  I'd like 
to do this:

Machine 1, which is x86_64-linux-gnu, compiles programs for machine 
2.

Machine 2, which is riscv64-unknown-linux, will run these programs.

It seems to me like the correct configure line for this might be:

./configure --host=riscv64-unknown-linux --target=x86_64-linux-
gnu --enable-static --disable-shared --prefix=/home/ubuntu/src/ben-build
/openmpi


However, this yields an error:

configure: WARNING: *** The Open MPI configure script does not 
support --program-prefix, --program-suffix or --program-transform-name. 
Users are recommended to instead use --prefix with a unique directory 
and make symbolic links as desired for renaming.
configure: error: *** Cannot continue


Any tips?  Will it be possible for me to cross-compile this way with 
Open MPI?

Ben



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OMPI users] Compiling Open MPI for Cross-Compilation

2017-12-17 Thread gilles
 Benjamin,



I noticed you build Open MPI with plain gcc.

Is gcc a cross compiler?

If not, you have to tell configure to use the cross compilers (and the cross 
assembler and linker too), for example

configure CC=crosscompiler ...

You might be able to achieve this with standard gcc and the right -march 
flag, for example

configure CFLAGS=-march=riscv64



Cheers,



Gilles

- Original Message -

Here's my config.log.

Recompiling now with those options.

Ben
 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OpenMPI 3.0.1 debug crashes eclipse due to mpirun -display-map bug

2018-04-23 Thread gilles
Thanks for the report,

This is indeed an issue I fixed at https://github.com/open-mpi/ompi/pull/5088



Cheers,

Gilles

- Original Message -
> Hello,
> 
> Because of the error below, Eclipse is not able to run PTP debugger 
with OpenMPI 3.0.1.
> 
> Can someone help me??
> 
> I use CentOS 7. 
> 
> Thanks!
> 
> Erico
> 
> -
> 
> [erico@centos64 ContainerServiceDebug]$ mpirun -mca orte_show_resolved
_nodenames 1 -display-map -np 1 pwd
>  Data for JOB [23564,1] offset 0 Total slots allocated 4
> [centos64:98315] *** Process received signal ***
> [centos64:98315] Signal: Segmentation fault (11)
> [centos64:98315] Signal code: Address not mapped (1)
> [centos64:98315] Failing at address: (nil)
> [centos64:98315] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[
0x7f74537c9100]
> [centos64:98315] [ 1] /usr/local/lib/libopen-rte.so.40(orte_dt_print_
node+0x451)[0x7f7454a6a35f]
> [centos64:98315] [ 2] /usr/local/lib/libopen-pal.so.40(opal_dss_print+
0x68)[0x7f745474d1d5]
> [centos64:98315] [ 3] /usr/local/lib/libopen-rte.so.40(orte_dt_print_
map+0x517)[0x7f7454a6b834]
> [centos64:98315] [ 4] /usr/local/lib/libopen-pal.so.40(opal_dss_print+
0x68)[0x7f745474d1d5]
> [centos64:98315] [ 5] /usr/local/lib/libopen-rte.so.40(orte_rmaps_base
_display_map+0x53b)[0x7f7454aefd0c]
> [centos64:98315] [ 6] /usr/local/lib/libopen-rte.so.40(orte_odls_base_
default_construct_child_list+0x13f7)[0x7f7454acf090]
> [centos64:98315] [ 7] /usr/local/lib/openmpi/mca_odls_default.so(+
0x2c7c)[0x7f744d5f3c7c]
> [centos64:98315] [ 8] /usr/local/lib/libopen-rte.so.40(orte_daemon_
recv+0x6d7)[0x7f7454a9bdb5]
> [centos64:98315] [ 9] /usr/local/lib/libopen-rte.so.40(orte_rml_base_
process_msg+0x2e5)[0x7f7454afbde8]
> [centos64:98315] [10] /usr/local/lib/libopen-pal.so.40(opal_
libevent2022_event_base_loop+0x8fc)[0x7f74547a246c]
> [centos64:98315] [11] mpirun[0x4016f7]
> [centos64:98315] [12] mpirun[0x4010e0]
> [centos64:98315] [13] /usr/lib64/libc.so.6(__libc_start_main+0xf5)[
0x7f7453419b15]
> [centos64:98315] [14] mpirun[0x400ff9]
> [centos64:98315] *** End of error message ***
> Segmentation fault (core dumped)
> [erico@centos64 ContainerServiceDebug]$ 
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
> 


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] openmpi/slurm/pmix

2018-04-24 Thread gilles
Charles,

have you tried configuring with --with-pmix-libdir=/.../lib64 ?
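
For example (a sketch based on the paths quoted below; adjust to your 
actual PMIx installation):

./configure ... --with-pmix=/pmix-2.1.0 --with-pmix-libdir=/pmix-2.1.0/lib64 ...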

Cheers,

Gilles

- Original Message -
> I’ll add that when building OpenMPI 3.0.0 with an external PMIx, I 
found that the OpenMPI configure script only looks in “lib” for the 
pmix library, but the pmix configure/build uses “lib64” (as it should on 
a 64-bit system), so the configure script falls back to the internal PMIx.
  As Robert suggested, check your config.log for “not found” messages.  
> 
> In my case, I simply added a “lib -> lib64” symlink in the PMIx 
installation directory rather than alter the configure script and that 
did the trick.
> 
> Good luck,
> 
> Charlie
> 
> > On Apr 23, 2018, at 6:07 PM, r...@open-mpi.org wrote:
> > 
> > Hi Michael
> > 
> > Looks like the problem is that you didn’t wind up with the external 
PMIx. The component listed in your error is the internal PMIx one which 
shouldn’t have built given that configure line.
> > 
> > Check your config.out and see what happened. Also, ensure that your 
LD_LIBRARY_PATH is properly pointing to the installation, and that you 
built into a “clean” prefix.
> > 
> > 
> >> On Apr 23, 2018, at 12:01 PM, Michael Di Domenico  wrote:
> >> 
> >> i'm trying to get slurm 17.11.5 and openmpi 3.0.1 working with pmix.
> >> 
> >> everything compiled, but when i run something it get
> >> 
> >> : symbol lookup error: /openmpi/mca_pmix_pmix2x.so: undefined 
symbol:
> >> opal_libevent2022_evthread_use_pthreads
> >> 
> >> i more then sure i did something wrong, but i'm not sure what, here
's what i did
> >> 
> >> compile libevent 2.1.8
> >> 
> >> ./configure --prefix=/libevent-2.1.8
> >> 
> >> compile pmix 2.1.0
> >> 
> >> ./configure --prefix=/pmix-2.1.0 --with-psm2
> >> --with-munge=/munge-0.5.13 --with-libevent=/libevent-2.1.8
> >> 
> >> compile openmpi
> >> 
> >> ./configure --prefix=/openmpi-3.0.1 --with-slurm=/slurm-17.11.5
> >> --with-hwloc=external --with-mxm=/opt/mellanox/mxm
> >> --with-cuda=/usr/local/cuda --with-pmix=/pmix-2.1.0
> >> --with-libevent=/libevent-2.1.8
> >> 
> >> when i look at the symbols in the mca_pmix_pmix2x.so library the
> >> function is indeed undefined (U) in the output, but checking ldd
> >> against the library doesn't show any missing
> >> 
> >> any thoughts?
> >> ___
> >> users mailing list
> >> users@lists.open-mpi.org
> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.open-2Dmpi.org_mailman_listinfo_users&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=HOtXciFqK5GlgIgLAxthUQ&m=XE6hInyZVJ5VMrO5vdTEKEw3pZBBVnLE7U8Nm67zj2M&s=_sgJVrkRzlv7dIYMvtMfj26AJdbH-fcOOarmN7PyJCI&e=

> > 
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://urldefense.proofpoint.com/v2/url?u=https-3A__lists.open-2Dmpi.org_mailman_listinfo_users&d=DwIGaQ&c=pZJPUDQ3SB9JplYbifm4nt2lEVG5pWx2KikqINpWlZM&r=HOtXciFqK5GlgIgLAxthUQ&m=XE6hInyZVJ5VMrO5vdTEKEw3pZBBVnLE7U8Nm67zj2M&s=_sgJVrkRzlv7dIYMvtMfj26AJdbH-fcOOarmN7PyJCI&e=

> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Error in file base/plm_base_launch_support.c: OPAL_HWLOC_TOPO

2018-07-24 Thread gilles
 Henry,

First, you could/should use mpicc instead of the Cray cc compiler.


I also noted


gfortran Linker Flags   : -pthread -I/global/homes/h/hlovelac/BMAD/bmad_
dist_2018_0724/production/lib -Wl,-rpath -Wl,/global/homes/h/hlovelac/
BMAD/bmad_dist_2018_0724/production/lib -Wl,--enable-new-dtags -L/global
/homes/h/hlovelac/BMAD/bmad_dist_2018_0724/production/lib -lmpi_
usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi;-lX11


It should be '-lmpi -lX11' instead of '-lmpi;-lX11'


The linker flags suggest Open MPI is installed in /global/homes/h/
hlovelac/BMAD/bmad_dist_2018_0724/production/lib, but your LD_LIBRARY_
PATH suggests it is in
$HOME/BMAD/bmad_dist_2018_0717/production/lib

(note 0724 vs 0717)

Also, keep in mind LD_LIBRARY_PATH is only used at runtime in order to 
resolve dependencies.

The linker does *not* use LD_LIBRARY_PATH.

IIRC, it uses LIBRARY_PATH, but the preferred way is to use the -L 
argument.

If your problem persists, I suggest you get the full command line that 
is failing.

(It should invoke mpifort instead of gfortran or cc). Then you can copy/
paste the mpifort command, add the

-showme parameter, and run it manually so we can understand what is 
really happening under the (cmake) hood.
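
For example (a sketch; myprog.o stands for whatever objects and flags appear 
in the real failing command):

mpifort -showme -g -o myprog myprog.o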

Cheers,

Gilles

- Original Message -

Hi,

   I am receiving these errors when building with OpenMPI on the 
NERSC system.
Building directory: util_programs

-- The C compiler identification is GNU 7.1.0
-- The CXX compiler identification is GNU 7.1.0
-- Cray Programming Environment 2.5.12 C
-- Check for working C compiler: /opt/cray/pe/craype/2.5.12/bin/cc
-- Check for working C compiler: /opt/cray/pe/craype/2.5.12/bin/cc -
- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Cray Programming Environment 2.5.12 CXX
-- Check for working CXX compiler: /opt/cray/pe/craype/2.5.12/bin/CC
-- Check for working CXX compiler: /opt/cray/pe/craype/2.5.12/bin/CC 
-- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- The Fortran compiler identification is GNU 7.1.0
-- Check for working Fortran compiler: /global/homes/h/hlovelac/BMAD
/bmad_dist_2018_0724/production/bin/mpifort
-- Check for working Fortran compiler: /global/homes/h/hlovelac/BMAD
/bmad_dist_2018_0724/production/bin/mpifort  -- works
-- Detecting Fortran compiler ABI info
-- Detecting Fortran compiler ABI info - done
-- Checking whether /global/homes/h/hlovelac/BMAD/bmad_dist_2018_
0724/production/bin/mpifort supports Fortran 90
-- Checking whether /global/homes/h/hlovelac/BMAD/bmad_dist_2018_
0724/production/bin/mpifort supports Fortran 90 -- yes

Build type   : Production
Linking with release : /global/homes/h/hlovelac/BMAD/bmad_dist_2018_
0724 (Off-site Distribution)
C Compiler   : /opt/cray/pe/craype/2.5.12/bin/cc
Fortran Compiler : /global/homes/h/hlovelac/BMAD/bmad_dist_2018_
0724/production/bin/mpifort
Plotting Libraries   : pgplot
OpenMP Support   : Not Enabled
MPI Support  : Enabled
FFLAGS   :  
gfortran Compiler Flags : -Df2cFortran -DCESR_UNIX -DCESR_LINUX -u -
traceback -cpp -fno-range-check -fdollar-ok -fbacktrace -Bstatic -ffree-
line-length-none -DCESR_PGPLOT -I/global/homes/h/hlovelac/BMAD/bmad_dist
_2018_0724/production/include -pthread -I/global/homes/h/hlovelac/BMAD/
bmad_dist_2018_0724/production/lib -fPIC -O2
gfortran Linker Flags   : -pthread -I/global/homes/h/hlovelac/BMAD/
bmad_dist_2018_0724/production/lib -Wl,-rpath -Wl,/global/homes/h/
hlovelac/BMAD/bmad_dist_2018_0724/production/lib -Wl,--enable-new-dtags 
-L/global/homes/h/hlovelac/BMAD/bmad_dist_2018_0724/production/lib -lmpi
_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi;-lX11

SHARED DEPS  :

-- Configuring done
-- Generating done
-- Build files have been written to: /global/homes/h/hlovelac/BMAD/
bmad_dist_2018_0724/util_programs/production
Scanning dependencies of target compare_tracking_methods_text-exe
Scanning dependencies of target compare_tracking_methods_plot-exe
Scanning dependencies of target f77_to_f90-exe
Scanning dependencies of target util_programs
Scanning dependencies of target lattice_cleaner-exe
Scanning dependencies of target bmad_to_gpt-exe
Scanning dependencies of target bmad_to_mad_sad_and_xsif-exe
Scanning dependencies of target sad_to_bmad_postprocess-exe
Scanning dependencies of target aspea2-exe
Scanning dependencies of target bmad_to_csrtrack-exe
Scanning dependencies of target ansga2-exe
Scanning dependencies of target bmad_to_blender-exe
Scanning dependencies of target bmad_to_autocad-exe
Scanning dependencies of target el

Re: [OMPI users] Memory Leak in 3.1.2 + UCX

2018-10-06 Thread gilles
Charles,

UCX has a higher priority than ob1, which is why it is used by default 
when available.
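
For comparison runs, the selection can be forced either way on the mpirun 
command line (as already done elsewhere in this thread), e.g.:

mpirun --mca pml ob1 ...
mpirun --mca pml ucx ...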


If you can provide simple instructions on how to build and test one of 
the apps that exhibit
a memory leak, that would greatly help us and the UCX folks reproduce, 
troubleshoot, and diagnose this issue.


Cheers,

Gilles

- Original Message -
> 
> > On Oct 5, 2018, at 11:31 AM, Gilles Gouaillardet  wrote:
> > 
> > are you saying that even if you
> > 
> > mpirun --mca pml ob1 ...
> > 
> > (e.g. force the ob1 component of the pml framework) the memory leak 
is
> > still present ?
> 
> No, I do not mean to say that - at least not in the current 
incarnation.  Running with the following parameters avoids the leak…
> 
> export OMPI_MCA_pml="ob1"
> export OMPI_MCA_btl_openib_eager_limit=1048576
> export OMPI_MCA_btl_openib_max_send_size=1048576
> 
> as does building OpenMPI without UCX support (i.e. —without-ucx).   
> 
> However, building _with_ UCX support (including the current github 
source) and running with the following parameters produces
> the leak (note that no PML was explicitly requested).  
> 
>export OMPI_MCA_oob_tcp_listen_mode="listen_thread"
>export OMPI_MCA_btl_openib_eager_limit=1048576
>export OMPI_MCA_btl_openib_max_send_size=1048576
>export OMPI_MCA_btl="self,vader,openib"
> 
> The eager_limit and send_size limits are needed with this app to 
prevent a deadlock that I’ve posted about previously. 
> 
> Also, explicitly requesting the UCX PML with,
> 
>  export OMPI_MCA_pml="ucx"
> 
> produces the leak.
> 
> I’m continuing to try to find exactly what I’m doing wrong to produce 
this behavior but have been unable to arrive at 
> a solution other than excluding UCX which seems like a bad idea since 
Jeff (Squyres) pointed out that it is the
> Mellanox-recommended way to run on Mellanox hardware.  Interestingly, 
using the UCX PML framework avoids
> the deadlock that results when running with the default parameters and 
not limiting the message sizes - another
> reason we’d like to be able to use it.
> 
> I can read your mind at this point - “Wow, these guys have really 
horked their cluster”.  Could be.   But we run
> thousands of jobs every day including many other OpenMPI jobs (vasp, 
gromacs, raxml, lammps, namd, etc).
> Also the users of the Arepo and Gadget code are currently running with 
MVAPICH2 without issue.  I installed
> it specifically to get them past these OpenMPI problems.  We don’t 
normally build anything with MPICH/MVAPICH/IMPI
> since we have never had any real reason to - until now.
> 
> That may have to be the solution but the memory leak is so readily 
reproducible that I thought I’d ask about it.
> Since it appears that others are not seeing this issue, I’ll continue 
to try to figure it out and if I do, I’ll be sure to post back.
> 
> > As a side note, we strongly recommend to avoid
> > configure --with-FOO=/usr
> > instead
> > configure --with-FOO
> > should be used (otherwise you will end up with -I/usr/include
> > -L/usr/lib64 and that could silently hide third party libraries
> > installed in a non standard directory). If --with-FOO fails for you,
> > then this is a bug we will appreciate you report.
> 
> Noted and logged.  We’ve been using the --with-FOO=/usr for a long time 
(since 1.x days).  There was a reason we started doing
> it but I’ve long since forgotten what it was but I think it was to _
avoid_ what you describe - not cause it.  Regardless,
> I’ll heed your warning and remove it from future builds and file a bug 
if there are any problems.
> 
> However, I did post of a similar problem previously in when 
configuring against an external PMIx library.  The configure
> script produces (or did) a "-L/usr/lib" instead of a "-L/usr/lib64" 
resulting in unresolved PMIx routines when linking.
> That was with OpenMPI 2.1.2.  We now include a lib -> lib64 symlink in 
our /opt/pmix/x.y.z directories so I haven’t looked to 
> see if that was fixed for 3.x or not.
> 
> I should have also mentioned in my previous post that HPC_CUDA_DIR=NO 
meaning that CUDA support has
> been excluded from these builds (in case anyone was wondering).
> 
> Thanks for the feedback,
> 
> Charlie
> 
> > 
> > Cheers,
> > 
> > Gilles
> > On Fri, Oct 5, 2018 at 6:42 AM Charles A Taylor  
wrote:
> >> 
> >> 
> >> We are seeing a gaping memory leak when running OpenMPI 3.1.x (or 2.
1.2, for that matter) built with UCX support.   The leak shows up
> >> whether the “ucx” PML is specified for the run or not.  The 
applications in question are arepo and gizmo but it I have no reason to

Re: [OMPI users] Building OpenMPI with Lustre support using PGI fails

2018-11-13 Thread gilles
Raymond,

Can you please compress and post your config.log?


Cheers,

Gilles

- Original Message -
> I am trying  to build OpenMPI with Lustre support using PGI 18.7 on 
> CentOS 7.5 (1804).
> 
> It builds successfully with Intel compilers, but fails to find the 
> necessary  Lustre components with the PGI compiler.
> 
> I have tried building  OpenMPI 4.0.0, 3.1.3 and 2.1.5.   I can build 
> OpenMPI, but configure does not find the proper Lustre files.
> 
> Lustre is installed from current client RPMS, version 2.10.5
> 
> Include files are in /usr/include/lustre
> 
> When specifying --with-lustre, I get:
> 
> --- MCA component fs:lustre (m4 configuration macro)
> checking for MCA component fs:lustre compile mode... dso
> checking --with-lustre value... simple ok (unspecified value)
> looking for header without includes
> checking lustre/lustreapi.h usability... yes
> checking lustre/lustreapi.h presence... yes
> checking for lustre/lustreapi.h... yes
> checking for library containing llapi_file_create... -llustreapi
> checking if liblustreapi requires libnl v1 or v3...
> checking for required lustre data structures... no
> configure: error: Lustre support requested but not found. Aborting
> 
> 
> -- 
>   
>   Ray Muno
>   IT Manager
>   
> 
>University of Minnesota
>   Aerospace Engineering and Mechanics Mechanical Engineering
>   110 Union St. S.E.  111 Church Street SE
>   Minneapolis, MN 55455   Minneapolis, MN 55455
> 
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Open MPI installation problem

2019-01-25 Thread gilles
Serdar,



You need to export PATH and LD_LIBRARY_PATH in your .bashrc.



(e.g. export PATH=$HOME/openmpi/bin:$PATH)



Also, make sure you built your application with Open MPI installed in $
HOME/openmpi
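
For example, the full pair of settings could look like this (assuming Open 
MPI was installed into $HOME/openmpi, as in your configure line):

export PATH=$HOME/openmpi/bin:$PATH
export LD_LIBRARY_PATH=$HOME/openmpi/lib:$LD_LIBRARY_PATH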





Cheers,



Gilles



- Original Message -

Hi folks,

After installing OpenMPI, I executed these lines

echo 'PATH=$HOME/openmpi/bin:$PATH' >> ~/.bashrc
echo 'LD_LIBRARY_PATH=$HOME/openmpi/' >> ~/.bashrc
source .bashrc

and ran a simple program by using

mpirun -np 1 helloworld

The error message is

helloworld: error while loading shared libraries: libmpi_cxx.so.1: 
cannot open shared object file: No such file or directory

--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.

--
 It is about inaccurate linking of the libraries, but I could not fix it. 
When I run ldd helloworld, this appears:

linux-vdso.so.1 (0x7fff8f2d2000)
libmpi_cxx.so.1 => not found
libmpi.so.1 => not found
libm.so.6 => /lib64/libm.so.6 (0x7fbc21e55000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x7fbc21acb000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x7fbc218b3000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7fbc21696000)
libc.so.6 => /lib64/libc.so.6 (0x7fbc212f1000)
/lib64/ld-linux-x86-64.so.2 (0x7fbc22152000)
  
Do you have any idea how to fix it?

Best, 
Serdar


On Wed, 23 Jan 2019 at 16:26, Serdar Hiçdurmaz 
wrote:
Thanks Ralph. It worked. 

Serdar

On Wed, 23 Jan 2019 at 15:48, Ralph H Castain 
wrote:
Your PATH and LD_LIBRARY_PATH setting is incorrect. You installed OMPI 
into $HOME/openmpi, so you should have done:

PATH=$HOME/openmpi/bin:$PATH
LD_LIBRARY_PATH=$HOME/openmpi/lib:$LD_LIBRARY_PATH

Ralph


On Jan 23, 2019, at 6:36 AM, Serdar Hiçdurmaz  wrote:

Hi All,

I am trying to install Open MPI, which is a prerequisite for LIGGGHTS (DEM 
software). Some info about my current Linux version:

NAME="SLED"
VERSION="12-SP3"
VERSION_ID="12.3"
PRETTY_NAME="SUSE Linux Enterprise Desktop 12 SP3"
ID="sled"

I installed Open MPI 1.6 by typing

./configure --prefix=$HOME/openmpi
make all
make install

Here, it is discussed that openmpi 1.6 is compatible with OpenSuse 12.3 
https://public.kitware.com/pipermail/paraview/2014-February/030487.html 
https://build.opensuse.org/package/show/openSUSE:12.3/openmpi

To add OpenMPI to my path and LD_LIBRARY_PATH, I execute the following 
commands in the terminal:

export PATH=$PATH:/usr/lib64/mpi/gcc/openmpi/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64/mpi/gcc/openmpi/lib64

Then, in /liggghts/src directory, I execute make auto, this appears :

Creating list of contact models completed.
make[1]: Entering directory '/home/serdarhd/liggghts/LIGGGHTS-PUBLIC/src
/Obj_auto'
Makefile:456: *** 'Could not compile a simple MPI example. Test was done 
with MPI_INC="" and MPICXX="mpicxx"'. Stop.
make[1]: Leaving directory '/home/serdarhd/liggghts/LIGGGHTS-PUBLIC/src/
Obj_auto'
Makefile:106: recipe for target 'auto' failed
make: *** [auto] Error 2

Do you have any idea what the problem is here? I went through the 
"makefile", but it looks quite complicated to a Linux beginner like me.

Thanks in advance. Regards,

Serdar



___
users mailing list
users@lists.open-mpi.org 
https://lists.open-mpi.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org 
https://lists.open-mpi.org/mailman/listinfo/users
 
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-24 Thread Gilles Gouaillardet
Siegmar,

How did you configure Open MPI? Which Java version did you use?

I just found a regression, and you currently have to explicitly add
CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT
to your configure command line.
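
For example (a sketch; keep whatever other options you already pass, the 
prefix below is only taken from your install paths):

./configure --prefix=/usr/local/openmpi-1.9.0_64_gcc \
    CFLAGS=-D_REENTRANT CPPFLAGS=-D_REENTRANT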

If you want to debug this issue (I cannot reproduce it on a Solaris 11
x86 virtual machine),
you can apply the attached patch, make sure you configure with
--enable-debug, and run

OMPI_ATTACH=1 mpiexec -n 1 java InitFinalizeMain

Then you will need to attach gdb to the *java* process, set the _dbg
local variable to zero, and continue.
You should get a clean stack trace, and hopefully we will be able to help.

Cheers,

Gilles

On 2014/10/24 0:03, Siegmar Gross wrote:
> Hello Oscar,
>
> do you have time to look into my problem? Probably Takahiro has a
> point and gdb behaves differently on Solaris and Linux, so that
> the differing outputs have no meaning. I tried to debug my Java
> program, but without success so far, because I wasn't able to get
> into the Java program to set a breakpoint or to see the code. Have
> you succeeded to debug a mpiJava program? If so, how must I call
> gdb (I normally use "gdb mipexec" and then "run -np 1 java ...")?
> What can I do to get helpful information to track the error down?
> I have attached the error log file. Perhaps you can see if something
> is going wrong with the Java interface. Thank you very much for your
> help and any hints for the usage of gdb with mpiJava in advance.
> Please let me know if I can provide anything else.
>
>
> Kind regards
>
> Siegmar
>
>
>>> I think that it must have to do with MPI, because everything
>>> works fine on Linux and my Java program works fine with an older
>>> MPI version (openmpi-1.8.2a1r31804) as well.
>> Yes. I also think it must have to do with MPI.
>> But java process side, not mpiexec process side.
>>
>> When you run Java MPI program via mpiexec, a mpiexec process
>> process launch a java process. When the java process (your
>> Java program) calls a MPI method, native part (written in C/C++)
>> of the MPI library is called. It runs in java process, not in
>> mpiexec process. I suspect that part.
>>
>>> On Solaris things are different.
>> Are you saying the following difference?
>> After this line,
>>> 881 ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_INIT);
>> Linux shows
>>> orte_job_state_to_str (state=1)
>>> at ../../openmpi-dev-124-g91e9686/orte/util/error_strings.c:217
>>> 217 switch(state) {
>> but Solaris shows
>>> orte_util_print_name_args (name=0x100118380 )
>>> at ../../openmpi-dev-124-g91e9686/orte/util/name_fns.c:122
>>> 122 if (NULL == name) {
>> Each macro is defined as:
>>
>> #define ORTE_ACTIVATE_JOB_STATE(j, s)   \
>> do {\
>> orte_job_t *shadow=(j); \
>> opal_output_verbose(1, orte_state_base_framework.framework_output, \
>> "%s ACTIVATE JOB %s STATE %s AT %s:%d",  \
>> ORTE_NAME_PRINT(ORTE_PROC_MY_NAME), \
>> (NULL == shadow) ? "NULL" : \
>> ORTE_JOBID_PRINT(shadow->jobid), \
>> orte_job_state_to_str((s)), \
>> __FILE__, __LINE__); \
>> orte_state.activate_job_state(shadow, (s)); \
>> } while(0);
>>
>> #define ORTE_NAME_PRINT(n) \
>> orte_util_print_name_args(n)
>>
>> #define ORTE_JOBID_PRINT(n) \
>> orte_util_print_jobids(n)
>>
>> I'm not sure, but I think the gdb on Solaris steps into
>> orte_util_print_name_args, but gdb on Linux doesn't step into
>> orte_util_print_name_args and orte_util_print_jobids for some
>> reason, or orte_job_state_to_str is evaluated before them.
>>
>> So I think it's not an important difference.
>>
>> You showed the following lines.
>>>>> orterun (argc=5, argv=0x7fffe0d8)
>>>>> at 
> ../../../../openmpi-dev-124-g91e9686/orte/tools/orterun/orterun.c:1084
>>>>> 1084while (orte_event_base_active) {
>>>>> (gdb) 
>>>>> 1085opal_event_loop(orte_event_base, OPAL_EVLOOP_ONCE);
>>>>> (gdb) 
>> I'm not familiar with this code but I think this part (in mpiexec
>> process) is only waiting the java proce

Re: [OMPI users] OMPI users] low CPU utilization with OpenMPI

2014-10-24 Thread Gilles Gouaillardet
Can you also check there is no CPU binding issue (several MPI tasks and/or 
OpenMP threads, if any, bound to the same core and doing time sharing)?
A simple way to check that is to log into a compute node, run top, and then 
press 1 f j.
If some cores have higher usage than others, you are likely doing time sharing.
Another option is to disable CPU binding (Open MPI and OpenMP, if any) and see if 
things get better
(this is suboptimal but still better than time sharing).
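
For example (a sketch using Open MPI 1.8 syntax; ./your_app is a placeholder, 
and OMP_PROC_BIND only matters if the application uses OpenMP threads):

export OMP_PROC_BIND=false
mpirun --bind-to none -np 8 ./your_app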

"Jeff Squyres (jsquyres)"  wrote:
>- Is /tmp on that machine on NFS or local?
>
>- Have you looked at the text of the help message that came out before the "9 
>more processes have sent help message help-opal-shmem-mmap.txt / mmap on nfs" 
>message?  It should contain details about what the problematic NFS directory 
>is.
>
>- Do you know that it's MPI that is causing this low CPU utilization?
>
>- You mentioned other MPI implementations; have you tested with them to see if 
>they get better CPU utilization?
>
>- What happens if you run this application on a single machine, with no 
>network messaging?
>
>- Do you know what specifically in your application is slow?  I.e., have you 
>done any instrumentation to see what steps / API calls are running slowly, and 
>then tried to figure out why?
>
>- Do you have blocking message patterns that might operate well in shared 
>memory, but expose the inefficiencies of its algorithms/design when it moves 
>to higher-latency transports?
>
>- How long does your application run for?
>
>I ask these questions because MPI applications tend to be quite complicated. 
>Sometimes it's the application itself that is the cause of slowdown / 
>inefficiencies.
>
>
>
>On Oct 23, 2014, at 9:29 PM, Vinson Leung  wrote:
>
>> Later I change another machine and set the TMPDIR to default /tmp, but the 
>> problem (low CPU utilization under 20%) still occur :<
>> 
>> Vincent
>> 
>> On Thu, Oct 23, 2014 at 10:38 PM, Jeff Squyres (jsquyres) 
>>  wrote:
>> If normal users can't write to /tmp (or if /tmp is an NFS-mounted 
>> filesystem), that's the underlying problem.
>> 
>> @Vinson -- you should probably try to get that fixed.
>> 
>> 
>> 
>> On Oct 23, 2014, at 10:35 AM, Joshua Ladd  wrote:
>> 
>> > It's not coming from OSHMEM but from the OPAL "shmem" framework. You are 
>> > going to get terrible performance - possibly slowing to a crawl having all 
>> > processes open their backing files for mmap on NSF. I think that's the 
>> > error that he's getting.
>> >
>> >
>> > Josh
>> >
>> > On Thu, Oct 23, 2014 at 6:06 AM, Vinson Leung  
>> > wrote:
>> > HI, Thanks for your reply:)
>> > I really run an MPI program (compile with OpenMPI and run with "mpirun -n 
>> > 8 .."). My OpenMPI version is 1.8.3 and my program is Gromacs. BTW, 
>> > what is OSHMEM ?
>> >
>> > Best
>> > Vincent
>> >
>> > On Thu, Oct 23, 2014 at 12:21 PM, Ralph Castain  wrote:
>> > From your error message, I gather you are not running an MPI program, but 
>> > rather an OSHMEM one? Otherwise, I find the message strange as it only 
>> > would be emitted from an OSHMEM program.
>> >
>> > What version of OMPI are you trying to use?
>> >
>> >> On Oct 22, 2014, at 7:12 PM, Vinson Leung  wrote:
>> >>
>> >> Thanks for your reply:)
>> >> Follow your advice I tried to set the TMPDIR to /var/tmp and /dev/shm and 
>> >> even reset to /tmp (I get the system permission), the problem still occur 
>> >> (CPU utilization still lower than 20%). I have no idea why and ready to 
>> >> give up OpenMPI instead of using other MPI library.
>> >>
>> >> Old Message-
>> >>
>> >> Date: Tue, 21 Oct 2014 22:21:31 -0400
>> >> From: Brock Palen 
>> >> To: Open MPI Users 
>> >> Subject: Re: [OMPI users] low CPU utilization with OpenMPI
>> >> Message-ID: 
>> >> Content-Type: text/plain; charset=us-ascii
>> >>
>> >> Doing special files on NFS can be weird,  try the other /tmp/ locations:
>> >>
>> >> /var/tmp/
>> >> /dev/shm  (ram disk careful!)
>> >>
>> >> Brock Palen
>> >> www.umich.edu/~brockp
>> >> CAEN Advanced Computing
>> >> XSEDE Campus Champion
>> >> bro...@umich.edu
>> >> (734)936-1985
>> >>
>> >>
>> >>
>> >> > On Oct 21, 2014, at 10:18 PM, Vinson Leung  
>> >> > wrote:
>> >> >
>> >> > Because of permission reason (OpenMPI can not write temporary file to 
>> >> > the default /tmp directory), I change the TMPDIR to my local directory 
>> >> > (export TMPDIR=/home/user/tmp ) and then the MPI program can run. But 
>> >> > the CPU utilization is very low under 20% (8 MPI rank running in Intel 
>> >> > Xeon 8-core CPU).
>> >> >
>> >> > And I also got some message when I run with OpenMPI:
>> >> > [cn3:28072] 9 more processes have sent help message 
>> >> > help-opal-shmem-mmap.txt / mmap on nfs
>> >> > [cn3:28072] Set MCA parameter "orte_base_help_aggregate" to 0 to see 
>> >> > all help / error messages
>> >> >
>> >> > Any idea?
>> >> > Thanks
>> >> >
>> >> > VIncent
>
>
>-- 
>Jeff Squyres
>jsquy...@cisco.com
>For corporate legal information go to: 
>http://www.cisco.com

Re: [OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-25 Thread Gilles Gouaillardet
Hi Siegmar,

You might need to configure with --enable-debug and add -g -O0 to your CFLAGS 
and LDFLAGS.

Then, once you attach with gdb, you have to find the thread that is polling:
thread 1
bt
thread 2
bt
and so on until you find the right thread.
If _dbg is a local variable, you need to select the right frame before you can 
change the value:
get the frame number from bt (generally 1 under Linux)
f <frame number>
set _dbg=0

I hope this helps

Gilles


Siegmar Gross  wrote:
>Hi Gilles,
>
>I changed _dbg to a static variable, so that it is visible in the
>library, but unfortunately still not in the symbol table.
>
>
>tyr java 419 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so | grep -i 
>_dbg
>[271]   |  1249644| 4|OBJT |LOCL |0|18 |_dbg.14258
>tyr java 420 /usr/local/gdb-7.6.1_64_gcc/bin/gdb
>GNU gdb (GDB) 7.6.1
>...
>(gdb) attach 13019
>Attaching to process 13019
>[New process 13019]
>Retry #1:
>Retry #2:
>Retry #3:
>Retry #4:
>0x7eadcb04 in ?? ()
>(gdb) symbol-file /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so
>Reading symbols from 
>/export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so.0.0.0...done.
>(gdb) set var _dbg.14258=0
>No symbol "_dbg" in current context.
>(gdb) 
>
>
>Kind regards
>
>Siegmar
>
>
>
>
>> unfortunately I didn't get anything useful. It's probably my fault,
>> because I'm still not very familiar with gdb or any other debugger.
>> I did the following things.
>> 
>> 
>> 1st window:
>> ---
>> 
>> tyr java 174 setenv OMPI_ATTACH 1
>> tyr java 175 mpijavac InitFinalizeMain.java 
>> warning: [path] bad path element
>>   "/usr/local/openmpi-1.9.0_64_gcc/lib64/shmem.jar":
>>   no such file or directory
>> 1 warning
>> tyr java 176 mpiexec -np 1 java InitFinalizeMain
>> 
>> 
>> 
>> 2nd window:
>> ---
>> 
>> tyr java 379 ps -aef | grep java
>> noaccess  1345 1   0   May 22 ? 113:23 /usr/java/bin/java 
>> -server -Xmx128m -XX:+UseParallelGC 
>-XX:ParallelGCThreads=4 
>>   fd1026  3661 10753   0 14:09:12 pts/14  0:00 mpiexec -np 1 java 
>> InitFinalizeMain
>>   fd1026  3677 13371   0 14:16:55 pts/2   0:00 grep java
>>   fd1026  3663  3661   0 14:09:12 pts/14  0:01 java -cp 
>/home/fd1026/work/skripte/master/parallel/prog/mpi/java:/usr/local/jun
>> tyr java 380 /usr/local/gdb-7.6.1_64_gcc/bin/gdb
>> GNU gdb (GDB) 7.6.1
>> ...
>> (gdb) attach 3663
>> Attaching to process 3663
>> [New process 3663]
>> Retry #1:
>> Retry #2:
>> Retry #3:
>> Retry #4:
>> 0x7eadcb04 in ?? ()
>> (gdb) symbol-file /usr/local/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so
>> Reading symbols from 
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi_java.so.0.0.0...done.
>> (gdb) set var _dbg=0
>> No symbol "_dbg" in current context.
>> (gdb) set var JNI_OnLoad::_dbg=0
>> No symbol "_dbg" in specified context.
>> (gdb) set JNI_OnLoad::_dbg=0
>> No symbol "_dbg" in specified context.
>> (gdb) info threads
>> [New LWP 12]
>> [New LWP 11]
>> [New LWP 10]
>> [New LWP 9]
>> [New LWP 8]
>> [New LWP 7]
>> [New LWP 6]
>> [New LWP 5]
>> [New LWP 4]
>> [New LWP 3]
>> [New LWP 2]
>>   Id   Target Id Frame 
>>   12   LWP 2 0x7eadc6b0 in ?? ()
>>   11   LWP 3 0x7eadcbb8 in ?? ()
>>   10   LWP 4 0x7eadcbb8 in ?? ()
>>   9LWP 5 0x7eadcbb8 in ?? ()
>>   8LWP 6 0x7eadcbb8 in ?? ()
>>   7LWP 7 0x7eadcbb8 in ?? ()
>>   6LWP 8 0x7ead8b0c in ?? ()
>>   5LWP 9 0x7eadcbb8 in ?? ()
>>   4LWP 100x7eadcbb8 in ?? ()
>>   3LWP 110x7eadcbb8 in ?? ()
>>   2LWP 120x7eadcbb8 in ?? ()
>> * 1LWP 1 0x7eadcb04 in ?? ()
>> (gdb) 
>> 
>> 
>> 
>> It seems that "_dbg" is unknown and unavailable.
>> 
>> tyr java 399 grep _dbg 
>> /export2/src/openmpi-1.9/openmpi-dev-124-g91e9686/ompi/mpi/java/c/*
>> /export2/src/openmpi-1.9/openmpi-dev-124-g91e9686/ompi/mpi/java/c/mpi_MPI.c: 
>>volatile int _dbg = 1;
>> /export2/src/openmpi-1.9/openmpi-dev-124-g91e9686/ompi/mpi/java/c/mpi_MPI.c: 
>>while (_dbg) poll(NULL, 0, 1);
>> tyr java 400 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i _dbg
>> tyr java 401 nm /usr/local/openmpi-1.9.0_64_gcc/lib64/*.so | grep -i 
>> JNI_OnLoad
>> [

Re: [OMPI users] OMPI users] OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-26 Thread Gilles Gouaillardet
It looks like we faced a similar issue:
opal_process_name_t is 64-bit aligned, whereas orte_process_name_t is 32-bit 
aligned. If you run on an alignment-sensitive CPU such as SPARC and you are not 
lucky (so to speak), you can run into this issue.
I will make a patch for this shortly.

Ralph Castain  wrote:
>Afraid this must be something about the Sparc - just ran on a Solaris 11 x86 
>box and everything works fine.
>
>
>> On Oct 26, 2014, at 8:22 AM, Siegmar Gross 
>>  wrote:
>> 
>> Hi Gilles,
>> 
>> I wanted to explore which function is called, when I call MPI_Init
>> in a C program, because this function should be called from a Java
>> program as well. Unfortunately C programs break with a Bus Error
>> once more for openmpi-dev-124-g91e9686 on Solaris. I assume that's
>> the reason why I get no useful backtrace for my Java program.
>> 
>> tyr small_prog 117 mpicc -o init_finalize init_finalize.c
>> tyr small_prog 118 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>> ...
>> (gdb) run -np 1 init_finalize
>> Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 1 
>> init_finalize
>> [Thread debugging using libthread_db enabled]
>> [New Thread 1 (LWP 1)]
>> [New LWP2]
>> [tyr:19240] *** Process received signal ***
>> [tyr:19240] Signal: Bus Error (10)
>> [tyr:19240] Signal code: Invalid address alignment (1)
>> [tyr:19240] Failing at address: 7bd1c10c
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x2c
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:0xdcc04
>> /lib/sparcv9/libc.so.1:0xd8b98
>> /lib/sparcv9/libc.so.1:0xcc70c
>> /lib/sparcv9/libc.so.1:0xcc918
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_proc_set_name+0x1c
>>  [ Signal 10 (BUS)]
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/openmpi/mca_pmix_native.so:0x103e8
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/openmpi/mca_ess_pmi.so:0x33dc
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-rte.so.0.0.0:orte_init+0x67c
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi.so.0.0.0:ompi_mpi_init+0x374
>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi.so.0.0.0:PMPI_Init+0x2a8
>> /home/fd1026/work/skripte/master/parallel/prog/mpi/small_prog/init_finalize:main+0x20
>> /home/fd1026/work/skripte/master/parallel/prog/mpi/small_prog/init_finalize:_start+0x7c
>> [tyr:19240] *** End of error message ***
>> --
>> mpiexec noticed that process rank 0 with PID 0 on node tyr exited on signal 
>> 10 (Bus Error).
>> --
>> [LWP2 exited]
>> [New Thread 2]
>> [Switching to Thread 1 (LWP 1)]
>> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to 
>> satisfy query
>> (gdb) bt
>> #0  0x7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
>> #1  0x7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
>> #2  0x7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
>> #3  0x7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
>> #4  0x7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
>> #5  0x7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
>> #6  0x7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
>> #7  0x7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
>> #8  0x7ec87f60 in vm_close (loader_data=0x0, 
>> module=0x7c901fe0)
>>at ../../../openmpi-dev-124-g91e9686/opal/libltdl/loaders/dlopen.c:212
>> #9  0x7ec85534 in lt_dlclose (handle=0x100189b50)
>>at ../../../openmpi-dev-124-g91e9686/opal/libltdl/ltdl.c:1982
>> #10 0x7ecaabd4 in ri_destructor (obj=0x1001893a0)
>>at 
>> ../../../../openmpi-dev-124-g91e9686/opal/mca/base/mca_base_component_repository.c:382
>> #11 0x7eca9504 in opal_obj_run_destructors (object=0x1001893a0)
>>at ../../../../openmpi-dev-124-g91e9686/opal/class/opal_object.h:446
>> #12 0x7ecaa474 in mca_base_component_repository_release (
>>component=0x7b1236f0 )
>>at 
>> ../../../../openmpi-dev-124-g91e9686/opal/mca/base/mca_base_component_repository.c:240
>> #13 0x7ecac774 in mca_base_component_unload (
>>component=0x7b1236f0 , output_id=-1)
>>at 
>> ../../../../openmpi-dev-124-g91e9686/opal/mca/base/mca_base

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-26 Thread Gilles Gouaillardet
No :-(
I need to do some extra work to stop declaring orte_process_name_t and 
ompi_process_name_t variables.
#249 will make things much easier.
One option is to use opal_process_name_t everywhere, or to typedef the orte and ompi 
types to the opal one.
Another option (lightweight but error prone imho) is to change only the variable 
declarations.
Any thoughts ?

Ralph Castain  wrote:
>Will PR#249 solve it? If so, we should just go with it as I suspect that is 
>the long-term solution.
>
>> On Oct 26, 2014, at 4:25 PM, Gilles Gouaillardet 
>>  wrote:
>> 
>> It looks like we faced a similar issue :
>> opal_process_name_t is 64 bits aligned wheteas orte_process_name_t is 32 
>> bits aligned. If you run an alignment sensitive cpu such as sparc and you 
>> are not lucky (so to speak) you can run into this issue.
>> i will make a patch for this shortly
>> 
>> Ralph Castain  wrote:
>>> Afraid this must be something about the Sparc - just ran on a Solaris 11 
>>> x86 box and everything works fine.
>>> 
>>> 
>>>> On Oct 26, 2014, at 8:22 AM, Siegmar Gross 
>>>>  wrote:
>>>> 
>>>> Hi Gilles,
>>>> 
>>>> I wanted to explore which function is called, when I call MPI_Init
>>>> in a C program, because this function should be called from a Java
>>>> program as well. Unfortunately C programs break with a Bus Error
>>>> once more for openmpi-dev-124-g91e9686 on Solaris. I assume that's
>>>> the reason why I get no useful backtrace for my Java program.
>>>> 
>>>> tyr small_prog 117 mpicc -o init_finalize init_finalize.c
>>>> tyr small_prog 118 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>>>> ...
>>>> (gdb) run -np 1 init_finalize
>>>> Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 1 
>>>> init_finalize
>>>> [Thread debugging using libthread_db enabled]
>>>> [New Thread 1 (LWP 1)]
>>>> [New LWP2]
>>>> [tyr:19240] *** Process received signal ***
>>>> [tyr:19240] Signal: Bus Error (10)
>>>> [tyr:19240] Signal code: Invalid address alignment (1)
>>>> [tyr:19240] Failing at address: 7bd1c10c
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x2c
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:0xdcc04
>>>> /lib/sparcv9/libc.so.1:0xd8b98
>>>> /lib/sparcv9/libc.so.1:0xcc70c
>>>> /lib/sparcv9/libc.so.1:0xcc918
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_proc_set_name+0x1c
>>>>  [ Signal 10 (BUS)]
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/openmpi/mca_pmix_native.so:0x103e8
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/openmpi/mca_ess_pmi.so:0x33dc
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-rte.so.0.0.0:orte_init+0x67c
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi.so.0.0.0:ompi_mpi_init+0x374
>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libmpi.so.0.0.0:PMPI_Init+0x2a8
>>>> /home/fd1026/work/skripte/master/parallel/prog/mpi/small_prog/init_finalize:main+0x20
>>>> /home/fd1026/work/skripte/master/parallel/prog/mpi/small_prog/init_finalize:_start+0x7c
>>>> [tyr:19240] *** End of error message ***
>>>> --
>>>> mpiexec noticed that process rank 0 with PID 0 on node tyr exited on 
>>>> signal 10 (Bus Error).
>>>> --
>>>> [LWP2 exited]
>>>> [New Thread 2]
>>>> [Switching to Thread 1 (LWP 1)]
>>>> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to 
>>>> satisfy query
>>>> (gdb) bt
>>>> #0  0x7f6173d0 in rtld_db_dlactivity () from 
>>>> /usr/lib/sparcv9/ld.so.1
>>>> #1  0x7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
>>>> #2  0x7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
>>>> #3  0x7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
>>>> #4  0x7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
>>>> #5  0x7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
>>>> #6  0x7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
>>>> #7  0x7f61db0c in dlclose () from 

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Ralph,

this is also a solution.
the pro is it seems more lightweight than PR #249
the two cons i can see are :
- opal_process_name_t alignment goes from 64 to 32 bits
- some functions (opal_hash_table_*) take a uint64_t as argument so we
still need to use memcpy in order to
  * guarantee 64 bits alignment on some archs (such as sparc)
  * avoid ugly cast such as uint64_t id = *(uint64_t *)&process_name;
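for illustration, a minimal sketch of the memcpy trick (the struct below is just a
stand-in for opal_process_name_t, and the helper name is made up):

#include <stdint.h>
#include <string.h>

/* stand-in for opal_process_name_t: two 32-bit fields, hence only 32-bit aligned */
typedef struct { uint32_t jobid; uint32_t vpid; } name_t;

/* build the uint64_t key expected by opal_hash_table_*:
 * memcpy is safe whatever the alignment of 'name' is,
 * whereas the cast in the comment below may SIGBUS on sparc */
uint64_t name_to_key(const name_t *name)
{
    uint64_t key;
    /* the "ugly cast" to avoid: key = *(const uint64_t *)name; */
    memcpy(&key, name, sizeof(key));
    return key;
}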

as far as i am concerned, i am fine with your proposed suggestion to
dump opal_identifier_t.

about the patch, did you mean you have something ready i can apply to my
PR ?
or do you expect me to do the changes (i am ok to do it if needed)

Cheers,

Gilles

On 2014/10/27 11:04, Ralph Castain wrote:
> Just took a glance thru 249 and have a few suggestions on it - will pass them 
> along tomorrow. I think the right solution is to (a) dump opal_identifier_t 
> in favor of using opal_process_name_t everywhere in the opal layer, (b) 
> typedef orte_process_name_t to opal_process_name_t, and (c) leave 
> ompi_process_name_t as typedef’d to the RTE component in the MPI layer. This 
> lets other RTEs decide for themselves how they want to handle it.
>
> If you add changes to your branch, I can pass you a patch with my suggested 
> alterations.
>
>> On Oct 26, 2014, at 5:55 PM, Gilles Gouaillardet 
>>  wrote:
>>
>> No :-(
>> I need some extra work to stop declaring orte_process_name_t and 
>> ompi_process_name_t variables.
>> #249 will make things much easier.
>> One option is to use opal_process_name_t everywhere or typedef orte and ompi 
>> types to the opal one.
>> An other (lightweight but error prone imho) is to change variable 
>> declaration only.
>> Any thought ?
>>
>> Ralph Castain  wrote:
>>> Will PR#249 solve it? If so, we should just go with it as I suspect that is 
>>> the long-term solution.
>>>
>>>> On Oct 26, 2014, at 4:25 PM, Gilles Gouaillardet 
>>>>  wrote:
>>>>
>>>> It looks like we faced a similar issue :
>>>> opal_process_name_t is 64 bits aligned wheteas orte_process_name_t is 32 
>>>> bits aligned. If you run an alignment sensitive cpu such as sparc and you 
>>>> are not lucky (so to speak) you can run into this issue.
>>>> i will make a patch for this shortly
>>>>
>>>> Ralph Castain  wrote:
>>>>> Afraid this must be something about the Sparc - just ran on a Solaris 11 
>>>>> x86 box and everything works fine.
>>>>>
>>>>>
>>>>>> On Oct 26, 2014, at 8:22 AM, Siegmar Gross 
>>>>>>  wrote:
>>>>>>
>>>>>> Hi Gilles,
>>>>>>
>>>>>> I wanted to explore which function is called, when I call MPI_Init
>>>>>> in a C program, because this function should be called from a Java
>>>>>> program as well. Unfortunately C programs break with a Bus Error
>>>>>> once more for openmpi-dev-124-g91e9686 on Solaris. I assume that's
>>>>>> the reason why I get no useful backtrace for my Java program.
>>>>>>
>>>>>> tyr small_prog 117 mpicc -o init_finalize init_finalize.c
>>>>>> tyr small_prog 118 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>>>>>> ...
>>>>>> (gdb) run -np 1 init_finalize
>>>>>> Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 1 
>>>>>> init_finalize
>>>>>> [Thread debugging using libthread_db enabled]
>>>>>> [New Thread 1 (LWP 1)]
>>>>>> [New LWP2]
>>>>>> [tyr:19240] *** Process received signal ***
>>>>>> [tyr:19240] Signal: Bus Error (10)
>>>>>> [tyr:19240] Signal code: Invalid address alignment (1)
>>>>>> [tyr:19240] Failing at address: 7bd1c10c
>>>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x2c
>>>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:0xdcc04
>>>>>> /lib/sparcv9/libc.so.1:0xd8b98
>>>>>> /lib/sparcv9/libc.so.1:0xcc70c
>>>>>> /lib/sparcv9/libc.so.1:0xcc918
>>>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_proc_set_name+0x1c
>>>>>>  [ Signal 10 (BUS)]
>>>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/openmpi/mca_pmix_native.so:0x103e8
>>>>>> /export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Kawashima-san,

thanks a lot for the detailed explanation.
FWIW, i was previously testing on Solaris 11, which behaves like Linux :
printf("%s", NULL) outputs '(null)'
vs a SIGSEGV on Solaris 10
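the difference is easy to see with a two line program (passing NULL to %s is
undefined behaviour, so both outcomes are "legal"):

#include <stdio.h>

int main(void)
{
    const char *s = NULL;
    /* glibc and Solaris 11 print "(null)", Solaris 10 dereferences NULL and crashes */
    printf("%s\n", s);
    return 0;
}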

i committed a16c1e44189366fbc8e967769e050f517a40f3f8 in order to fix this
issue
(i moved the call to mca_base_var_register *after* MPI_Init)

regarding the BUS error reported by Siegmar, i also committed
62bde1fcb554079143030bb305512c236672386f
in order to fix it (this is based on code review only, i have no sparc64
hardware to verify that the fix is enough)

Siegmar, --enable-heterogeneous is known to be broken on the trunk, and
there are discussions on how to fix it.
in the mean time, you can either apply the attached minimal
heterogeneous.diff patch or avoid the --enable-heterogeneous option
/* the attached patch "fixes" --enable-heterogeneous on homogeneous
clusters *only* */

about attaching a process with gdb, i usually run
gdb none <pid>
on Linux and everything is fine
on Solaris, i had to do
gdb /usr/bin/java <pid>
in order to get the symbols loaded by gdb
and then
thread 11
f 3
set _dbg=0
/* but this is likely environment specific */
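for context, what "set _dbg=0" releases is the debugger hook in
ompi/mpi/java/c/mpi_MPI.c; the wait loop itself is quoted in the gdb session of
a later message in this digest, the declaration below is only a guess:

#include <poll.h>

static volatile int _dbg = 1;   /* guessed declaration */

void wait_for_debugger(void)
{
    /* mpi_MPI.c:128 -- spin until a debugger attaches and clears _dbg */
    while (_dbg) poll(NULL, 0, 1);
}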

Cheers,

Gilles

On 2014/10/27 10:58, Ralph Castain wrote:
> Oh yeah - that would indeed be very bad :-(
>
>
>> On Oct 26, 2014, at 6:06 PM, Kawashima, Takahiro 
>>  wrote:
>>
>> Siegmar, Oscar,
>>
>> I suspect that the problem is calling mca_base_var_register
>> without initializing OPAL in JNI_OnLoad.
>>
>> ompi/mpi/java/c/mpi_MPI.c:
>> 
>> jint JNI_OnLoad(JavaVM *vm, void *reserved)
>> {
>>libmpi = dlopen("libmpi." OPAL_DYN_LIB_SUFFIX, RTLD_NOW | RTLD_GLOBAL);
>>
>>if(libmpi == NULL)
>>{
>>fprintf(stderr, "Java bindings failed to load liboshmem.\n");
>>exit(1);
>>}
>>
>>mca_base_var_register("ompi", "mpi", "java", "eager",
>>  "Java buffers eager size",
>>  MCA_BASE_VAR_TYPE_INT, NULL, 0, 0,
>>  OPAL_INFO_LVL_5,
>>  MCA_BASE_VAR_SCOPE_READONLY,
>>  &ompi_mpi_java_eager);
>>
>>return JNI_VERSION_1_6;
>> }
>> 
>>
>> I suppose JNI_OnLoad is the first function in the libmpi_java.so
>> which is called by JVM. So OPAL is not initialized yet.
>> As shown in Siegmar's JRE log, SEGV occurred in asprintf called
>> by mca_base_var_cache_files.
>>
>> Siegmar's hs_err_pid13080.log:
>> 
>> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR), 
>> si_addr=0x
>>
>> Stack: [0x7b40,0x7b50],  sp=0x7b4fc730,  
>> free space=1009k
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
>> code)
>> C  [libc.so.1+0x3c7f0]  strlen+0x50
>> C  [libc.so.1+0xaf640]  vsnprintf+0x84
>> C  [libc.so.1+0xaadb4]  vasprintf+0x20
>> C  [libc.so.1+0xaaf04]  asprintf+0x28
>> C  [libopen-pal.so.0.0.0+0xaf3cc]  mca_base_var_cache_files+0x160
>> C  [libopen-pal.so.0.0.0+0xaed90]  mca_base_var_init+0x4e8
>> C  [libopen-pal.so.0.0.0+0xb260c]  register_variable+0x214
>> C  [libopen-pal.so.0.0.0+0xb36a0]  mca_base_var_register+0x104
>> C  [libmpi_java.so.0.0.0+0x221e8]  JNI_OnLoad+0x128
>> C  [libjava.so+0x10860]  
>> Java_java_lang_ClassLoader_00024NativeLibrary_load+0xb8
>> j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;Z)V+-665819
>> j  java.lang.ClassLoader$NativeLibrary.load(Ljava/lang/String;Z)V+0
>> j  java.lang.ClassLoader.loadLibrary0(Ljava/lang/Class;Ljava/io/File;)Z+328
>> j  
>> java.lang.ClassLoader.loadLibrary(Ljava/lang/Class;Ljava/lang/String;Z)V+290
>> j  java.lang.Runtime.loadLibrary0(Ljava/lang/Class;Ljava/lang/String;)V+54
>> j  java.lang.System.loadLibrary(Ljava/lang/String;)V+7
>> j  mpi.MPI.()V+28
>> 
>>
>> mca_base_var_cache_files passes opal_install_dirs.sysconfdir to
>> asprintf.
>>
>> opal/mca/base/mca_base_var.c:
>> 
>>asprintf(&mca_base_var_files, "%s"OPAL_PATH_SEP".openmpi" OPAL_PATH_SEP
>> "mca-params.conf%c%s" OPAL_PATH_SEP "openmpi-mca-params.conf",
>> home, OPAL_ENV_SEP, opal_install_dirs.sysconfdir);
>> --

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Gilles Gouaillardet
Hi,

i tested on a RedHat 6 like linux server and could not observe any
memory leak.

BTW, are you running 32 or 64 bits cygwin ? and what is your configure
command line ?

Thanks,

Gilles

On 2014/10/27 18:26, Marco Atzeri wrote:
> On 10/27/2014 8:30 AM, maxinator333 wrote:
>> Hello,
>>
>> I noticed this weird behavior, because after a certain time of more than
>> one minute the transfer rates of MPI_Send and MPI_Recv dropped by a
>> factor of 100+. By chance I saw, that my program did allocate more and
>> more memory. I have the following minimal working example:
>>
>> #include <mpi.h>
>> #include <cstdint>
>> #include <cstdlib>
>>
>> const uint32_t MSG_LENGTH = 256;
>>
>> int main(int argc, char* argv[]) {
>>  MPI_Init(NULL, NULL);
>>  int rank;
>>  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>
>>  volatile char * msg  = (char*) malloc( sizeof(char) *
>> MSG_LENGTH );
>>
>>  for (uint64_t i = 0; i < 1e9; i++) {
>>  if ( rank == 1 ) {
>>  MPI_Recv( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
>>rank-1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>  MPI_Send( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
>>rank-1, 0, MPI_COMM_WORLD);
>>  } else if ( rank == 0 ) {
>>  MPI_Send( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
>>rank+1, 0, MPI_COMM_WORLD);
>>  MPI_Recv( const_cast<char*>(msg), MSG_LENGTH, MPI_CHAR,
>>rank+1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>  }
>>  MPI_Barrier( MPI_COMM_WORLD );
>>  for (uint32_t k = 0; k < MSG_LENGTH; k++)
>>  msg[k]++;
>>  }
>>
>>  MPI_Finalize();
>>  return 0;
>> }
>>
>>
>> I run this with mpirun -n 2 ./pingpong_memleak.exe
>>
>> The program does nothing more than send a message from rank 0 to rank 1,
>> then from rank 1 to rank 0 and so on in standard blocking mode, not even
>> asynchronous.
>>
>> Running the program will allocate roughly 30mb/s (Windows Task Manager)
>> until it stops at around 1.313.180kb. This is when the transfer rates
>> (not being measured in above snippet) drop significantly to maybe a
>> second per send instead of roughly 1µs.
>>
>> I use Cygwin with Windows 7 and 16Gb RAM. I haven't tested this minimal
>> working example on other setups.
>
> Can someone test on other platforms and confirm me that is a cygwin
> specific issue ?
>
> Regards
> Marco
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/10/25602.php



Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-27 Thread Gilles Gouaillardet
Thanks Marco,

I could reproduce the issue even with one node sending/receiving to itself.

I will investigate this tomorrow

Cheers,

Gilles

Marco Atzeri  wrote:
>
>
>On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
>>
>> i tested on a RedHat 6 like linux server and could not observe any
>> memory leak.
>>
>> BTW, are you running 32 or 64 bits cygwin ? and what is your configure
>> command line ?
>>
>> Thanks,
>>
>> Gilles
>>
>
>the problem is present in both versions.
>
>cygwin 1.8.3-1 packages  are built with configure:
>
>  --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin --sbindir=/usr/sbin 
>--libexecdir=/usr/libexec --datadir=/usr/share --localstatedir=/var 
>--sysconfdir=/etc --libdir=/usr/lib --datarootdir=/usr/share 
>--docdir=/usr/share/doc/openmpi --htmldir=/usr/share/doc/openmpi/html -C 
>LDFLAGS=-Wl,--export-all-symbols --disable-mca-dso --disable-sysv-shmem 
>--enable-cxx-exceptions --with-threads=posix --without-cs-fs 
>--with-mpi-param_check=always --enable-contrib-no-build=vt,libompitrace 
>--enable-mca-no-build=paffinity,installdirs-windows,timer-windows,shmem-sysv
>
>Regards
>Marco
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/10/25604.php


Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Gilles Gouaillardet
Michael,

Could you please run
mpirun -np 1 df -h
mpirun -np 1 df -hi
on both compute and login nodes

Thanks

Gilles

michael.rach...@dlr.de wrote:
>Dear developers of OPENMPI,
>
>We have now installed and tested the bugfixed OPENMPI Nightly Tarball  of 
>2014-10-24  (openmpi-dev-176-g9334abc.tar.gz) on Cluster5 .
>As before (with OPENMPI-1.8.3 release version) the small Ftn-testprogram runs 
>correctly on the login-node.
>As before the program aborts on the compute node, but now with a different 
>error message: 
>
>The following message appears when launching the program with 2 processes: 
>mpiexec -np 2 -bind-to core -tag-output ./a.out
>
>[1,0]: on nodemaster: iwin= 685 :
>[1,0]:  total storage [MByte] alloc. in shared windows so far:   
>137.
>[ [1,0]: === allocation of shared window no. iwin= 686
>[1,0]:  starting now with idim_1=   5
>-
>It appears as if there is not enough space for 
>/tmp/openmpi-sessions-rachner@r5i5n13_0/48127/1/shared_window_688.r5i5n13 (the 
>shared-memory backing
>file). It is likely that your MPI job will now either abort or experience
>performance degradation.
>
>  Local host:  r5i5n13
>  Space Requested: 204256 B
>  Space Available: 208896 B
>--
>[r5i5n13:26917] *** An error occurred in MPI_Win_allocate_shared
>[r5i5n13:26917] *** reported by process [3154051073,140733193388032]
>[r5i5n13:26917] *** on communicator MPI_COMM_WORLD
>[r5i5n13:26917] *** MPI_ERR_INTERN: internal error
>[r5i5n13:26917] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
>now abort,
>[r5i5n13:26917] ***and potentially your MPI job)
>rachner@r5i5n13:~/dat>
>
>
>
>When I repeat the run using 24 processes (on same compute node) the same kind 
>of abort message occurs, but earlier:
>
>[1,0]: on nodemaster: iwin= 231 :
>[1,0]:  total storage [MByte] alloc. in shared windows so far:   
>46.2
> [1,0]: === allocation of shared window no. iwin= 232
>[1,0]:  starting now with idim_1=   5
>-
>It appears as if there is not enough space for 
>/tmp/openmpi-sessions-rachner@r5i5n13_0/48029/1/shared_window_234.r5i5n13 (the 
>shared-memory backing
>file). It is likely that your MPI job will now either abort or experience
>performance degradation.
>
>  Local host:  r5i5n13
>  Space Requested: 204784 B
>  Space Available: 131072 B
>--
>[r5i5n13:26947] *** An error occurred in MPI_Win_allocate_shared
>[r5i5n13:26947] *** reported by process [3147628545,140733193388032]
>[r5i5n13:26947] *** on communicator MPI_COMM_WORLD
>[r5i5n13:26947] *** MPI_ERR_INTERN: internal error
>[r5i5n13:26947] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
>now abort,
>[r5i5n13:26947] ***and potentially your MPI job)
>rachner@r5i5n13:~/dat>
>
>
>So the problem is not yet resolved.
>
>Greetings
> Michael Rachner
>
>
>
>
>
>
>-Ursprüngliche Nachricht-
>Von: Rachner, Michael 
>Gesendet: Montag, 27. Oktober 2014 11:49
>An: 'Open MPI Users'
>Betreff: AW: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared 
>memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code
>
>Dear Mr. Squyres.
>
>We will try to install your bug-fixed nigthly tarball of 2014-10-24 on 
>Cluster5 to see whether it works or not.
>The installation however will take some time. I get back to you, if I know 
>more.
>
>Let me add the information that on the Laki each nodes has 16 GB of shared 
>memory (there it worked), the login-node on Cluster 5 has 64 GB (there it 
>worked too), whereas the compute nodes on Cluster5 have 128 GB (there it did 
>not work).
>So possibly the bug might have something to do with the size of the physical 
>shared memory available on the node.
>
>Greetings
>Michael Rachner
>
>-Ursprüngliche Nachricht-
>Von: users [mailto:users-boun...@open-mpi.org] Im Auftrag von Jeff Squyres 
>(jsquyres)
>Gesendet: Freitag, 24. Oktober 2014 22:45
>An: Open MPI User's List
>Betreff: Re: [OMPI use

Re: [OMPI users] WG: Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-10-27 Thread Gilles Gouaillardet
Michael,

The available space must be greater than the requested size + 5%

From the logs, the error message makes sense to me : there is not enough space 
in /tmp
Since the compute nodes have a lot of memory, you might want to try using 
/dev/shm instead of /tmp for the backing files
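as a rough sketch of that kind of check (the +5% margin is the rule of thumb
above, not a constant taken from the source, and the helper name is made up):

#include <stddef.h>
#include <sys/statvfs.h>

/* does 'dir' have room for a backing file of 'size' bytes plus a ~5% margin ? */
int enough_room(const char *dir, size_t size)
{
    struct statvfs vfs;
    if (statvfs(dir, &vfs) != 0) {
        return 0;
    }
    unsigned long long avail =
        (unsigned long long) vfs.f_bavail * vfs.f_frsize;
    return avail > size + size / 20;
}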

Cheers,

Gilles

michael.rach...@dlr.de wrote:
>Dear developers of OPENMPI,
>
>We have now installed and tested the bugfixed OPENMPI Nightly Tarball  of 
>2014-10-24  (openmpi-dev-176-g9334abc.tar.gz) on Cluster5 .
>As before (with OPENMPI-1.8.3 release version) the small Ftn-testprogram runs 
>correctly on the login-node.
>As before the program aborts on the compute node, but now with a different 
>error message: 
>
>The following message appears when launching the program with 2 processes: 
>mpiexec -np 2 -bind-to core -tag-output ./a.out
>
>[1,0]: on nodemaster: iwin= 685 :
>[1,0]:  total storage [MByte] alloc. in shared windows so far:   
>137.
>[ [1,0]: === allocation of shared window no. iwin= 686
>[1,0]:  starting now with idim_1=   5
>-
>It appears as if there is not enough space for 
>/tmp/openmpi-sessions-rachner@r5i5n13_0/48127/1/shared_window_688.r5i5n13 (the 
>shared-memory backing
>file). It is likely that your MPI job will now either abort or experience
>performance degradation.
>
>  Local host:  r5i5n13
>  Space Requested: 204256 B
>  Space Available: 208896 B
>--
>[r5i5n13:26917] *** An error occurred in MPI_Win_allocate_shared
>[r5i5n13:26917] *** reported by process [3154051073,140733193388032]
>[r5i5n13:26917] *** on communicator MPI_COMM_WORLD
>[r5i5n13:26917] *** MPI_ERR_INTERN: internal error
>[r5i5n13:26917] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
>now abort,
>[r5i5n13:26917] ***and potentially your MPI job)
>rachner@r5i5n13:~/dat>
>
>
>
>When I repeat the run using 24 processes (on same compute node) the same kind 
>of abort message occurs, but earlier:
>
>[1,0]: on nodemaster: iwin= 231 :
>[1,0]:  total storage [MByte] alloc. in shared windows so far:   
>46.2
> [1,0]: === allocation of shared window no. iwin= 232
>[1,0]:  starting now with idim_1=   5
>-
>It appears as if there is not enough space for 
>/tmp/openmpi-sessions-rachner@r5i5n13_0/48029/1/shared_window_234.r5i5n13 (the 
>shared-memory backing
>file). It is likely that your MPI job will now either abort or experience
>performance degradation.
>
>  Local host:  r5i5n13
>  Space Requested: 204784 B
>  Space Available: 131072 B
>--
>[r5i5n13:26947] *** An error occurred in MPI_Win_allocate_shared
>[r5i5n13:26947] *** reported by process [3147628545,140733193388032]
>[r5i5n13:26947] *** on communicator MPI_COMM_WORLD
>[r5i5n13:26947] *** MPI_ERR_INTERN: internal error
>[r5i5n13:26947] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will 
>now abort,
>[r5i5n13:26947] ***and potentially your MPI job)
>rachner@r5i5n13:~/dat>
>
>
>So the problem is not yet resolved.
>
>Greetings
> Michael Rachner
>
>
>
>
>
>
>-Ursprüngliche Nachricht-
>Von: Rachner, Michael 
>Gesendet: Montag, 27. Oktober 2014 11:49
>An: 'Open MPI Users'
>Betreff: AW: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared 
>memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code
>
>Dear Mr. Squyres.
>
>We will try to install your bug-fixed nigthly tarball of 2014-10-24 on 
>Cluster5 to see whether it works or not.
>The installation however will take some time. I get back to you, if I know 
>more.
>
>Let me add the information that on the Laki each nodes has 16 GB of shared 
>memory (there it worked), the login-node on Cluster 5 has 64 GB (there it 
>worked too), whereas the compute nodes on Cluster5 have 128 GB (there it did 
>not work).
>So possibly the bug might have something to do with the size of the physical 
>shared memory available on the node.
>
>Greetings
>Michael Rachner
>
>-Ursprüngliche Nachricht-
>Von: users [mail

Re: [OMPI users] which info is needed for SIGSEGV in Java for openmpi-dev-124-g91e9686 on Solaris

2014-10-27 Thread Gilles Gouaillardet
Ralph,

On 2014/10/28 0:46, Ralph Castain wrote:
> Actually, I propose to also remove that issue. Simple enough to use a
> hash_table_32 to handle the jobids, and let that point to a
> hash_table_32 of vpids. Since we rarely have more than one jobid
> anyway, the memory overhead actually decreases with this model, and we
> get rid of that annoying need to memcpy everything. 
sounds good to me.
from an implementation/performance point of view, should we treat
the local jobid differently ?
(e.g. use a special variable for the hash_table_32 of the vpids of the
current jobid)
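to illustrate the idea with a toy example (the little map below is a made-up
stand-in for opal_hash_table_t, not the real API):

#include <stdint.h>
#include <stdlib.h>

/* toy 32-bit-keyed map (unsorted linked list), for illustration only */
typedef struct entry { uint32_t key; void *val; struct entry *next; } entry_t;
typedef struct { entry_t *head; } map32_t;

void *map32_get(const map32_t *m, uint32_t key)
{
    for (const entry_t *e = m->head; e != NULL; e = e->next) {
        if (e->key == key) return e->val;
    }
    return NULL;
}

int map32_put(map32_t *m, uint32_t key, void *val)
{
    entry_t *e = malloc(sizeof(*e));
    if (NULL == e) return -1;
    e->key = key; e->val = val; e->next = m->head; m->head = e;
    return 0;
}

/* outer map: jobid -> inner map of vpids (we rarely have more than one jobid),
 * so no 64-bit key and no memcpy are needed at all */
void *lookup_proc(const map32_t *jobids, uint32_t jobid, uint32_t vpid)
{
    map32_t *vpids = map32_get(jobids, jobid);
    return (NULL == vpids) ? NULL : map32_get(vpids, vpid);
}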
>> as far as i am concerned, i am fine with your proposed suggestion to
>> dump opal_identifier_t.
>>
>> about the patch, did you mean you have something ready i can apply to my
>> PR ?
>> or do you expect me to do the changes (i am ok to do it if needed)
> Why don’t I grab your branch, create a separate repo based on it (just to 
> keep things clean), push it to my area and give you write access? We can then 
> collaborate on the changes and create a PR from there. This way, you don’t 
> need to give me write access to your entire repo.
>
> Make sense?
ok to work on an other "somehow shared" repo for that issue.
i am not convinced you should grab my branch since all the changes i
made will no longer be valid.
anyway, feel free to fork a repo from my branch or the master and i will
work from here.

Cheers,

Gilles



Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-28 Thread Gilles Gouaillardet
Marco,

here is attached a patch that fixes the issue
/* i could not yet find why this does not occur on Linux ... */

could you please give it a try ?

Cheers,

Gilles

On 2014/10/27 18:45, Marco Atzeri wrote:
>
>
> On 10/27/2014 10:30 AM, Gilles Gouaillardet wrote:
>> Hi,
>>
>> i tested on a RedHat 6 like linux server and could not observe any
>> memory leak.
>>
>> BTW, are you running 32 or 64 bits cygwin ? and what is your configure
>> command line ?
>>
>> Thanks,
>>
>> Gilles
>>
>
> the problem is present in both versions.
>
> cygwin 1.8.3-1 packages  are built with configure:
>
>  --prefix=/usr --exec-prefix=/usr --bindir=/usr/bin
> --sbindir=/usr/sbin --libexecdir=/usr/libexec --datadir=/usr/share
> --localstatedir=/var --sysconfdir=/etc --libdir=/usr/lib
> --datarootdir=/usr/share --docdir=/usr/share/doc/openmpi
> --htmldir=/usr/share/doc/openmpi/html -C
> LDFLAGS=-Wl,--export-all-symbols --disable-mca-dso
> --disable-sysv-shmem --enable-cxx-exceptions --with-threads=posix
> --without-cs-fs --with-mpi-param_check=always
> --enable-contrib-no-build=vt,libompitrace
> --enable-mca-no-build=paffinity,installdirs-windows,timer-windows,shmem-sysv
>
> Regards
> Marco
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/10/25604.php

diff --git a/ompi/mca/pml/ob1/pml_ob1_recvreq.c b/ompi/mca/pml/ob1/pml_ob1_recvreq.c
index 7c8853f..c4a 100644
--- a/ompi/mca/pml/ob1/pml_ob1_recvreq.c
+++ b/ompi/mca/pml/ob1/pml_ob1_recvreq.c
@@ -16,6 +16,8 @@
  * Copyright (c) 2011-2012 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2012  FUJITSU LIMITED.  All rights reserved.
+ * Copyright (c) 2014  Research Organization for Information Science
+ * and Technology (RIST). All rights reserved.
  * $COPYRIGHT$
  * 
  * Additional copyrights may follow
@@ -152,11 +154,16 @@ static void mca_pml_ob1_recv_request_construct(mca_pml_ob1_recv_request_t* request)
 OBJ_CONSTRUCT(&request->lock, opal_mutex_t);
 }

+static void mca_pml_ob1_recv_request_destruct(mca_pml_ob1_recv_request_t* request)
+{
+OBJ_DESTRUCT(&request->lock);
+}
+
 OBJ_CLASS_INSTANCE(
 mca_pml_ob1_recv_request_t,
 mca_pml_base_recv_request_t,
 mca_pml_ob1_recv_request_construct,
-NULL);
+mca_pml_ob1_recv_request_destruct);


 /*


Re: [OMPI users] SIGBUS in openmpi-dev-178-ga16c1e4 on Solaris 10 Sparc

2014-10-28 Thread Gilles Gouaillardet
Hi Siegmar,

From the jvm logs, there is an alignment error in native_get_attr but i could 
not find it by reading the source code.

Could you please do
ulimit -c unlimited
mpiexec ...
and then
gdb /bin/java core
And run bt on all threads until you get a line number in native_get_attr

Thanks

Gilles

Siegmar Gross  wrote:
>Hi,
>
>today I installed openmpi-dev-178-ga16c1e4 on Solaris 10 Sparc
>with gcc-4.9.1 and Java 8. Now a very simple Java program works
>as expected, but other Java programs still break. I removed the
>warnings about "shmem.jar" and used the following configure
>command.
>
>tyr openmpi-dev-178-ga16c1e4-SunOS.sparc.64_gcc 406 head config.log \
>  | grep openmpi
>$ ../openmpi-dev-178-ga16c1e4/configure
>  --prefix=/usr/local/openmpi-1.9.0_64_gcc
>  --libdir=/usr/local/openmpi-1.9.0_64_gcc/lib64
>  --with-jdk-bindir=/usr/local/jdk1.8.0/bin
>  --with-jdk-headers=/usr/local/jdk1.8.0/include
>  JAVA_HOME=/usr/local/jdk1.8.0
>  LDFLAGS=-m64 CC=gcc CXX=g++ FC=gfortran CFLAGS=-m64 -D_REENTRANT
>  CXXFLAGS=-m64 FCFLAGS=-m64 CPP=cpp CXXCPP=cpp
>  CPPFLAGS= -D_REENTRANT CXXCPPFLAGS=
>  --enable-mpi-cxx --enable-cxx-exceptions --enable-mpi-java
>  --enable-mpi-thread-multiple --with-threads=posix
>  --with-hwloc=internal
>  --without-verbs --with-wrapper-cflags=-std=c11 -m64
>  --with-wrapper-cxxflags=-m64 --enable-debug
>
>
>tyr java 290 ompi_info | grep -e "Open MPI repo revision:" -e "C compiler 
>version:"
>  Open MPI repo revision: dev-178-ga16c1e4
>  C compiler version: 4.9.1
>
>
>
>> > regarding the BUS error reported by Siegmar, i also commited
>> > 62bde1fcb554079143030bb305512c236672386f
>> > in order to fix it (this is based on code review only, i have no sparc64
>> > hardware to test it is enough)
>> 
>> I'll test it, when a new nightly snapshot is available for the trunk.
>
>
>tyr java 291 mpijavac InitFinalizeMain.java 
>tyr java 292 mpiexec -np 1 java InitFinalizeMain
>Hello!
>
>tyr java 293 mpijavac BcastIntMain.java 
>tyr java 294 mpiexec -np 2 java BcastIntMain
>#
># A fatal error has been detected by the Java Runtime Environment:
>#
>#  SIGBUS (0xa) at pc=0xfffee3210bfc, pid=24792, tid=2
>...
>
>
>
>tyr java 296 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>...
>(gdb) run -np 2 java BcastIntMain
>Starting program: /usr/local/openmpi-1.9.0_64_gcc/bin/mpiexec -np 2 java 
>BcastIntMain
>[Thread debugging using libthread_db enabled]
>[New Thread 1 (LWP 1)]
>[New LWP2]
>#
># A fatal error has been detected by the Java Runtime Environment:
>#
>#  SIGBUS (0xa) at pc=0xfffee3210bfc, pid=24814, tid=2
>#
># JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
># Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode 
>solaris-sparc compressed oops)
># Problematic frame:
># C  [mca_pmix_native.so+0x10bfc]  native_get_attr+0x3000
>#
># Failed to write core dump. Core dumps have been disabled. To enable core 
>dumping, try "ulimit -c unlimited" before starting Java again
>#
># An error report file with more information is saved as:
># /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid24814.log
>#
># A fatal error has been detected by the Java Runtime Environment:
>#
>#  SIGBUS (0xa) at pc=0xfffee3210bfc, pid=24812, tid=2
>#
># JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
># Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode 
>solaris-sparc compressed oops)
># Problematic frame:
># C  [mca_pmix_native.so+0x10bfc]  native_get_attr+0x3000
>#
># Failed to write core dump. Core dumps have been disabled. To enable core 
>dumping, try "ulimit -c unlimited" before starting Java again
>#
># An error report file with more information is saved as:
># /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid24812.log
>#
># If you would like to submit a bug report, please visit:
>#   http://bugreport.sun.com/bugreport/crash.jsp
># The crash happened outside the Java Virtual Machine in native code.
># See problematic frame for where to report the bug.
>#
>[tyr:24814] *** Process received signal ***
>[tyr:24814] Signal: Abort (6)
>[tyr:24814] Signal code:  (-1)
>/export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x2c
>/export2/prog/SunOS_sparc/openmpi-1.9.0_64_gcc/lib64/libopen-pal.so.0.0.0:0xdc2d4
>/lib/sparcv9/libc.so.1:0xd8b98
>/lib/sparcv9/libc.so.1:0xcc70c
>/lib/sparcv9/libc.so.1:0xcc918
>/lib/sparcv9/libc.so.1:0xdd2d0 [ Signal 6 (ABRT)]
>/lib/sparcv9/libc.so.1:_thr_sigsetmask+0x1c4
>/lib/sparcv9/libc.so.1:sigprocmas

Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-28 Thread Gilles Gouaillardet
Thanks Marco,

pthread_mutex_init calls calloc under cygwin but does not allocate memory under 
linux, so not invoking pthread_mutex_destroy causes a memory leak only under 
cygwin.
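the pattern can be seen with a standalone program (this is not OMPI code, just
init without destroy in a loop; on cygwin the memory use keeps growing, on
glibc it stays flat):

#include <pthread.h>

int main(void)
{
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_t lock;
        pthread_mutex_init(&lock, NULL);
        /* missing pthread_mutex_destroy(&lock); -- cf. the ob1 patch earlier in
         * this digest: on cygwin the state calloc'ed by init is never freed */
    }
    return 0;
}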

Gilles

Marco Atzeri  wrote:
>On 10/28/2014 12:04 PM, Gilles Gouaillardet wrote:
>> Marco,
>>
>> here is attached a patch that fixes the issue
>> /* i could not find yet why this does not occurs on Linux ... */
>>
>> could you please give it a try ?
>>
>> Cheers,
>>
>> Gilles
>>
>
>It solves the issue on 64 bit.
>I see no growing memory usage anymore
>
>I will build 32 bit and then upload both as 1.8.3-2
>
>Thanks
>Marco
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/10/25630.php


Re: [OMPI users] Possible Memory Leak in simple PingPong-Routine with OpenMPI 1.8.3?

2014-10-28 Thread Gilles Gouaillardet
Yep, will do today

Ralph Castain  wrote:
>Gilles: will you be committing this to trunk and PR to 1.8?
>
>
>> On Oct 28, 2014, at 11:05 AM, Marco Atzeri  wrote:
>> 
>> On 10/28/2014 4:41 PM, Gilles Gouaillardet wrote:
>>> Thanks Marco,
>>> 
>>> pthread_mutex_init calls calloc under cygwin but does not allocate memory 
>>> under linux, so not invoking pthread_mutex_destroy causes a memory leak 
>>> only under cygwin.
>>> 
>>> Gilles
>> 
>> thanks for the work .
>> 
>> uploading 1.8.3-2 on www.cygwin.com
>> 
>> Regards
>> Marco
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/10/25634.php
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/10/25636.php


Re: [OMPI users] SIGBUS in openmpi-dev-178-ga16c1e4 on Solaris 10 Sparc

2014-10-29 Thread Gilles Gouaillardet
Hi Siegmar,

thanks for the detailed report.

i think i found the alignment issue and fixed it (commit
8c556bbc66c06fb19c6e46c67624bac1d6719b12)

here is attached the patch that fixes the issue.

Cheers,

Gilles

On 2014/10/29 5:24, Siegmar Gross wrote:
> Hi Gilles,
>  
>> From the jvm logs, there is an alignment error in native_get_attr
>> but i could not find it by reading the source code.
>>
>> Could you please do
>> ulimit -c unlimited
>> mpiexec ...
>> and then
>> gdb /bin/java core
>> And run bt on all threads until you get a line number in native_get_attr
> I found pmix_native.c:1131 in native_get_attr, attached gdb to the
> Java process and set a breakpoint to this line. From there I single
> stepped until I got SIGSEGV, so that you can see what happened.
>
>
> (gdb) b pmix_native.c:1131
> No source file named pmix_native.c.
> Make breakpoint pending on future shared library load? (y or [n]) y
>
> Breakpoint 1 (pmix_native.c:1131) pending.
> (gdb) thread 14
> [Switching to thread 14 (Thread 2 (LWP 2))]
> #0  0x7eadc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
> (gdb) f 3
> #3  0xfffee5122230 in JNI_OnLoad (vm=0x7e57e9d8 , 
> reserved=0x0)
> at ../../../../../openmpi-dev-178-ga16c1e4/ompi/mpi/java/c/mpi_MPI.c:128
> 128 while (_dbg) poll(NULL, 0, 1);
> (gdb) set _dbg=0
> (gdb) c
> Continuing.
> [New LWP13]
>
> Breakpoint 1, native_get_attr (attr=0xfffee2e05db0 "pmix.jobid", 
> kv=0x7b4ff028)
> at 
> ../../../../../openmpi-dev-178-ga16c1e4/opal/mca/pmix/native/pmix_native.c:1131
> 1131OPAL_OUTPUT_VERBOSE((1, 
> opal_pmix_base_framework.framework_output,
> (gdb) s
> opal_proc_local_get () at 
> ../../../openmpi-dev-178-ga16c1e4/opal/util/proc.c:80
> 80  return opal_proc_my_name;
> (gdb) 
> 81  }
> (gdb) 
> _process_name_print_for_opal (procname=14259803799433510912)
> at ../../openmpi-dev-178-ga16c1e4/orte/runtime/orte_init.c:64
> 64  orte_process_name_t* rte_name = (orte_process_name_t*)&procname;
> (gdb) 
> 65  return ORTE_NAME_PRINT(rte_name);
> (gdb) 
> orte_util_print_name_args (name=0x7b4feb90)
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:122
> 122 if (NULL == name) {
> (gdb) 
> 142 job = orte_util_print_jobids(name->jobid);
> (gdb) 
> orte_util_print_jobids (job=3320119297)
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:170
> 170 ptr = get_print_name_buffer();
> (gdb) 
> get_print_name_buffer ()
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:92
> 92  if (!fns_init) {
> (gdb) 
> 101 ret = opal_tsd_getspecific(print_args_tsd_key, (void**)&ptr);
> (gdb) 
> opal_tsd_getspecific (key=4, valuep=0x7b4fe8a0)
> at ../../openmpi-dev-178-ga16c1e4/opal/threads/tsd.h:163
> 163 *valuep = pthread_getspecific(key);
> (gdb) 
> 164 return OPAL_SUCCESS;
> (gdb) 
> 165 }
> (gdb) 
> get_print_name_buffer ()
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:102
> 102 if (OPAL_SUCCESS != ret) return NULL;
> (gdb) 
> 104 if (NULL == ptr) {
> (gdb) 
> 113 return (orte_print_args_buffers_t*) ptr;
> (gdb) 
> 114 }
> (gdb) 
> orte_util_print_jobids (job=3320119297)
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:172
> 172 if (NULL == ptr) {
> (gdb) 
> 178 if (ORTE_PRINT_NAME_ARG_NUM_BUFS == ptr->cntr) {
> (gdb) 
> 179 ptr->cntr = 0;
> (gdb) 
> 182 if (ORTE_JOBID_INVALID == job) {
> (gdb) 
> 184 } else if (ORTE_JOBID_WILDCARD == job) {
> (gdb) 
> 187 tmp1 = ORTE_JOB_FAMILY((unsigned long)job);
> (gdb) 
> 188 tmp2 = ORTE_LOCAL_JOBID((unsigned long)job);
> (gdb) 
> 189 snprintf(ptr->buffers[ptr->cntr++], 
> (gdb) 
> 193 return ptr->buffers[ptr->cntr-1];
> (gdb) 
> 194 }
> (gdb) 
> orte_util_print_name_args (name=0x7b4feb90)
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:143
> 143 vpid = orte_util_print_vpids(name->vpid);
> (gdb) 
> orte_util_print_vpids (vpid=0)
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:260
> 260 ptr = get_print_name_buffer();
> (gdb) 
> get_print_name_buffer ()
> at ../../openmpi-dev-178-ga16c1e4/orte/util/name_fns.c:92
> 92  if (!fns_init) {
> (gdb) 
> 101 ret = opal_tsd_getspecific(print_args_tsd_key, (void**)&ptr);
> (gdb) 
> opal_tsd_getspecific (key=4, valuep=0x7b4fe8b0)
> at ../../o

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-11-05 Thread Gilles Gouaillardet
Michael,

could you please share your test program so we can investigate it ?

Cheers,

Gilles

On 2014/10/31 18:53, michael.rach...@dlr.de wrote:
> Dear developers of OPENMPI,
>
> There remains a hanging observed in MPI_WIN_ALLOCATE_SHARED.
>
> But first: 
> Thank you for your advices to employ shmem_mmap_relocate_backing_file = 1
> It indeed turned out, that the bad (but silent) allocations  by 
> MPI_WIN_ALLOCATE_SHARED, which I observed in the past after ~140 MB of 
> allocated shared memory, 
> were indeed caused by  a too small available storage for the sharedmem 
> backing files. Applying the MCA parameter resolved the problem.
>
> Now the allocation of shared data windows by  MPI_WIN_ALLOCATE_SHARED in the 
> OPENMPI-1.8.3 release version works on both clusters!
> I tested it both with my small sharedmem-Ftn-testprogram  as well as with our 
> Ftn-CFD-code.
> It worked  even when allocating 1000 shared data windows containing a total 
> of 40 GB.  Very well.
>
> But now I come to the problem remaining:
> According to the attached email of Jeff (see below) of 2014-10-24, 
> we have alternatively installed and tested the bugfixed OPENMPI Nightly 
> Tarball  of 2014-10-24  (openmpi-dev-176-g9334abc.tar.gz) on Cluster5 .
> That version worked well, when our CFD-code was running on only 1 node.
> But I observe now, that when running the CFD-code on 2 node with  2 processes 
> per node,
> after having allocated a total of 200 MB of data in 20 shared windows, the 
> allocation of the 21-th window fails, 
> because all 4 processes enter MPI_WIN_ALLOCATE_SHARED but never leave it. The 
> code hangs in that routine, without any message.
>
> In contrast, that bug does NOT occur with the  OPENMPI-1.8.3 release version  
>  with same program on same machine.
>
> That means for you:  
>In openmpi-dev-176-g9334abc.tar.gz   the new-introduced  bugfix concerning 
> the shared memory allocation may be not yet correctly coded ,
>or that version contains another new bug in sharedmemory allocation  
> compared to the working(!) 1.8.3-release version.
>
> Greetings to you all
>   Michael Rachner
> 
>
>
>
> -Ursprüngliche Nachricht-
> Von: users [mailto:users-boun...@open-mpi.org] Im Auftrag von Jeff Squyres 
> (jsquyres)
> Gesendet: Freitag, 24. Oktober 2014 22:45
> An: Open MPI User's List
> Betreff: Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared 
> memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code
>
> Nathan tells me that this may well be related to a fix that was literally 
> just pulled into the v1.8 branch today:
>
> https://github.com/open-mpi/ompi-release/pull/56
>
> Would you mind testing any nightly tarball after tonight?  (i.e., the v1.8 
> tarballs generated tonight will be the first ones to contain this fix)
>
> http://www.open-mpi.org/nightly/master/
>
>
>
> On Oct 24, 2014, at 11:46 AM,  
>  wrote:
>
>> Dear developers of OPENMPI,
>>  
>> I am running a small downsized Fortran-testprogram for shared memory 
>> allocation (using MPI_WIN_ALLOCATE_SHARED and  MPI_WIN_SHARED_QUERY) )
>> on only 1 node   of 2 different Linux-clusters with OPENMPI-1.8.3 and 
>> Intel-14.0.4 /Intel-13.0.1, respectively.
>>  
>> The program simply allocates a sequence of shared data windows, each 
>> consisting of 1 integer*4-array.
>> None of the windows is freed, so the amount of allocated data  in shared 
>> windows raises during the course of the execution.
>>  
>> That worked well on the 1st cluster (Laki, having 8 procs per node))  
>> when allocating even 1000 shared windows each having 5 integer*4 array 
>> elements, i.e. a total of  200 MBytes.
>> On the 2nd cluster (Cluster5, having 24 procs per node) it also worked on 
>> the login node, but it did NOT work on a compute node.
>> In that error case, there occurs something like an internal storage limit of 
>> ~ 140 MB for the total storage allocated in all shared windows.
>> When that limit is reached, all later shared memory allocations fail (but 
>> silently).
>> So the first attempt to use such a bad shared data window results in a bus 
>> error due to the bad storage address encountered.
>>  
>> That strange behavior could be observed in the small testprogram but also 
>> with my large Fortran CFD-code.
>> If the error occurs, then it occurs with both codes, and both at a storage 
>> limit of  ~140 MB.
>> I found that this storage limit depends only weakly on  the number of 
>> processes (for np=2,4,8,16,24  it is: 144.4 , 144.0, 141.0, 137.0, 
>> 132.2 MB)
>>  
>> 

Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code

2014-11-05 Thread Gilles Gouaillardet
Hi Michael,

bigger the program, bigger the fun ;-)

i will have a look at it.

Cheers,

Gilles

On 2014/11/05 19:08, michael.rach...@dlr.de wrote:
> Dear Gilles,
>
> My small downsized Ftn-testprogram for testing the shared memory  feature 
> (MPI_WIN_ALLOCATE_SHARED,  MPI_WIN_SHARED_QUERY, C_F_POINTER)
>  presumes for simplicity that all processes are running on the same node 
> (i.e. the communicator containing the procs on the same node  is just 
> MPI_COMM_WORLD).
> So the hanging of MPI_WIN_ALLOCATE_SHARED when running on 2 nodes could only 
> be observed with our large CFD-code. 
>
> Are OPENMPI-developers nevertheless interested in that testprogram?
>
> Greetings
> Michael
>
>
>
>
>
>
> -Ursprüngliche Nachricht-
> Von: users [mailto:users-boun...@open-mpi.org] Im Auftrag von Gilles 
> Gouaillardet
> Gesendet: Mittwoch, 5. November 2014 10:46
> An: Open MPI Users
> Betreff: Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in shared 
> memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code
>
> Michael,
>
> could you please share your test program so we can investigate it ?
>
> Cheers,
>
> Gilles
>
> On 2014/10/31 18:53, michael.rach...@dlr.de wrote:
>> Dear developers of OPENMPI,
>>
>> There remains a hanging observed in MPI_WIN_ALLOCATE_SHARED.
>>
>> But first: 
>> Thank you for your advices to employ shmem_mmap_relocate_backing_file = 1
>> It indeed turned out, that the bad (but silent) allocations  by 
>> MPI_WIN_ALLOCATE_SHARED, which I observed in the past after ~140 MB of 
>> allocated shared memory, were indeed caused by  a too small available 
>> storage for the sharedmem backing files. Applying the MCA parameter resolved 
>> the problem.
>>
>> Now the allocation of shared data windows by  MPI_WIN_ALLOCATE_SHARED in the 
>> OPENMPI-1.8.3 release version works on both clusters!
>> I tested it both with my small sharedmem-Ftn-testprogram  as well as with 
>> our Ftn-CFD-code.
>> It worked  even when allocating 1000 shared data windows containing a total 
>> of 40 GB.  Very well.
>>
>> But now I come to the problem remaining:
>> According to the attached email of Jeff (see below) of 2014-10-24, we 
>> have alternatively installed and tested the bugfixed OPENMPI Nightly Tarball 
>>  of 2014-10-24  (openmpi-dev-176-g9334abc.tar.gz) on Cluster5 .
>> That version worked well, when our CFD-code was running on only 1 node.
>> But I observe now, that when running the CFD-code on 2 node with  2 
>> processes per node, after having allocated a total of 200 MB of data 
>> in 20 shared windows, the allocation of the 21-th window fails, because all 
>> 4 processes enter MPI_WIN_ALLOCATE_SHARED but never leave it. The code hangs 
>> in that routine, without any message.
>>
>> In contrast, that bug does NOT occur with the  OPENMPI-1.8.3 release version 
>>   with same program on same machine.
>>
>> That means for you:  
>>In openmpi-dev-176-g9334abc.tar.gz   the new-introduced  bugfix 
>> concerning the shared memory allocation may be not yet correctly coded ,
>>or that version contains another new bug in sharedmemory allocation  
>> compared to the working(!) 1.8.3-release version.
>>
>> Greetings to you all
>>   Michael Rachner
>> 
>>
>>
>>
>> -Ursprüngliche Nachricht-
>> Von: users [mailto:users-boun...@open-mpi.org] Im Auftrag von Jeff 
>> Squyres (jsquyres)
>> Gesendet: Freitag, 24. Oktober 2014 22:45
>> An: Open MPI User's List
>> Betreff: Re: [OMPI users] Bug in OpenMPI-1.8.3: storage limition in 
>> shared memory allocation (MPI_WIN_ALLOCATE_SHARED) in Ftn-code
>>
>> Nathan tells me that this may well be related to a fix that was literally 
>> just pulled into the v1.8 branch today:
>>
>> https://github.com/open-mpi/ompi-release/pull/56
>>
>> Would you mind testing any nightly tarball after tonight?  (i.e., the 
>> v1.8 tarballs generated tonight will be the first ones to contain this 
>> fix)
>>
>> http://www.open-mpi.org/nightly/master/
>>
>>
>>
>> On Oct 24, 2014, at 11:46 AM,  
>>  wrote:
>>
>>> Dear developers of OPENMPI,
>>>  
>>> I am running a small downsized Fortran-testprogram for shared memory 
>>> allocation (using MPI_WIN_ALLOCATE_SHARED and  MPI_WIN_SHARED_QUERY) )
>>> on only 1 node   of 2 different Linux-clusters with OPENMPI-1.8.3 and 
>>> Intel-14.0.4 /Intel-13.0.1, respectively.
>>>  
>>> The program sim

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Gilles Gouaillardet
Michael,

the root cause is that openmpi was not compiled with the intel compilers but
with the gnu compilers.
fortran modules are not binary compatible, so openmpi and your
application must be compiled
with the same compiler.

Cheers,

Gilles

On 2014/11/05 18:25, michael.rach...@dlr.de wrote:
> Dear OPENMPI developers,
>
> In OPENMPI-1.8.3 the Ftn-bindings for  MPI_SIZEOF  are missing, when using 
> the mpi-module and when using mpif.h .
> (I have not controlled, whether they are present in the mpi_08 module.)
>
> I get this message from the linker (Intel-14.0.2):
>  /home/vat/src/KERNEL/mpi_ini.f90:534: undefined reference to 
> `mpi_sizeof0di4_'
>
> So can you add  the Ftn-bindings for MPI_SIZEOF?
>
> Once again I feel, that Fortran-bindings are unloved step-children for 
> C-programmers. 
>
> Greetings to you all
>  Michael Rachner
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25676.php



Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-05 Thread Gilles Gouaillardet
Michael,

Did you recompile with the gfortran compiler, or only relink ?
You need to recompile and relink.
Can you attach your program so i can have a look ?

You really need one mpi install per compiler, and more if compiler versions 
from the same vendor are not compatible.
environment modules are useful to make this easy for end users, and this is out of the 
scope of openmpi.

Cheers,

Gilles

michael.rach...@dlr.de wrote:
>Sorry, Gilles, you might be wrong:
>
>The error occurs also with gfortran-4.9.1, when running my small shared memory 
>testprogram:
>
>This is the answer of the linker with gfortran-4.9.1 :  
> sharedmemtest.f90:(.text+0x1145): undefined reference to `mpi_sizeof0di4_'
>
>and this is the answer with intel-14.0.4:
>sharedmemtest.f90:(.text+0x11c3): undefined reference to `mpi_sizeof0di4_'
>
>
>If openmpi  actually provides a module file   mpi.mod,  that was  precompiled 
>by openmpi for a certain Fortran compiler,
>then the whole installation of openmpi on a User machine from the 
>openmpi-sourcecode for a User chosen Ftn-compiler would be a farce.
>The module file  mpi.mod  must be either generated during the installation 
>process of openmpi on the User-machine for the User chosen Ftn-compiler,
>or alternatively Openmpi must provide the module not by a  mpi.mod file,  but 
>a mpi.f90 file.  MS-MPI does it that way.
>In my opinion, providing a  mpi.f90  file is indeed  better than providing an  
>mpi.mod file, because the User can look inside the module
>and can directly see, if something is missing or possibly wrongly coded. 
>
>Greetings 
>  Michael Rachner
>
>
>-Ursprüngliche Nachricht-
>Von: users [mailto:users-boun...@open-mpi.org] Im Auftrag von Gilles 
>Gouaillardet
>Gesendet: Mittwoch, 5. November 2014 11:33
>An: Open MPI Users
>Betreff: Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for 
>MPI_SIZEOF
>
>Michael,
>
>the root cause is openmpi was not compiled with the intel compilers but the 
>gnu compiler.
>fortran modules are not binary compatible so openmpi and your application must 
>be compiled with the same compiler.
>
>Cheers,
>
>Gilles
>
>On 2014/11/05 18:25, michael.rach...@dlr.de wrote:
>> Dear OPENMPI developers,
>>
>> In OPENMPI-1.8.3 the Ftn-bindings for  MPI_SIZEOF  are missing, when using 
>> the mpi-module and when using mpif.h .
>> (I have not controlled, whether they are present in the mpi_08 
>> module.)
>>
>> I get this message from the linker (Intel-14.0.2):
>>  /home/vat/src/KERNEL/mpi_ini.f90:534: undefined reference to 
>> `mpi_sizeof0di4_'
>>
>> So can you add  the Ftn-bindings for MPI_SIZEOF?
>>
>> Once again I feel, that Fortran-bindings are unloved step-children for 
>> C-programmers. 
>>
>> Greetings to you all
>>  Michael Rachner
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25676.php
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25682.php
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25683.php


Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Gilles Gouaillardet
Brock,

Is your post related to ib0/eoib0 being used at all, or being used with load 
balancing ?

let me clarify this :
--mca btl ^openib
disables the openib btl aka *native* infiniband.
This does not disable ib0 and eoib0 that are handled by the tcp btl.
As you already figured out, btl_tcp_if_include (or btl_tcp_if_exclude) can be 
used for that purpose.

Cheers,

Gilles




Ralph Castain  wrote:
>OMPI discovers all active interfaces and automatically considers them 
>available for its use unless instructed otherwise via the params. I’d have to 
>look at the TCP BTL code to see the loadbalancing algo - I thought we didn’t 
>have that “on” by default across BTLs, but I don’t know if the TCP one 
>automatically uses all available Ethernet interfaces by default. Sounds like 
>it must.
>
>
>> On Nov 7, 2014, at 11:53 AM, Brock Palen  wrote:
>> 
>> I was doing a test on our IB based cluster, where I was diabling IB
>> 
>> --mca btl ^openib --mca mtl ^mxm
>> 
>> I was sending very large messages >1GB  and I was surppised by the speed.
>> 
>> I noticed then that of all our ethernet interfaces
>> 
>> eth0  (1gig-e)
>> ib0  (ip over ib, for lustre configuration at vendor request)
>> eoib0  (ethernet over IB interface for IB -> Ethernet gateway for some 
>> extrnal storage support at >1Gig speed
>> 
>> I saw all three were getting traffic.
>> 
>> We use torque for our Resource Manager and use TM support, the hostnames 
>> given by torque match the eth0 interfaces.
>> 
>> How does OMPI figure out that it can also talk over the others?  How does it 
>> chose to load balance?
>> 
>> BTW that is fine, but we will use if_exclude on one of the IB ones as ib0 
>> and eoib0  are the same physical device and may screw with load balancing if 
>> anyone ver falls back to TCP.
>> 
>> Brock Palen
>> www.umich.edu/~brockp
>> CAEN Advanced Computing
>> XSEDE Campus Champion
>> bro...@umich.edu
>> (734)936-1985
>> 
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25709.php
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25710.php

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-07 Thread Gilles Gouaillardet
Ralph,

IIRC there is load balancing across all the btls, for example
between vader and scif.
So load balancing between ib0 and eoib0 is just a particular case that might 
not necessarily be handled by the tcp btl.

Cheers,

Gilles

Ralph Castain  wrote:
>OMPI discovers all active interfaces and automatically considers them 
>available for its use unless instructed otherwise via the params. I’d have to 
>look at the TCP BTL code to see the loadbalancing algo - I thought we didn’t 
>have that “on” by default across BTLs, but I don’t know if the TCP one 
>automatically uses all available Ethernet interfaces by default. Sounds like 
>it must.
>
>
>> On Nov 7, 2014, at 11:53 AM, Brock Palen  wrote:
>> 
>> I was doing a test on our IB based cluster, where I was diabling IB
>> 
>> --mca btl ^openib --mca mtl ^mxm
>> 
>> I was sending very large messages >1GB  and I was surppised by the speed.
>> 
>> I noticed then that of all our ethernet interfaces
>> 
>> eth0  (1gig-e)
>> ib0  (ip over ib, for lustre configuration at vendor request)
>> eoib0  (ethernet over IB interface for IB -> Ethernet gateway for some 
>> extrnal storage support at >1Gig speed
>> 
>> I saw all three were getting traffic.
>> 
>> We use torque for our Resource Manager and use TM support, the hostnames 
>> given by torque match the eth0 interfaces.
>> 
>> How does OMPI figure out that it can also talk over the others?  How does it 
>> chose to load balance?
>> 
>> BTW that is fine, but we will use if_exclude on one of the IB ones as ib0 
>> and eoib0  are the same physical device and may screw with load balancing if 
>> anyone ver falls back to TCP.
>> 
>> Brock Palen
>> www.umich.edu/~brockp
>> CAEN Advanced Computing
>> XSEDE Campus Champion
>> bro...@umich.edu
>> (734)936-1985
>> 
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25709.php
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25710.php

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-10 Thread Gilles Gouaillardet
Hi,

IIRC there were some bug fixes between 1.8.1 and 1.8.2 in order to really
use all the published interfaces.

by any chance, are you running a firewall on your head node ?
one possible explanation is the compute node tries to access the public
interface of the head node, and packets get dropped by the firewall.

if you are running a firewall, can you make a test without it ?
/* if you do need NAT, then just remove the DROP and REJECT rules */
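
a minimal sketch of such a test, assuming an iptables based firewall on an
EL-style node (restore the rules afterwards) :

iptables -nL            # inspect the current rules
service iptables stop   # temporarily disable the firewall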

another possible explanation is that the compute node is doing (reverse) dns
requests with the public name and/or ip of the head node and that takes
some time to complete (success or failure, this does not really matter here)

/* a simple test is to make sure all the hosts/ip of the head node are in
the /etc/hosts of the compute node */

could you check your network config (firewall and dns) ?

can you reproduce the delay when running mpirun on the head node and with
one mpi task on the compute node ?

if yes, then the hard way to trace the delay issue would be to strace -ttt
both orted and the mpi task that are launched on the compute node and see where
the time is lost.
/* at this stage, i would suspect orted ... */
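
the strace invocation could look like this, once the orted pid on the
compute node is known (the pid and output path are placeholders) :

strace -ttt -f -p <orted pid> -o /tmp/orted.strace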

Cheers,

Gilles

On Mon, Nov 10, 2014 at 5:56 PM, Reuti  wrote:

> Hi,
>
> Am 10.11.2014 um 16:39 schrieb Ralph Castain:
>
> > That is indeed bizarre - we haven’t heard of anything similar from other
> users. What is your network configuration? If you use oob_tcp_if_include or
> exclude, can you resolve the problem?
>
> Thx - this option helped to get it working.
>
> These tests were made for sake of simplicity between the headnode of the
> cluster and one (idle) compute node. I tried then between the (identical)
> compute nodes and this worked fine. The headnode of the cluster and the
> compute node are slightly different though (i.e. number of cores), and
> using eth1 resp. eth0 for the internal network of the cluster.
>
> I tried --hetero-nodes with no change.
>
> Then I turned to:
>
> reuti@annemarie:~> date; mpiexec -mca btl self,tcp --mca
> oob_tcp_if_include 192.168.154.0/26 -n 4 --hetero-nodes --hostfile
> machines ./mpihello; date
>
> and the application started instantly. On another cluster, where the
> headnode is identical to the compute nodes but with the same network setup
> as above, I observed a delay of "only" 30 seconds. Nevertheless, also on
> this cluster the working addition was the correct "oob_tcp_if_include" to
> solve the issue.
>
> The questions which remain: a) is this a targeted behavior, b) what
> changed in this scope between 1.8.1 and 1.8.2?
>
> -- Reuti
>
>
> >
> >> On Nov 10, 2014, at 4:50 AM, Reuti  wrote:
> >>
> >> Am 10.11.2014 um 12:50 schrieb Jeff Squyres (jsquyres):
> >>
> >>> Wow, that's pretty terrible!  :(
> >>>
> >>> Is the behavior BTL-specific, perchance?  E.G., if you only use
> certain BTLs, does the delay disappear?
> >>
> >> You mean something like:
> >>
> >> reuti@annemarie:~> date; mpiexec -mca btl self,tcp -n 4 --hostfile
> machines ./mpihello; date
> >> Mon Nov 10 13:44:34 CET 2014
> >> Hello World from Node 1.
> >> Total: 4
> >> Universe: 4
> >> Hello World from Node 0.
> >> Hello World from Node 3.
> >> Hello World from Node 2.
> >> Mon Nov 10 13:46:42 CET 2014
> >>
> >> (the above was even the latest v1.8.3-186-g978f61d)
> >>
> >> Falling back to 1.8.1 gives (as expected):
> >>
> >> reuti@annemarie:~> date; mpiexec -mca btl self,tcp -n 4 --hostfile
> machines ./mpihello; date
> >> Mon Nov 10 13:49:51 CET 2014
> >> Hello World from Node 1.
> >> Total: 4
> >> Universe: 4
> >> Hello World from Node 0.
> >> Hello World from Node 2.
> >> Hello World from Node 3.
> >> Mon Nov 10 13:49:53 CET 2014
> >>
> >>
> >> -- Reuti
> >>
> >>> FWIW: the use-all-IP interfaces approach has been in OMPI forever.
> >>>
> >>> Sent from my phone. No type good.
> >>>
> >>>> On Nov 10, 2014, at 6:42 AM, Reuti 
> wrote:
> >>>>
> >>>>> Am 10.11.2014 um 12:24 schrieb Reuti:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>>> Am 09.11.2014 um 05:38 schrieb Ralph Castain:
> >>>>>>
> >>>>>> FWIW: during MPI_Init, each process “publishes” all of its
> interfaces. Each process receives a complete map of that info for every
> process in the job. So when the TCP btl sets itself up, it attempts to
> connect across -all- the interfaces

Re: [OMPI users] OMPI users] How OMPI picks ethernet interfaces

2014-11-12 Thread Gilles Gouaillardet
Could you please send the output of netstat -nr on both head and compute node ?
no problem obfuscating the ip of the head node, i am only interested in 
netmasks and routes.

Ralph Castain  wrote:
>
>> On Nov 12, 2014, at 2:45 PM, Reuti  wrote:
>> 
>> Am 12.11.2014 um 17:27 schrieb Reuti:
>> 
>>> Am 11.11.2014 um 02:25 schrieb Ralph Castain:
>>> 
 Another thing you can do is (a) ensure you built with —enable-debug, and 
 then (b) run it with -mca oob_base_verbose 100  (without the 
 tcp_if_include option) so we can watch the connection handshake and see 
 what it is doing. The —hetero-nodes will have not affect here and can be 
 ignored.
>>> 
>>> Done. It really tries to connect to the outside interface of the headnode. 
>>> But being there a firewall or not: the nodes have no clue how to reach 
>>> 137.248.0.0 - they have no gateway to this network at all.
>> 
>> I have to revert this. They think that there is a gateway although it isn't. 
>> When I remove the entry by hand for the gateway in the routing table it 
>> starts up instantly too.
>> 
>> While I can do this on my own cluster I still have the 30 seconds delay on a 
>> cluster where I'm not root, while this can be because of the firewall there. 
>> The gateway on this cluster is indeed going to the outside world.
>> 
>> Personally I find this behavior a little bit too aggressive to use all 
>> interfaces. If you don't check this carefully beforehand and start a long 
>> running application one might even not notice the delay during the startup.
>
>Agreed - do you have any suggestions on how we should choose the order in 
>which to try them? I haven’t been able to come up with anything yet. Jeff has 
>some fancy algo in his usnic BTL that we are going to discuss after SC that 
>I’m hoping will help, but I’d be open to doing something better in the interim 
>for 1.8.4
>
>> 
>> -- Reuti
>> 
>> 
>>> It tries so independent from the internal or external name of the headnode 
>>> given in the machinefile - I hit ^C then. I attached the output of Open MPI 
>>> 1.8.1 for this setup too.
>>> 
>>> -- Reuti
>>> 
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2014/11/25777.php
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25781.php
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25782.php


Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
Hi,

it seems you messed up the command line

could you try

$ mpirun --mca btl ^openib --host compute-01-01,compute-01-06 ring_c


can you also try to run mpirun from a compute node instead of the head
node ?

Cheers,

Gilles

On 2014/11/13 16:07, Syed Ahsan Ali wrote:
> Here is what I see when disabling openib support.\
>
>
> [pmdtest@pmd ~]$ mpirun --host --mca btl ^openib
> compute-01-01,compute-01-06 ring_c
> ssh:  orted: Temporary failure in name resolution
> ssh:  orted: Temporary failure in name resolution
> --
> A daemon (pid 7608) died unexpectedly with status 255 while attempting
> to launch so we are aborting.
>
> While nodes can still ssh each other
>
> [pmdtest@compute-01-01 ~]$ ssh compute-01-06
> Last login: Thu Nov 13 12:05:58 2014 from compute-01-01.private.dns.zone
> [pmdtest@compute-01-06 ~]$
>
>
>
>
> On Thu, Nov 13, 2014 at 12:03 PM, Syed Ahsan Ali  
> wrote:
>>  Hi Jefff
>>
>> No firewall is enabled. Running the diagnostics I found that non
>> communication mpi job is running . While ring_c remains stuck. There
>> are of course warnings for open fabrics but in my case I an running
>> application by disabling openib., Please see below
>>
>>  [pmdtest@pmd ~]$ mpirun --host compute-01-01,compute-01-06 hello_c.out
>> --
>> WARNING: There is at least one OpenFabrics device found but there are
>> no active ports detected (or Open MPI was unable to use them).  This
>> is most certainly not what you wanted.  Check your cables, subnet
>> manager configuration, etc.  The openib BTL will be ignored for this
>> job.
>>   Local host: compute-01-01.private.dns.zone
>> --
>> Hello, world, I am 0 of 2
>> Hello, world, I am 1 of 2
>> [pmd.pakmet.com:06386] 1 more process has sent help message
>> help-mpi-btl-openib.txt / no active ports found
>> [pmd.pakmet.com:06386] Set MCA parameter "orte_base_help_aggregate" to
>> 0 to see all help / error messages
>>
>> [pmdtest@pmd ~]$ mpirun --host compute-01-01,compute-01-06 ring_c
>> --
>> WARNING: There is at least one OpenFabrics device found but there are
>> no active ports detected (or Open MPI was unable to use them).  This
>> is most certainly not what you wanted.  Check your cables, subnet
>> manager configuration, etc.  The openib BTL will be ignored for this
>> job.
>>   Local host: compute-01-01.private.dns.zone
>> --
>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>> Process 0 sent to 1
>> Process 0 decremented value: 9
>> [compute-01-01.private.dns.zone][[54687,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>> connect() to 192.168.108.10 failed: No route to host (113)
>> [pmd.pakmet.com:15965] 1 more process has sent help message
>> help-mpi-btl-openib.txt / no active ports found
>> [pmd.pakmet.com:15965] Set MCA parameter "orte_base_help_aggregate" to
>> 0 to see all help / error messages
>> 
>>
>>
>>
>>
>>
>> On Wed, Nov 12, 2014 at 7:32 PM, Jeff Squyres (jsquyres)
>>  wrote:
>>> Do you have firewalling enabled on either server?
>>>
>>> See this FAQ item:
>>>
>>> 
>>> http://www.open-mpi.org/faq/?category=running#diagnose-multi-host-problems
>>>
>>>
>>>
>>> On Nov 12, 2014, at 4:57 AM, Syed Ahsan Ali  wrote:
>>>
>>>> Dear All
>>>>
>>>> I need your advice. While trying to run mpirun job across nodes I get
>>>> following error. It seems that the two nodes i.e, compute-01-01 and
>>>> compute-01-06 are not able to communicate with each other. While nodes
>>>> see each other on ping.
>>>>
>>>> [pmdtest@pmd ERA_CLM45]$ mpirun -np 16 -hostfile hostlist --mca btl
>>>> ^openib ../bin/regcmMPICLM45 regcm.in
>>>>
>>>> [compute-01-06.private.dns.zone][[48897,1],7][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>>>> connect() to 192.168.108.14 failed: No route to host (113)
>>>> [compute-01-06.private.dns.zone][[48897,1],4][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>>>> connect() to 192.168.108.14 failed: No route to host (113)
>>>> [compute-01-06.private.dns.zone][[48897,1],5]

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
mpirun complains about the 192.168.108.10 ip address, but ping reports a
10.0.0.8 address

is the 192.168.* network a point to point network (for example between a
host and a mic) so two nodes
cannot ping each other via this address ?
/* e.g. from compute-01-01 can you ping the 192.168.108.* ip address of
compute-01-06 ? */

could you also run

mpirun --mca btl ^openib --host compute-01-01,compute-01-06 --mca
btl_tcp_if_include 10.0.0.0/8 ring_c

and see whether it helps ?


On 2014/11/13 16:24, Syed Ahsan Ali wrote:
> Same result in both cases
>
> [pmdtest@pmd ~]$ mpirun --mca btl ^openib --host
> compute-01-01,compute-01-06 ring_c
> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
> Process 0 sent to 1
> Process 0 decremented value: 9
> [compute-01-01.private.dns.zone][[47139,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
>
>
> [pmdtest@compute-01-01 ~]$ mpirun --mca btl ^openib --host
> compute-01-01,compute-01-06 ring_c
> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
> Process 0 sent to 1
> Process 0 decremented value: 9
> [compute-01-01.private.dns.zone][[11064,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 192.168.108.10 failed: No route to host (113)
>
>
> On Thu, Nov 13, 2014 at 12:11 PM, Gilles Gouaillardet
>  wrote:
>> Hi,
>>
>> it seems you messed up the command line
>>
>> could you try
>>
>> $ mpirun --mca btl ^openib --host compute-01-01,compute-01-06 ring_c
>>
>>
>> can you also try to run mpirun from a compute node instead of the head
>> node ?
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2014/11/13 16:07, Syed Ahsan Ali wrote:
>>> Here is what I see when disabling openib support.\
>>>
>>>
>>> [pmdtest@pmd ~]$ mpirun --host --mca btl ^openib
>>> compute-01-01,compute-01-06 ring_c
>>> ssh:  orted: Temporary failure in name resolution
>>> ssh:  orted: Temporary failure in name resolution
>>> --
>>> A daemon (pid 7608) died unexpectedly with status 255 while attempting
>>> to launch so we are aborting.
>>>
>>> While nodes can still ssh each other
>>>
>>> [pmdtest@compute-01-01 ~]$ ssh compute-01-06
>>> Last login: Thu Nov 13 12:05:58 2014 from compute-01-01.private.dns.zone
>>> [pmdtest@compute-01-06 ~]$
>>>
>>>
>>>
>>>
>>> On Thu, Nov 13, 2014 at 12:03 PM, Syed Ahsan Ali  
>>> wrote:
>>>>  Hi Jefff
>>>>
>>>> No firewall is enabled. Running the diagnostics I found that non
>>>> communication mpi job is running . While ring_c remains stuck. There
>>>> are of course warnings for open fabrics but in my case I an running
>>>> application by disabling openib., Please see below
>>>>
>>>>  [pmdtest@pmd ~]$ mpirun --host compute-01-01,compute-01-06 hello_c.out
>>>> --
>>>> WARNING: There is at least one OpenFabrics device found but there are
>>>> no active ports detected (or Open MPI was unable to use them).  This
>>>> is most certainly not what you wanted.  Check your cables, subnet
>>>> manager configuration, etc.  The openib BTL will be ignored for this
>>>> job.
>>>>   Local host: compute-01-01.private.dns.zone
>>>> --
>>>> Hello, world, I am 0 of 2
>>>> Hello, world, I am 1 of 2
>>>> [pmd.pakmet.com:06386] 1 more process has sent help message
>>>> help-mpi-btl-openib.txt / no active ports found
>>>> [pmd.pakmet.com:06386] Set MCA parameter "orte_base_help_aggregate" to
>>>> 0 to see all help / error messages
>>>>
>>>> [pmdtest@pmd ~]$ mpirun --host compute-01-01,compute-01-06 ring_c
>>>> --
>>>> WARNING: There is at least one OpenFabrics device found but there are
>>>> no active ports detected (or Open MPI was unable to use them).  This
>>>> is most certainly not what you wanted.  Check your cables, subnet
>>>> manager configuration, etc.  The openib BTL will be ignored for this
>>>> job.
>>>>   Local host: compute-01-01.private.dns.zone
>>>> --
>>>> Process 0 

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
--mca btl ^openib
disables the openib btl, which is native infiniband only.

ib0 is treated like any TCP interface and is then handled by the tcp btl

another option is for you to use
--mca btl_tcp_if_exclude ib0

On 2014/11/13 16:43, Syed Ahsan Ali wrote:
> You are right it is running on 10.0.0.0 interface [pmdtest@pmd ~]$
> mpirun --mca btl ^openib --host compute-01-01,compute-01-06 --mca
> btl_tcp_if_include 10.0.0.0/8 ring_c
> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
> Process 0 sent to 1
> Process 0 decremented value: 9
> Process 0 decremented value: 8
> Process 0 decremented value: 7
> Process 0 decremented value: 6
> Process 1 exiting
> Process 0 decremented value: 5
> Process 0 decremented value: 4
> Process 0 decremented value: 3
> Process 0 decremented value: 2
> Process 0 decremented value: 1
> Process 0 decremented value: 0
> Process 0 exiting
> [pmdtest@pmd ~]$
>
> While the ip addresses 192.168.108* are for ib interface.
>
>  [root@compute-01-01 ~]# ifconfig
> eth0  Link encap:Ethernet  HWaddr 00:24:E8:59:4C:2A
>   inet addr:10.0.0.3  Bcast:10.255.255.255  Mask:255.0.0.0
>   inet6 addr: fe80::224:e8ff:fe59:4c2a/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:65588 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:14184 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000
>   RX bytes:18692977 (17.8 MiB)  TX bytes:1834122 (1.7 MiB)
>   Interrupt:169 Memory:dc00-dc012100
> ib0   Link encap:InfiniBand  HWaddr
> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>   inet addr:192.168.108.14  Bcast:192.168.108.255  Mask:255.255.255.0
>   UP BROADCAST MULTICAST  MTU:65520  Metric:1
>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:256
>   RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>
>
>
> So the point is why mpirun is following the ib  path while I it has
> been disabled. Possible solutions?
>
> On Thu, Nov 13, 2014 at 12:32 PM, Gilles Gouaillardet
>  wrote:
>> mpirun complains about the 192.168.108.10 ip address, but ping reports a
>> 10.0.0.8 address
>>
>> is the 192.168.* network a point to point network (for example between a
>> host and a mic) so two nodes
>> cannot ping each other via this address ?
>> /* e.g. from compute-01-01 can you ping the 192.168.108.* ip address of
>> compute-01-06 ? */
>>
>> could you also run
>>
>> mpirun --mca btl ^openib --host compute-01-01,compute-01-06 --mca
>> btl_tcp_if_include 10.0.0.0/8 ring_c
>>
>> and see whether it helps ?
>>
>>
>> On 2014/11/13 16:24, Syed Ahsan Ali wrote:
>>> Same result in both cases
>>>
>>> [pmdtest@pmd ~]$ mpirun --mca btl ^openib --host
>>> compute-01-01,compute-01-06 ring_c
>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>> Process 0 sent to 1
>>> Process 0 decremented value: 9
>>> [compute-01-01.private.dns.zone][[47139,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>>> connect() to 192.168.108.10 failed: No route to host (113)
>>>
>>>
>>> [pmdtest@compute-01-01 ~]$ mpirun --mca btl ^openib --host
>>> compute-01-01,compute-01-06 ring_c
>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>> Process 0 sent to 1
>>> Process 0 decremented value: 9
>>> [compute-01-01.private.dns.zone][[11064,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>>> connect() to 192.168.108.10 failed: No route to host (113)
>>>
>>>
>>> On Thu, Nov 13, 2014 at 12:11 PM, Gilles Gouaillardet
>>>  wrote:
>>>> Hi,
>>>>
>>>> it seems you messed up the command line
>>>>
>>>> could you try
>>>>
>>>> $ mpirun --mca btl ^openib --host compute-01-01,compute-01-06 ring_c
>>>>
>>>>
>>>> can you also try to run mpirun from a compute node instead of the head
>>>> node ?
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> On 2014/11/13 16:07, Syed Ahsan Ali wrote:
>>>>> Here is what I see when disabling openib support.\
>>>>>
>>>>>
>>>>> [pmdtest@pmd ~]$ mpirun --host --mca btl ^openib
>>>>> compute-01-01,compute-01-06 ring_c
>>>>> ssh:  orted: Temporary failure in name resolution
>>>>> ssh: 

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
This is really weird ?

is the loopback interface up and running on both nodes and with the same
ip ?

can you run on both compute nodes ?
netstat -nr


On 2014/11/13 16:50, Syed Ahsan Ali wrote:
> Now it looks through the loopback address
>
> [pmdtest@pmd ~]$ mpirun --host compute-01-01,compute-01-06 --mca
> btl_tcp_if_exclude ib0 ring_c
> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
> [compute-01-01.private.dns.zone][[37713,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
> connect() to 127.0.0.1 failed: Connection refused (111)
> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
> [pmd.pakmet.com:30867] 1 more process has sent help message
> help-mpi-btl-openib.txt / no active ports found
> [pmd.pakmet.com:30867] Set MCA parameter "orte_base_help_aggregate" to
> 0 to see all help / error messages
>
>
>
> On Thu, Nov 13, 2014 at 12:46 PM, Gilles Gouaillardet
>  wrote:
>> --mca btl ^openib
>> disables the openib btl, which is native infiniband only.
>>
>> ib0 is treated as any TCP interface and then handled by the tcp btl
>>
>> an other option is you to use
>> --mca btl_tcp_if_exclude ib0
>>
>> On 2014/11/13 16:43, Syed Ahsan Ali wrote:
>>> You are right it is running on 10.0.0.0 interface [pmdtest@pmd ~]$
>>> mpirun --mca btl ^openib --host compute-01-01,compute-01-06 --mca
>>> btl_tcp_if_include 10.0.0.0/8 ring_c
>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>> Process 0 sent to 1
>>> Process 0 decremented value: 9
>>> Process 0 decremented value: 8
>>> Process 0 decremented value: 7
>>> Process 0 decremented value: 6
>>> Process 1 exiting
>>> Process 0 decremented value: 5
>>> Process 0 decremented value: 4
>>> Process 0 decremented value: 3
>>> Process 0 decremented value: 2
>>> Process 0 decremented value: 1
>>> Process 0 decremented value: 0
>>> Process 0 exiting
>>> [pmdtest@pmd ~]$
>>>
>>> While the ip addresses 192.168.108* are for ib interface.
>>>
>>>  [root@compute-01-01 ~]# ifconfig
>>> eth0  Link encap:Ethernet  HWaddr 00:24:E8:59:4C:2A
>>>   inet addr:10.0.0.3  Bcast:10.255.255.255  Mask:255.0.0.0
>>>   inet6 addr: fe80::224:e8ff:fe59:4c2a/64 Scope:Link
>>>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>   RX packets:65588 errors:0 dropped:0 overruns:0 frame:0
>>>   TX packets:14184 errors:0 dropped:0 overruns:0 carrier:0
>>>   collisions:0 txqueuelen:1000
>>>   RX bytes:18692977 (17.8 MiB)  TX bytes:1834122 (1.7 MiB)
>>>   Interrupt:169 Memory:dc00-dc012100
>>> ib0   Link encap:InfiniBand  HWaddr
>>> 80:00:00:48:FE:80:00:00:00:00:00:00:00:00:00:00:00:00:00:00
>>>   inet addr:192.168.108.14  Bcast:192.168.108.255  
>>> Mask:255.255.255.0
>>>   UP BROADCAST MULTICAST  MTU:65520  Metric:1
>>>   RX packets:0 errors:0 dropped:0 overruns:0 frame:0
>>>   TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
>>>   collisions:0 txqueuelen:256
>>>   RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
>>>
>>>
>>>
>>> So the point is why mpirun is following the ib  path while I it has
>>> been disabled. Possible solutions?
>>>
>>> On Thu, Nov 13, 2014 at 12:32 PM, Gilles Gouaillardet
>>>  wrote:
>>>> mpirun complains about the 192.168.108.10 ip address, but ping reports a
>>>> 10.0.0.8 address
>>>>
>>>> is the 192.168.* network a point to point network (for example between a
>>>> host and a mic) so two nodes
>>>> cannot ping each other via this address ?
>>>> /* e.g. from compute-01-01 can you ping the 192.168.108.* ip address of
>>>> compute-01-06 ? */
>>>>
>>>> could you also run
>>>>
>>>> mpirun --mca btl ^openib --host compute-01-01,compute-01-06 --mca
>>>> btl_tcp_if_include 10.0.0.0/8 ring_c
>>>>
>>>> and see whether it helps ?
>>>>
>>>>
>>>> On 2014/11/13 16:24, Syed Ahsan Ali wrote:
>>>>> Same result in both cases
>>>>>
>>>>> [pmdtest@pmd ~]$ mpirun --mca btl ^openib --host
>>>>> compute-01-01,compute-01-06 ring_c
>>>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>>>> Process 0 sent to 1
>>>>> Process 0 decremented value: 9
>>>>> [compute-01-

Re: [OMPI users] mpirun fails across nodes

2014-11-13 Thread Gilles Gouaillardet
but it is running on your head node, isn't it ?

you might want to double check why there is no loopback interface on
your compute nodes.
in the meantime, you can exclude both the lo and ib0 interfaces
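
for example (a sketch, adjust the interface names to your cluster) :

mpirun --mca btl ^openib --mca btl_tcp_if_exclude lo,ib0 --host compute-01-01,compute-01-06 ring_c

note that when you override btl_tcp_if_exclude you must keep lo in the list,
otherwise the loopback interface becomes eligible again (which is what
happened in your previous run)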

Cheers,

Gilles

On 2014/11/13 16:59, Syed Ahsan Ali wrote:
>  I don't see it running
>
> [pmdtest@compute-01-01 ~]$ netstat -nr
> Kernel IP routing table
> Destination Gateway Genmask Flags   MSS Window  irtt Iface
> 192.168.108.0   0.0.0.0 255.255.255.0   U 0 0  0 ib0
> 169.254.0.0 0.0.0.0 255.255.0.0 U 0 0  0 ib0
> 239.0.0.0   0.0.0.0 255.0.0.0   U 0 0  0 eth0
> 10.0.0.00.0.0.0 255.0.0.0   U 0 0  0 eth0
> 0.0.0.0 10.0.0.10.0.0.0 UG0 0  0 eth0
> [pmdtest@compute-01-01 ~]$ exit
> logout
> Connection to compute-01-01 closed.
> [pmdtest@pmd ~]$ ssh compute-01-06
> Last login: Thu Nov 13 12:06:14 2014 from compute-01-01.private.dns.zone
> [pmdtest@compute-01-06 ~]$ netstat -nr
> Kernel IP routing table
> Destination Gateway Genmask Flags   MSS Window  irtt Iface
> 192.168.108.0   0.0.0.0 255.255.255.0   U 0 0  0 ib0
> 169.254.0.0 0.0.0.0 255.255.0.0 U 0 0  0 ib0
> 239.0.0.0   0.0.0.0 255.0.0.0   U 0 0  0 eth0
> 10.0.0.00.0.0.0 255.0.0.0   U 0 0  0 eth0
> 0.0.0.0 10.0.0.10.0.0.0 UG0 0  0 eth0
> [pmdtest@compute-01-06 ~]$
> 
>
> On Thu, Nov 13, 2014 at 12:56 PM, Gilles Gouaillardet
>  wrote:
>> This is really weird ?
>>
>> is the loopback interface up and running on both nodes and with the same
>> ip ?
>>
>> can you run on both compute nodes ?
>> netstat -nr
>>
>>
>> On 2014/11/13 16:50, Syed Ahsan Ali wrote:
>>> Now it looks through the loopback address
>>>
>>> [pmdtest@pmd ~]$ mpirun --host compute-01-01,compute-01-06 --mca
>>> btl_tcp_if_exclude ib0 ring_c
>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>> [compute-01-01.private.dns.zone][[37713,1],0][btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect]
>>> connect() to 127.0.0.1 failed: Connection refused (111)
>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>> [pmd.pakmet.com:30867] 1 more process has sent help message
>>> help-mpi-btl-openib.txt / no active ports found
>>> [pmd.pakmet.com:30867] Set MCA parameter "orte_base_help_aggregate" to
>>> 0 to see all help / error messages
>>>
>>>
>>>
>>> On Thu, Nov 13, 2014 at 12:46 PM, Gilles Gouaillardet
>>>  wrote:
>>>> --mca btl ^openib
>>>> disables the openib btl, which is native infiniband only.
>>>>
>>>> ib0 is treated as any TCP interface and then handled by the tcp btl
>>>>
>>>> an other option is you to use
>>>> --mca btl_tcp_if_exclude ib0
>>>>
>>>> On 2014/11/13 16:43, Syed Ahsan Ali wrote:
>>>>> You are right it is running on 10.0.0.0 interface [pmdtest@pmd ~]$
>>>>> mpirun --mca btl ^openib --host compute-01-01,compute-01-06 --mca
>>>>> btl_tcp_if_include 10.0.0.0/8 ring_c
>>>>> Process 0 sending 10 to 1, tag 201 (2 processes in ring)
>>>>> Process 0 sent to 1
>>>>> Process 0 decremented value: 9
>>>>> Process 0 decremented value: 8
>>>>> Process 0 decremented value: 7
>>>>> Process 0 decremented value: 6
>>>>> Process 1 exiting
>>>>> Process 0 decremented value: 5
>>>>> Process 0 decremented value: 4
>>>>> Process 0 decremented value: 3
>>>>> Process 0 decremented value: 2
>>>>> Process 0 decremented value: 1
>>>>> Process 0 decremented value: 0
>>>>> Process 0 exiting
>>>>> [pmdtest@pmd ~]$
>>>>>
>>>>> While the ip addresses 192.168.108* are for ib interface.
>>>>>
>>>>>  [root@compute-01-01 ~]# ifconfig
>>>>> eth0  Link encap:Ethernet  HWaddr 00:24:E8:59:4C:2A
>>>>>   inet addr:10.0.0.3  Bcast:10.255.255.255  Mask:255.0.0.0
>>>>>   inet6 addr: fe80::224:e8ff:fe59:4c2a/64 Scope:Link
>>>>>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>>>>>   RX packets:65588 errors:0 dropped:0 overruns:0 frame:0
>>>>>   TX packets:141

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-13 Thread Gilles Gouaillardet
My 0.02 US$

first, the root cause of the problem was that a default gateway was
configured on the node, but this gateway was unreachable.
imho, this is an incorrect system setting that can lead to unpredictable
results :
- openmpi 1.8.1 works (you are lucky, good for you)
- openmpi 1.8.3 fails (no luck this time, too bad)
so i believe it is incorrect to blame openmpi for this.
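
for the record, checking and removing such a stale route on a node would
look like this (run as root, illustration only) :

netstat -nr             # the bogus default route shows up here
ip route del default    # drop it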

that being said, you raise some good points about how to improve user
friendliness for end users
who have limited skills and/or interest in OpenMPI and system
administration.

basically, i agree with Gus. HPC is complex, not all clusters are the same
and imho some minimal config/tuning cannot be avoided to get OpenMPI
working,
or operating at full speed.


let me give a few examples :

you recommend OpenMPI use only the interfaces that match the
hostnames in the machinefile.
what if you submit from the head node ? should you use the interface
that matches the hostname ?
what if this interface is the public interface, there is a firewall
and/or compute nodes have no default gateway ?
that will simply not work ...
so mpirun needs to pass orted all its interfaces.
which one should be picked by orted ?
- the first one ? it might be the unreachable public interface ...
- the one on the same subnet ? what if none is on the same subnet ?
  on the cluster i am working on, the eth0 interfaces are in different subnets, ib0 is on
a single subnet
  and i do *not* want to use ib0. but on some other clusters, the
ethernet network is so cheap
  they *want* to use ib0.

on your cluster, you want to use eth0 for oob and mpi, and eth1 for NFS.
that is legitimate.
in my case, i want to use eth0 (gigE) for oob and eth2 (10gigE) for MPI.
that is legitimate too.
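
a minimal sketch of that per-cluster tuning, via the MCA parameter file
(typically $prefix/etc/openmpi-mca-params.conf, interface names are only
examples) :

oob_tcp_if_include = eth0
btl_tcp_if_include = eth2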

we both want OpenMPI to work *and* perform at its best out of the box.
it is a good thing to have high expectations, but they might not all be met.

i'd rather implement some pre-defined policies that rule how ethernet
interfaces should be picked,
and add a FAQ entry that mentions : if it does not work (or does not work as
fast as expected) out of the box, you should
first try another policy.

then the next legitimate question will be "what is the default policy" ?
regardless of the answer, it will be good for some and bad for others.


imho, posting a mail to the OMPI users mailing list was the right thing
to do :
- you got help on how to troubleshoot and fix the issue
- we got some valuable feedback on end users' expectations.

Cheers,

Gilles

On 2014/11/14 3:36, Gus Correa wrote:
> On 11/13/2014 11:14 AM, Ralph Castain wrote:
>> Hmmm…I’m beginning to grok the issue. It is a tad unusual for people to
>> assign different hostnames to their interfaces - I’ve seen it in the
>> Hadoop world, but not in HPC. Still, no law against it.
>
> No, not so unusual.
> I have clusters from respectable vendors that come with
> /etc/hosts for name resolution of the various interfaces.
> If I remember right, Rocks clusters also does that (or actually
> allow the sys admin to setup additional networks and at that point
> will append /etc/hosts with the additional names, or perhaps put those
> names in DHCP).
> I am not so familiar to xcat, but I think it has similar DHCP
> functionality, or maybe DNS on the head node.
>
> Having said that, I don't think this is an obstacle to setting up the
> right "if_include/if_exlculde" choices (along with the btl, oob, etc),
> for each particular cluster in the mca parameter configuration file.
> That is what my parallel conversation with Reuti was about.
>
> I believe the current approach w.r.t. interfaces:
> "use everythint, let the sysadmin/user restrict as
> (s)he sees fit" is both a wise and flexible way to do it.
> Guessing the "right interface to use" sounds risky to me (wrong
> choices may happen), and a bit of a cast.
>
>>
>> This will take a little thought to figure out a solution. One problem
>> that immediately occurs is if someone includes a hostfile that has lines
>> which refer to the same physical server, but using different interface
>> names. We’ll think those are completely distinct servers, and so the
>> process placement will be totally messed up.
>>
>
> Sure, and besides this, there will be machines with
> inconsistent/wrong/conflicting name resolution schemes
> that the current OMPI approach simply (and wisely) ignores.
>
>
>> We’ll also encounter issues with the daemon when it reports back, as the
>> hostname it gets will almost certainly differ from the hostname we were
>> expecting. Not as critical, but need to check to see where that will
>> impact the code base
>>
>
> I'm sure that will happen.
> Torque uses hostname by default for several things, and it can be a
> configuration nightmare to workaround that when what hostnam

Re: [OMPI users] OMPI users] error building openmpi-dev-274-g2177f9e withgcc-4.9.2

2014-11-16 Thread Gilles Gouaillardet
Siegmar,

This is correct, --enable-heterogeneous is now fixed in the trunk.

Please also note that -D_REENTRANT is now automatically set on solaris

Cheers

Gilles

Siegmar Gross  wrote:
>Hi Jeff, hi Ralph,
>
>> This issue should now be fixed, too.
>
>Yes, it is. Thank you very much for your help. It seems that even
>--enable-heterogeneous is working once more (Gilles told me last
>month that it was broken in the trunk), because I could successfully
>multiply a small matrix in my heterogeneous environment.
>
>
>Kind regards and thank you very much once more
>
>Siegmar
>
>
>
>> On Nov 14, 2014, at 12:04 PM, Siegmar Gross 
>>  wrote:
>> 
>> > Hi,
>> > 
>> > today I tried to install openmpi-dev-274-g2177f9e on my machines
>> > (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1
>> > x86_64) with gcc-4.9.2 and got the following error on all three
>> > platforms.
>> > 
>> > tyr openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc 117 tail -25 
>> > log.make.Linux.x86_64.64_gcc
>> >  SED  mpi/man/man3/MPI_Wtime.3
>> >  SED  mpi/man/man3/OpenMPI.3
>> > make[2]: Leaving directory 
>`/export2/src/openmpi-1.9/openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc/ompi'
>> > Making all in mpi/cxx
>> > make[2]: Entering directory 
>`/export2/src/openmpi-1.9/openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc/ompi/mpi/cxx'
>> >  CXX  mpicxx.lo
>> > In file included from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/mca/rte/orte/rte_orte.h:33:0,
>> > from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/mca/rte/rte.h:195,
>> > from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/errhandler/errhandler.h:34,
>> > from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/mpi/cxx/mpicxx.cc:37:
>> > ../../../../openmpi-dev-274-g2177f9e/orte/mca/routed/routed.h:52:8: error: 
>> > using typedef-name 
>'orte_process_name_t' after 'struct'
>> > struct orte_process_name_t;
>> >^
>> > In file included from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/mca/rte/orte/rte_orte.h:29:0,
>> > from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/mca/rte/rte.h:195,
>> > from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/errhandler/errhandler.h:34,
>> > from 
>> > ../../../../openmpi-dev-274-g2177f9e/ompi/mpi/cxx/mpicxx.cc:37:
>> > ../../../../openmpi-dev-274-g2177f9e/orte/include/orte/types.h:102:29: 
>> > note: 'orte_process_name_t' 
>has a previous declaration here
>> > typedef opal_process_name_t orte_process_name_t;
>> > ^
>> > make[2]: *** [mpicxx.lo] Error 1
>> > make[2]: Leaving directory 
>`/export2/src/openmpi-1.9/openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc/ompi/mpi/cxx'
>> > make[1]: *** [all-recursive] Error 1
>> > make[1]: Leaving directory 
>`/export2/src/openmpi-1.9/openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc/ompi'
>> > make: *** [all-recursive] Error 1
>> > tyr openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc 118 
>> > 
>> > 
>> > I used the following configure command.
>> > 
>> > tyr openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc 118 head config.log | 
>> > grep openmpi
>> >  $ ../openmpi-dev-274-g2177f9e/configure 
>> > --prefix=/usr/local/openmpi-1.9.0_64_gcc 
>--libdir=/usr/local/openmpi-1.9.0_64_gcc/lib64 
>> > --with-jdk-bindir=/usr/local/jdk1.8.0/bin 
>> > --with-jdk-headers=/usr/local/jdk1.8.0/include 
>JAVA_HOME=/usr/local/jdk1.8.0 LDFLAGS=-m64 
>> > CC=gcc CXX=g++ FC=gfortran CFLAGS=-m64 -D_REENTRANT CXXFLAGS=-m64 
>> > FCFLAGS=-m64 CPP=cpp CXXCPP=cpp 
>CPPFLAGS= -D_REENTRANT 
>> > CXXCPPFLAGS= --enable-mpi-cxx --enable-cxx-exceptions --enable-mpi-java 
>--enable-mpi-thread-multiple --with-threads=posix 
>> > --with-hwloc=internal --without-verbs --with-wrapper-cflags=-std=c11 -m64 
>--with-wrapper-cxxflags=-m64 --enable-debug
>> > tyr openmpi-dev-274-g2177f9e-Linux.x86_64.64_gcc 119 
>> > 
>> > 
>> > I would be grateful. if somebody can fix the problem. Thank
>> > you very much for any help in advance.
>> > 
>> > 
>> > Kind regards
>> > 
>> > Siegmar
>> > 
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> > Link to this post: 
>> > http://www.open-mpi.org/community/lists/users/2014/11/25815.php
>> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25819.php


Re: [OMPI users] Fortran and OpenMPI 1.8.3 compiled with Intel-15 does nothing silently

2014-11-17 Thread Gilles Gouaillardet
Hi John,

do you MPI_Init() or do you MPI_Init_thread(MPI_THREAD_MULTIPLE) ?

does your program call MPI anywhere from an OpenMP region ?
does your program call MPI only within an !$OMP MASTER section ?
or does your program not invoke MPI at all from any OpenMP region ?

can you reproduce this issue with a simple fortran program ? or can you
publish all your files ?

Cheers,

Gilles

On 2014/11/18 1:41, John Bray wrote:
> I have succesfully been using OpenMPI 1.8.3 compiled with Intel-14, using
>
> ./configure --prefix=/usr/local/mpi/$(basename $PWD) --with-threads=posix
> --enable-mpi-thread-multiple --disable-vt --with-scif=no
>
> I have now switched to Intel 15.0.1, and configuring with the same options,
> I get minor changes in config.log about warning spotting, but it makes all
> the binaries, and I can compile my own fortran code with mpif90/mpicc
>
> but a command 'mpiexec --verbose -n 12 ./fortran_binary' does nothing
>
> I checked the FAQ and started using
>
> ./configure --prefix=/usr/local/mpi/$(basename $PWD) --with-threads=posix
> --enable-mpi-thread-multiple --disable-vt --with-scif=no CC=icc CXX=icpc
> F77=ifort FC=ifort
>
> but that makes no difference.
>
> Only with -d do I get any more information
>
> mpirun -d --verbose -n 12
> /home/jbray/5.0/mic2/one/intel-15_openmpi-1.8.3/one_f_debug.exe
> [mic2:21851] procdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0/0
> [mic2:21851] jobdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0
> [mic2:21851] top: openmpi-sessions-jbray@mic2_0
> [mic2:21851] tmp: /tmp
> [mic2:21851] sess_dir_cleanup: job session dir does not exist
> [mic2:21851] procdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0/0
> [mic2:21851] jobdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0
> [mic2:21851] top: openmpi-sessions-jbray@mic2_0
> [mic2:21851] tmp: /tmp
> [mic2:21851] sess_dir_finalize: proc session dir does not exist
> <12 times>
>
>
> [mic2:21851] sess_dir_cleanup: job session dir does not exist
> exiting with status 139
>
> My C codes do not have this problem
>
> Compiler options are
>
> mpicxx -g -O0 -fno-inline-functions -openmp -o one_c_debug.exe async.c
> collective.c compute.c memory.c one.c openmp.c p2p.c variables.c
> auditmpi.c   control.c inout.c perfio.c ring.c wave.c io.c   leak.c mpiio.c
> pthreads.c -openmp -lpthread
>
> mpif90 -g -O0  -fno-inline-functions -openmp -o one_f_debug.exe control.o
> io.f90 leak.f90 memory.f90 one.f90 ring.f90 slow.f90 swapbounds.f90
> variables.f90 wave.f90 *.F90 -openmp
>
> Any suggestions as to what is upsetting Fortran with Intel-15
>
> John
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25823.php



Re: [OMPI users] collective algorithms

2014-11-17 Thread Gilles Gouaillardet
Daniel,

you can run
$ ompi_info --parseable --all | grep _algorithm: | grep enumerator

that will give you the list of supported algorithms for the collectives;
here is a sample output :

mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:0:ignore
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:1:basic_linear
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:2:nonoverlapping
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:3:recursive_doubling
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:4:ring
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:5:segmented_ring


the decision (which algorithm is used based on communicator size/message
size/...) is made in
ompi/mca/coll/tuned/coll_tuned_decision_fixed.c
and can be overridden via a config file or an environment variable
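
for example, forcing the ring algorithm for MPI_Allreduce via environment
variables could look like this (a sketch, and iirc
coll_tuned_use_dynamic_rules must be set so the forced value is honored) :

export OMPI_MCA_coll_tuned_use_dynamic_rules=1
export OMPI_MCA_coll_tuned_allreduce_algorithm=4
mpirun -np 16 ./a.out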

i cannot point you to a paper, and hopefully someone else will

Cheers,

Gilles


On 2014/11/18 12:53, Faraj, Daniel A wrote:
> I am trying to survey the collective algorithms in Open MPI.
> I looked at the src code but could not make out the guts of the communication 
> algorithms.
> There are some open mpi papers but not detailed, where they talk about what 
> algorithms are using in certain collectives.
> Has anybody done this sort of work, or point me to a paper?
>
> Basically, for a given collective operation, what are:
>
> a)  Communication algorithm being used for a given criteria (i.e. message 
> size or np)
>
> b)  What is theoretical algorithm cost
>
> Thanx
>
>
> ---
> Daniel Faraj
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25831.php



Re: [OMPI users] MPI_Neighbor_alltoallw fails with mpi-1.8.3

2014-11-21 Thread Gilles Gouaillardet
Hi Ghislain,

that sounds like a bug in MPI_Dist_graph_create :-(

you can use MPI_Dist_graph_create_adjacent instead :

MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, &targets[0],
&weights[0],
degrees, &targets[0], &weights[0], info,
rankReordering, &commGraph);

it does not crash and as far as i understand, it produces correct results,

according to the mpi standard (example 7.3), that should do the same
thing, which is why
i think there is a bug in MPI_Dist_graph_create

Cheers,

Gilles



On 2014/11/21 2:21, Howard Pritchard wrote:
> Hi Ghislain,
>
> I tried to run your test with mvapich 1.9 and get a "message truncated"
> failure at three ranks.
>
> Howard
>
>
> 2014-11-20 8:51 GMT-07:00 Ghislain Viguier :
>
>> Dear support,
>>
>> I'm encountering an issue with the MPI_Neighbor_alltoallw request of
>> mpi-1.8.3.
>> I have enclosed a test case with information of my workstation.
>>
>> In this test, I define a weighted topology for 5 processes, where the
>> weight represent the number of buffers to send/receive :
>> rank
>>   0 : | x |
>>   1 : | 2 | x |
>>   2 : | 1 | 1 | x |
>>   3 : | 3 | 2 | 3 | x |
>>   4 : | 5 | 2 | 2 | 2 | x |
>>
>> In this topology, the rank 1 will send/receive :
>>2 buffers to/from the rank 0,
>>1 buffer to/from the rank 2,
>>2 buffers to/from the rank 3,
>>2 buffers to/from the rank 4,
>>
>> The send buffer are defined with the MPI_Type_create_hindexed_block. This
>> allows to use a same buffer for several communications without duplicating
>> it (read only).
>> Here the rank 1 will have 2 send buffers (the max of 2, 1, 2, 2).
>> The receiver buffer is a contiguous buffer defined with
>> MPI_Type_contiguous request.
>> Here, the receiver buffer of the rank 1 is of size : 7 (2+1+2+2)
>>
>> This test case succesful for 2 or 3 processes. For 4 processes, the test
>> fails 1 times for 3 successes. For 5 processes, the test fails all the time.
>>
>> The error code is : *** MPI_ERR_IN_STATUS: error code in status
>>
>> I don't understand what I am doing wrong.
>>
>> Could you please have a look on it?
>>
>> Thank you very much.
>>
>> Best regards,
>> Ghislain Viguier
>>
>> --
>> Ghislain Viguier
>> Tél. 06 31 95 03 17
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/11/25850.php
>>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25852.php

diff --git a/ompi/mca/coll/basic/coll_basic_neighbor_alltoallw.c 
b/ompi/mca/coll/basic/coll_basic_neighbor_alltoallw.c
index 28ecf04..4069212 100644
--- a/ompi/mca/coll/basic/coll_basic_neighbor_alltoallw.c
+++ b/ompi/mca/coll/basic/coll_basic_neighbor_alltoallw.c
@@ -181,7 +181,7 @@ mca_coll_basic_neighbor_alltoallw_dist_graph(const void 
*sbuf, const int scounts
 /* post all receives first */
 for (neighbor = 0, reqs = basic_module->mccb_reqs ; neighbor < indegree ; 
++neighbor) {
 rc = MCA_PML_CALL(irecv((char *) rbuf + rdisps[neighbor], 
rcounts[neighbor], rdtypes[neighbor],
-inedges[neighbor], MCA_COLL_BASE_TAG_ALLTOALL, 
comm, reqs++));
+outedges[neighbor], 
MCA_COLL_BASE_TAG_ALLTOALL, comm, reqs++));
 if (OMPI_SUCCESS != rc) break;
 }

//
// Name: 027_MPI_Neighbor_alltoallw_synthetic.cpp
// Author  : 
// Version :
// Copyright   : Your copyright notice
// Description : Hello World in C++, Ansi-style
//

#include 
#include 
#include 
#include 
#include 
#include 
#include 

using namespace std;

int main(int argc, char *argv[]) {

	const int sendBufferSize = 1;

	///   MPI initialization   ///

	int ierr;
	int nbProc;
	int rank;
	ierr = MPI_Init(&argc, &argv);
	assert(!ierr);
	ierr = MPI_Comm_size(MPI_COMM_WORLD, &nbProc);
	assert(!ierr);
	ierr = MPI_Comm_rank(MPI_COMM_WORLD, &rank);
	assert(!ierr);

	assert(nbProc <= 5);

	///   weighted topology   ///
	//   0  | x |
	//   1  | 2 | x |
	//   2  | 1 | 1 | x |
	//   3  | 3 | 2 | 3 | x |
	//   4  |

Re: [OMPI users] MPI_Neighbor_alltoallw fails with mpi-1.8.3

2014-11-21 Thread Gilles Gouaillardet
Ghislain,

i can confirm there is a bug in mca_topo_base_dist_graph_distribute

FYI a proof of concept is available at
https://github.com/open-mpi/ompi/pull/283
and i recommend you use MPI_Dist_graph_create_adjacent if this meets
your needs.

as a side note, the right way to set the info is
MPI_Info info = MPI_INFO_NULL;

/* mpich is more picky and crashes with info = NULL */

Cheers,

Gilles

On 2014/11/21 18:21, Ghislain Viguier wrote:
> Hi Gilles and Howard,
>
> The use of MPI_Dist_graph_create_adjacent solves the issue :)
>
> Thanks for your help!
>
> Best reagrds,
> Ghislain
>
> 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet > :
>>  Hi Ghislain,
>>
>> that sound like a but in MPI_Dist_graph_create :-(
>>
>> you can use MPI_Dist_graph_create_adjacent instead :
>>
>> MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD, degrees, &targets[0],
>> &weights[0],
>> degrees, &targets[0], &weights[0], info,
>> rankReordering, &commGraph);
>>
>> it does not crash and as far as i understand, it produces correct results,
>>
>> according the the mpi standard (example 7.3) that should do the same
>> thing, that's why
>> i think there is a bug in MPI_Dist_graph_create
>>
>> Cheers,
>>
>> Gilles
>>
>>
>>
>>
>> On 2014/11/21 2:21, Howard Pritchard wrote:
>>
>> Hi Ghislain,
>>
>> I tried to run your test with mvapich 1.9 and get a "message truncated"
>> failure at three ranks.
>>
>> Howard
>>
>>
>> 2014-11-20 8:51 GMT-07:00 Ghislain Viguier  
>> :
>>
>>
>>  Dear support,
>>
>> I'm encountering an issue with the MPI_Neighbor_alltoallw request of
>> mpi-1.8.3.
>> I have enclosed a test case with information of my workstation.
>>
>> In this test, I define a weighted topology for 5 processes, where the
>> weight represent the number of buffers to send/receive :
>> rank
>>   0 : | x |
>>   1 : | 2 | x |
>>   2 : | 1 | 1 | x |
>>   3 : | 3 | 2 | 3 | x |
>>   4 : | 5 | 2 | 2 | 2 | x |
>>
>> In this topology, the rank 1 will send/receive :
>>2 buffers to/from the rank 0,
>>1 buffer to/from the rank 2,
>>2 buffers to/from the rank 3,
>>2 buffers to/from the rank 4,
>>
>> The send buffer are defined with the MPI_Type_create_hindexed_block. This
>> allows to use a same buffer for several communications without duplicating
>> it (read only).
>> Here the rank 1 will have 2 send buffers (the max of 2, 1, 2, 2).
>> The receiver buffer is a contiguous buffer defined with
>> MPI_Type_contiguous request.
>> Here, the receiver buffer of the rank 1 is of size : 7 (2+1+2+2)
>>
>> This test case succesful for 2 or 3 processes. For 4 processes, the test
>> fails 1 times for 3 successes. For 5 processes, the test fails all the time.
>>
>> The error code is : *** MPI_ERR_IN_STATUS: error code in status
>>
>> I don't understand what I am doing wrong.
>>
>> Could you please have a look on it?
>>
>> Thank you very much.
>>
>> Best regards,
>> Ghislain Viguier
>>
>> --
>> Ghislain Viguier
>> Tél. 06 31 95 03 17
>>
>> ___
>> users mailing listus...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this 
>> post:http://www.open-mpi.org/community/lists/users/2014/11/25850.php
>>
>>
>>
>> ___
>> users mailing listus...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25852.php
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/11/25853.php
>>
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25855.php



Re: [OMPI users] MPI_Neighbor_alltoallw fails with mpi-1.8.3

2014-11-25 Thread Gilles Gouaillardet
George,

imho, you are right !

here is attached a new version of Ghislain's program that uses
MPI_Dist_graph_neighbors_count and MPI_Dist_graph_neighbors
as you suggested.

it produces correct results

/* note that in this case, realDestinations is similar to targets,
so i might have left some silent bugs in the program */
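
for reference, the query pattern looks like this (a minimal sketch, where
commGraph is the communicator returned by MPI_Dist_graph_create and error
checks are omitted) :

#include <stdlib.h>
#include <mpi.h>

static void query_neighbors(MPI_Comm commGraph)
{
    int indegree, outdegree, weighted;
    /* how many neighbors the library actually stored for this rank */
    MPI_Dist_graph_neighbors_count(commGraph, &indegree, &outdegree, &weighted);
    int *sources      = malloc(indegree  * sizeof(int));
    int *srcweights   = malloc(indegree  * sizeof(int));
    int *destinations = malloc(outdegree * sizeof(int));
    int *dstweights   = malloc(outdegree * sizeof(int));
    /* retrieve the neighbors in the order the topology really uses */
    MPI_Dist_graph_neighbors(commGraph, indegree, sources, srcweights,
                             outdegree, destinations, dstweights);
    /* counts, displacements and datatypes passed to MPI_Neighbor_alltoallw
       must follow this order, not the order the edges were passed to
       MPI_Dist_graph_create */
    free(sources); free(srcweights); free(destinations); free(dstweights);
}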

Bottom line: though the Open MPI implementation of MPI_Dist_graph_create is not
deterministic, it is compliant with the MPI standard.
/* not to mention this is not the right place to argue what the standard
could or should have been ... */

Cheers,

Gilles


On 2014/11/24 12:47, George Bosilca wrote:
> I would argue this is a typical user level bug.
>
> The major difference between the dist_create and dist_create_adjacent is
> that in the later each process provides its neighbors in an order that is
> expected (and that match the info provided to the MPI_Neighbor_alltoallw
> call. When the topology is created with dist_create, every process will
> end-up having the correct partial topology, but in an order that doesn't
> match what the user expected (not in the rank-order of the neighbors).
> However, I can't find anything in the standard that would require from the
> MPI library to sort the neighbors. I would assume is the user
> responsibility, to make sure that they are using the topology in the right
> order, where the right order is what the communicator really contains and
> not what the user expect based on prior knowledge.
>
>   George.
>
>
> On Fri, Nov 21, 2014 at 3:48 AM, Gilles Gouaillardet <
> gilles.gouaillar...@iferc.org> wrote:
>
>>  Ghislain,
>>
>> i can confirm there is a bug in mca_topo_base_dist_graph_distribute
>>
>> FYI a proof of concept is available at
>> https://github.com/open-mpi/ompi/pull/283
>> and i recommend you use MPI_Dist_graph_create_adjacent if this meets your
>> needs.
>>
>> as a side note, the right way to set the info is
>> MPI_Info info = MPI_INFO_NULL;
>>
>> /* mpich is more picky and crashes with info = NULL */
>>
>> Cheers,
>>
>> Gilles
>>
>>
>> On 2014/11/21 18:21, Ghislain Viguier wrote:
>>
>> Hi Gilles and Howard,
>>
>> The use of MPI_Dist_graph_create_adjacent solves the issue :)
>>
>> Thanks for your help!
>>
>> Best reagrds,
>> Ghislain
>>
>> 2014-11-21 7:23 GMT+01:00 Gilles Gouaillardet >
>>  :
>>
>>Hi Ghislain,
>>
>> that sound like a but in MPI_Dist_graph_create :-(
>>
>> you can use MPI_Dist_graph_create_adjacent instead :
>>
>> MPI_Dist_graph_create_
>> adjacent(MPI_COMM_WORLD, degrees, &targets[0],
>> &weights[0],
>>     degrees, &targets[0], &weights[0], info,
>> rankReordering, &commGraph);
>>
>> it does not crash and as far as i understand, it produces correct results,
>>
>> according the the mpi standard (example 7.3) that should do the same
>> thing, that's why
>> i think there is a bug in MPI_Dist_graph_create
>>
>> Cheers,
>>
>> Gilles
>>
>>
>>
>>
>> On 2014/11/21 2:21, Howard Pritchard wrote:
>>
>> Hi Ghislain,
>>
>> I tried to run your test with mvapich 1.9 and get a "message truncated"
>> failure at three ranks.
>>
>> Howard
>>
>>
>> 2014-11-20 8:51 GMT-07:00 Ghislain Viguier  
>>   
>> :
>>
>>
>>  Dear support,
>>
>> I'm encountering an issue with the MPI_Neighbor_alltoallw request of
>> mpi-1.8.3.
>> I have enclosed a test case with information of my workstation.
>>
>> In this test, I define a weighted topology for 5 processes, where the
>> weight represent the number of buffers to send/receive :
>> rank
>>   0 : | x |
>>   1 : | 2 | x |
>>   2 : | 1 | 1 | x |
>>   3 : | 3 | 2 | 3 | x |
>>   4 : | 5 | 2 | 2 | 2 | x |
>>
>> In this topology, the rank 1 will send/receive :
>>2 buffers to/from the rank 0,
>>1 buffer to/from the rank 2,
>>2 buffers to/from the rank 3,
>>2 buffers to/from the rank 4,
>>
>> The send buffer are defined with the MPI_Type_create_hindexed_block. This
>> allows to use a same buffer for several communications without duplicating
>> it (read only).
>> Here the rank 1 will have 2 send buffers (the max of 2, 1, 2, 2).
>> The receiver buffer is a contiguous buffer defined with
>> MPI_Type_contiguous request.
>> Here, the receiver buffer of the rank 1 is of size : 7 (2+1+2+2)
>>
>> This test case succe

Re: [OMPI users] mpi_wtime implementation

2014-11-27 Thread Gilles Gouaillardet
Folks,

one drawback of retrieving time with rdtsc is that this value is core
specific :
if a task is not bound to a core, then the value returned by MPI_Wtime()
might go backward.

if i run the following program with
taskset -c 1 ./time

and then move it across cores
(taskset -cp 0  ; taskset -cp 2 ; ...)
then the program can abort. in my environment, i can measure up to 150ms
difference.

/* some mtt tests will abort if this condition is met */


i was unable to observe this behavior with gettimeofday()

/* though it could occur when ntpd synchronizes the clock */

is there any plan to make the timer function selectable via a mca param ?
or to automatically fall back to gettimeofday if a task is not bound to a
core ?
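
/* in the meantime, a user level workaround is to bind each task to a core,
   e.g. (1.8 syntax)
   mpirun --bind-to core -np 2 ./time
   so that rdtsc is always read on the same core */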

Cheers,

Gilles

$ cat time.c
#include <mpi.h>
#include <stdio.h>

int main (int argc, char *argv[]) {
int i;
double t = 0;
MPI_Init(&argc, &argv);
for (;;) {
double _t = MPI_Wtime();
if (_t < t) {
fprintf(stderr, "going back in time %lf < %lf\n", _t, t);
MPI_Abort(MPI_COMM_WORLD, 1);
}
t = _t;
}
MPI_Finalize();
return 0;
}

On 2014/11/25 1:59, Dave Goodell (dgoodell) wrote:
> On Nov 24, 2014, at 12:06 AM, George Bosilca  wrote:
>
>> https://github.com/open-mpi/ompi/pull/285 is a potential answer. I would 
>> like to hear Dave Goodell comment on this before pushing it upstream.
>>
>>   George.
> I'll take a look at it today.  My notification settings were messed up when 
> you originally CCed me on the PR, so I didn't see this until now.
>
> -Dave
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25863.php



Re: [OMPI users] "default-only MCA variable"?

2014-11-27 Thread Gilles Gouaillardet
It could be that configure did not find the knem headers, in which case knem is
not supported and this mca parameter is read-only

My 0.2 us$ ...
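
A quick way to test that guess (assuming the ompi_info matching this install is first in your PATH, and that the build tree with config.log is still around for the second command):

ompi_info --param btl sm --level 9 | grep knem
grep -i knem config.log

If configure never found the knem headers, the parameter is registered read-only and overriding it triggers exactly this warning.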

Dave Love wrote:
>Why can't I set parameters like this (not the only one) with 1.8.3?
>
>  WARNING: A user-supplied value attempted to override the default-only MCA
>  variable named "btl_sm_use_knem".
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25882.php


Re: [OMPI users] Warning about not enough registerable memory on SL6.6

2014-12-08 Thread Gilles Gouaillardet
Folks,

FWIW, i observe a similar behaviour on my system.

imho, the root cause is that OFED has been upgraded from a (quite) old
version to the latest 3.12 version

here is the relevant part of the code (btl_openib.c from master) :


static uint64_t calculate_max_reg (void)
{
    if (0 == stat("/sys/module/mlx4_core/parameters/log_num_mtt", &statinfo)) {
    } else if (0 == stat("/sys/module/ib_mthca/parameters/num_mtt", &statinfo)) {
        mtts_per_seg = 1 << read_module_param("/sys/module/ib_mthca/parameters/log_mtts_per_seg", 1);
        num_mtt = read_module_param("/sys/module/ib_mthca/parameters/num_mtt", 1 << 20);
        reserved_mtt = read_module_param("/sys/module/ib_mthca/parameters/fmr_reserved_mtts", 0);

        max_reg = (num_mtt - reserved_mtt) * opal_getpagesize () * mtts_per_seg;
    } else if ((0 == stat("/sys/module/mlx5_core", &statinfo)) ||
               (0 == stat("/sys/module/mlx4_core/parameters", &statinfo)) ||
               (0 == stat("/sys/module/ib_mthca/parameters", &statinfo))) {
        /* mlx5 means that we have ofed 2.0 and it can always register
           2xmem_total for any mlx hca */
        max_reg = 2 * mem_total;
    } else {
    }

    /* Print a warning if we can't register more than 75% of physical
       memory.  Abort if the abort_not_enough_reg_mem MCA param was
       set. */
    if (max_reg < mem_total * 3 / 4) {
    }
    return (max_reg * 7) >> 3;
}

with OFED 3.12, the /sys/module/mlx4_core/parameters/log_num_mtt pseudo
file does *not* exist any more
/sys/module/ib_mthca/parameters/num_mtt exists so the second path is taken
and mtts_per_seg is read from
/sys/module/ib_mthca/parameters/log_mtts_per_seg

i noted that log_mtts_per_seg is also a parameter of mlx4_core :
/sys/module/mlx4_core/parameters/log_mtts_per_seg

the value is 3 in ib_mthca (and leads to a warning) but 5 in mlx4_core
(big enough, and does not lead to a warning if this value is read)
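
To see which of these values a given node actually exposes (the paths vary with the OFED version and which driver modules are loaded), a quick check is:

cat /sys/module/mlx4_core/parameters/log_mtts_per_seg 2>/dev/null
cat /sys/module/ib_mthca/parameters/log_mtts_per_seg 2>/dev/null
cat /sys/module/ib_mthca/parameters/num_mtt 2>/dev/null
cat /sys/module/ib_mthca/parameters/fmr_reserved_mtts 2>/dev/null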


i had no time to read the latest ofed doc, so i cannot answer :
- should log_mtts_per_seg be read from mlx4_core instead ?
- is the warning a false positive ?


my only point is this warning *might* be a false positive, and the root
cause *might* be that the calculate_max_reg logic is wrong with the
latest OFED stack.

Could the Mellanox folks comment on this ?

Cheers,

Gilles




On 2014/12/09 3:18, Götz Waschk wrote:
> Hi,
>
> here's another test with openmpi 1.8.3. With 1.8.1, 32GB was detected, now
> it is just 16:
> % mpirun -np 2 /usr/lib64/openmpi-intel/bin/mpitests-osu_get_bw
> --
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory.  This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered.  You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
>   Local host:  pax95
>   Registerable memory: 16384 MiB
>   Total memory:49106 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --
> # OSU MPI_Get Bandwidth Test v4.3
> # Window creation: MPI_Win_allocate
> # Synchronization: MPI_Win_flush
> # Size  Bandwidth (MB/s)
> 1  28.56
> 2  58.74
>
>
> So it wasn't fixed for RHEL 6.6.
>
> Regards, Götz
>
> On Mon, Dec 8, 2014 at 4:00 PM, Götz Waschk  wrote:
>
>> Hi,
>>
>> I had tested 1.8.4rc1 and it wasn't fixed. I can try again though,
>> maybe I had made an error.
>>
>> Regards, Götz Waschk
>>
>> On Mon, Dec 8, 2014 at 3:17 PM, Joshua Ladd  wrote:
>>> Hi,
>>>
>>> This should be fixed in OMPI 1.8.3. Is it possible for you to give 1.8.3
>> a
>>> shot?
>>>
>>> Best,
>>>
>>> Josh
>>>
>>> On Mon, Dec 8, 2014 at 8:43 AM, Götz Waschk 
>> wrote:
>>>> Dear Open-MPI experts,
>>>>
>>>> I have updated my little cluster from Scientific Linux 6.5 to 6.6,
>>>> this included extensive changes in the Infiniband drivers and a newer
>>>> openmpi version (1.8.1). Now I'm getting this message on

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Gilles Gouaillardet
Luca,

your email mentions openmpi 1.6.5
but gdb output points to openmpi 1.8.1.

could the root cause be a mix of versions that does not occur with root
account ?

which openmpi version are you expecting ?

you can run
pmap <pid>
when your binary is running and/or under gdb to confirm the openmpi library
that is really used

Cheers,

Gilles

On Wed, Dec 10, 2014 at 7:21 PM, Luca Fini  wrote:

> I've a problem running a well tested MPI based application.
>
> The program has been used for years with no problems. Suddenly the
> executable which was run many times with no problems crashed with
> SIGSEGV. The very same executable if run with root privileges works
> OK. The same happens with other executables and across various
> recompilation attempts.
>
> We could not find any relevant difference in the O.S. since a few days
> ago when the program worked also under unprivileged user ID. Actually
> about in the same span of time we changed the GID of the user
> experiencing the fault, but we think this is not relevant because the
> same SIGSEGV happens to another user which was not modified. Moreover
> we cannot see how that change can affect the running executabe (we
> checked all file permissions in the directory tree where the program
> is used).
>
> Running the program under GDB we get the trace reported below. The
> segfault happens at the very beginning during MPI initialization.
>
> We can use the program with sudo, but I'd like to find out what
> happened to go back to "normal" usage.
>
> I'd appreciate any hint on the issue.
>
> Many thanks,
>
>Luca Fini
>
> ==
> Here follows a few environment details:
>
> Program started with: mpirun -debug -debugger gdb  -np 1
>
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
>
> OPEN-MPI 1.6.5
>
> Linux 2.6.32-431.29.2.2.6.32-431.29.2.el6.x86_64
>
> Intel fortran Compiler: 2011.7.256
>
> =
> Here follows the stack trace:
>
> Starting program:
>
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
>
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
> [Thread debugging using libthread_db enabled]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x2af652c7 in mca_base_component_find (directory=0x0,
> type=0x3b914a7fb5 "rte", static_components=0x3b916cb040,
> requested_component_names=0x0, include_mode=128, found_components=0x1,
> open_dso_components=16)
> at mca_base_component_find.c:162
> 162OBJ_CONSTRUCT(found_components, opal_list_t);
> Missing separate debuginfos, use: debuginfo-install
> glibc-2.12-1.149.el6.x86_64 libgcc-4.4.7-11.el6.x86_64
> libgfortran-4.4.7-11.el6.x86_64 libtool-ltdl-2.2.6-15.5.el6.x86_64
> openmpi-1.8.1-1.el6.x86_64
> (gdb) where
> #0  0x2af652c7 in mca_base_component_find (directory=0x0,
> type=0x3b914a7fb5 "rte", static_components=0x3b916cb040,
> requested_component_names=0x0, include_mode=128, found_components=0x1,
> open_dso_components=16)
> at mca_base_component_find.c:162
> #1  0x003b90c4870a in mca_base_framework_components_register ()
> from /usr/lib64/openmpi/lib/libopen-pal.so.6
> #2  0x003b90c48c06 in mca_base_framework_register () from
> /usr/lib64/openmpi/lib/libopen-pal.so.6
> #3  0x003b90c48def in mca_base_framework_open () from
> /usr/lib64/openmpi/lib/libopen-pal.so.6
> #4  0x003b914407e7 in ompi_mpi_init () from
> /usr/lib64/openmpi/lib/libmpi.so.1
> #5  0x003b91463200 in PMPI_Init () from
> /usr/lib64/openmpi/lib/libmpi.so.1
> #6  0x2acd9295 in mpi_init_f (ierr=0x7fffd268) at pinit_f.c:75
> #7  0x005bb159 in MODE_MNH_WORLD::init_nmnh_comm_world
> (kinfo_ll=Cannot access memory at address 0x0
> ) at
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/MASTER/spll_mode_mnh_world.f90:45
> #8  0x005939d3 in MODE_IO_LL::initio_ll () at
>
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/MASTER/spll_mode_io_ll.f90:107
> #9  0x0049d02f in prep_pgd () at
>
> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/MASTER/spll_prep_pgd.f90:130
> #10 0x0049cf8c in main ()
>
> --
> Luca Fini.  INAF - Oss. Astrofisico di Arcetri
> L.go E.Fermi, 5. 50125 Firenze. Italy
> Tel: +39 055 2752 307 Fax: +39 055 2752 292
> Skype: l.fini
> Web: http://www.arcetri.inaf.it/~lfini
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/12/25945.php
>


Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-11 Thread Gilles Gouaillardet
Luca,

you might want to double check the environment :
env | grep ^OMPI
and the per user config
ls $HOME/.openmpi

Cheers,

Gilles

On 2014/12/11 17:40, Luca Fini wrote:
> Many thanks for the replies.
>
> The mismatch in OpeMPI version is my fault: while writing the request
> for help I looked at the name of the directory where OpenMPI was built
> (I did not build it myself) and did not notice that the name of the
> directory did not reflect the version actually compiled.
>
> I had already checked the ulimits defined for the account where the
> SIGSEGV happens and they seems OK.
>
> Moreover I have a further result: I created a brand new account with
> default privileges and tried to run the program under that one, and it
> works!
>
> I'm still trying to spot out the differences between the two
> unprivileged accounts.
>
> Cheers,
>l.
>
> On Wed, Dec 10, 2014 at 6:12 PM, Gus Correa  wrote:
>> Hi Luca
>>
>> Another possibility that comes to mind,
>> besides mixed versions mentioned by Gilles,
>> is the OS limits.
>> Limits may vary according to the user and user privileges.
>>
>> Large programs tend to require big stacksize (even unlimited),
>> and typically segfault when the stack is not large enough.
>> Max number of open files is yet another hurdle.
>> And if you're using Infinband, the max locked memory size should be
>> unlimited.
>> Check /etc/security/limits.conf and "ulimit -a".
>>
>> I hope this helps,
>> Gus Correa
>>
>> On 12/10/2014 08:28 AM, Gilles Gouaillardet wrote:
>>> Luca,
>>>
>>> your email mentions openmpi 1.6.5
>>> but gdb output points to openmpi 1.8.1.
>>>
>>> could the root cause be a mix of versions that does not occur with root
>>> account ?
>>>
>>> which openmpi version are you expecting ?
>>>
>>> you can run
>>> pmap <pid>
>>> when your binary is running and/or under gdb to confirm the openmpi
>>> library that is really used
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On Wed, Dec 10, 2014 at 7:21 PM, Luca Fini wrote:
>>>
>>> I've a problem running a well tested MPI based application.
>>>
>>> The program has been used for years with no problems. Suddenly the
>>> executable which was run many times with no problems crashed with
>>> SIGSEGV. The very same executable if run with root privileges works
>>> OK. The same happens with other executables and across various
>>> recompilation attempts.
>>>
>>> We could not find any relevant difference in the O.S. since a few days
>>> ago when the program worked also under unprivileged user ID. Actually
>>> about in the same span of time we changed the GID of the user
>>> experiencing the fault, but we think this is not relevant because the
>>> same SIGSEGV happens to another user which was not modified. Moreover
>>> we cannot see how that change can affect the running executabe (we
>>> checked all file permissions in the directory tree where the program
>>> is used).
>>>
>>> Running the program under GDB we get the trace reported below. The
>>> segfault happens at the very beginning during MPI initialization.
>>>
>>> We can use the program with sudo, but I'd like to find out what
>>> happened to go back to "normal" usage.
>>>
>>> I'd appreciate any hint on the issue.
>>>
>>> Many thanks,
>>>
>>> Luca Fini
>>>
>>> ==
>>> Here follows a few environment details:
>>>
>>> Program started with: mpirun -debug -debugger gdb  -np 1
>>>
>>> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
>>>
>>> OPEN-MPI 1.6.5
>>>
>>> Linux 2.6.32-431.29.2.2.6.32-431.29.2.el6.x86_64
>>>
>>> Intel fortran Compiler: 2011.7.256
>>>
>>> =
>>> Here follows the stack trace:
>>>
>>> Starting program:
>>>
>>> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_PGD
>>>
>>> /home/lascaux/MNH-V5-1-2/src/dir_obj-LXifortI4-MNH-V5-1-2-OMPI12X-O2/M51b2_OT_2POINT_RH_v1_mod/PREP_

Re: [OMPI users] MPI inside MPI (still)

2014-12-11 Thread Gilles Gouaillardet
Alex,

can you try something like
call system("sh -c 'env -i /.../mpirun -np 2 /.../app_name'")

-i starts with an empty environment
that being said, you might need to set a few environment variables
manually :
env -i PATH=/bin ...

and that being also said, this "trick" could be just a bad idea :
you might be using a scheduler, and if you empty the environment, the
scheduler
will not be aware of the "inside" run.

on top of that, invoking system might fail depending on the interconnect
you use.

Bottom line, i believe Ralph's reply is still valid, even if five years
have passed :
changing your workflow, or using MPI_Comm_spawn is a much better approach.

Cheers,

Gilles

On 2014/12/12 11:22, Alex A. Schmidt wrote:
> Dear OpenMPI users,
>
> Regarding to this previous post
> <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> from 2009,
> I wonder if the reply
> from Ralph Castain is still valid. My need is similar but quite simpler:
> to make a system call from an openmpi fortran application to run a
> third party openmpi application. I don't need to exchange mpi messages
> with the application. I just need to read the resulting output file
> generated
> by it. I have tried to do the following system call from my fortran openmpi
> code:
>
> call system("sh -c 'mpirun -n 2 app_name")
>
> but I get
>
> **
>
> Open MPI does not support recursive calls of mpirun
>
> **
>
> Is there a way to make this work?
>
> Best regards,
>
> Alex
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/25966.php



Re: [OMPI users] MPI inside MPI (still)

2014-12-11 Thread Gilles Gouaillardet
Alex,

just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs
parameter :

   int MPI_Comm_spawn(char *command, char *argv[], int maxprocs, MPI_Info info,
                      int root, MPI_Comm comm, MPI_Comm *intercomm,
                      int array_of_errcodes[])

INPUT PARAMETERS
   maxprocs
  - maximum number of processes to start (integer,
significant only at root)

Cheers,

Gilles
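
For reference, a minimal C sketch of such a call; "hello_world" is just the toy program name used further down in this thread, and errcodes is sized to match maxprocs:

#include <mpi.h>

int main(int argc, char *argv[]) {
    MPI_Comm intercomm;
    int errcodes[5];

    MPI_Init(&argc, &argv);
    /* ask for 5 copies of "hello_world"; command and maxprocs are only
       significant at the root (rank 0 here), but the call is collective
       over MPI_COMM_WORLD */
    MPI_Comm_spawn("hello_world", MPI_ARGV_NULL, 5, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, errcodes);
    MPI_Finalize();
    return 0;
}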

On 2014/12/12 12:23, Alex A. Schmidt wrote:
> Hello Gilles,
>
> Thanks for your reply. The "env -i PATH=..." stuff seems to work!!!
>
> call system("sh -c 'env -i PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2
> hello_world' ")
>
> did produce the expected result with a simple openmi "hello_world" code I
> wrote.
>
> I might be harder though with the real third party app I have in mind. And
> I realize
> getting passed over a job scheduler with this approach might not work at
> all...
>
> I have looked at the MPI_Comm_spawn call but I failed to understand how it
> could help here. For instance, can I use it to launch an mpi app with the
> option "-n 5" ?
>
> Alex
>
> 2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet > :
>>
>>  Alex,
>>
>> can you try something like
>> call system(sh -c 'env -i /.../mpirun -np 2 /.../app_name')
>>
>> -i start with an empty environment
>> that being said, you might need to set a few environment variables
>> manually :
>> env -i PATH=/bin ...
>>
>> and that being also said, this "trick" could be just a bad idea :
>> you might be using a scheduler, and if you empty the environment, the
>> scheduler
>> will not be aware of the "inside" run.
>>
>> on top of that, invoking system might fail depending on the interconnect
>> you use.
>>
>> Bottom line, i believe Ralph's reply is still valid, even if five years
>> have passed :
>> changing your workflow, or using MPI_Comm_spawn is a much better approach.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2014/12/12 11:22, Alex A. Schmidt wrote:
>>
>> Dear OpenMPI users,
>>
>> Regarding to this previous 
>> post<http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
>> <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> from 2009,
>> I wonder if the reply
>> from Ralph Castain is still valid. My need is similar but quite simpler:
>> to make a system call from an openmpi fortran application to run a
>> third party openmpi application. I don't need to exchange mpi messages
>> with the application. I just need to read the resulting output file
>> generated
>> by it. I have tried to do the following system call from my fortran openmpi
>> code:
>>
>> call system("sh -c 'mpirun -n 2 app_name")
>>
>> but I get
>>
>> **
>>
>> Open MPI does not support recursive calls of mpirun
>>
>> **
>>
>> Is there a way to make this work?
>>
>> Best regards,
>>
>> Alex
>>
>>
>>
>>
>> ___
>> users mailing listus...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/12/25966.php
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/12/25967.php
>>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/25968.php



Re: [OMPI users] MPI inside MPI (still)

2014-12-11 Thread Gilles Gouaillardet
Alex,

just to make sure ...
this is the behavior you expected, right ?

Cheers,

Gilles

On 2014/12/12 13:27, Alex A. Schmidt wrote:
> Gilles,
>
> Ok, very nice!
>
> When I excute
>
> do rank=1,3
> call  MPI_Comm_spawn('hello_world','
> ',5,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status)
> enddo
>
> I do get 15 instances of the 'hello_world' app running: 5 for each parent
> rank 1, 2 and 3.
>
> Thanks a lot, Gilles.
>
> Best regargs,
>
> Alex
>
>
>
>
> 2014-12-12 1:32 GMT-02:00 Gilles Gouaillardet > :
>>
>>  Alex,
>>
>> just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs
>> parameter :
>>
>>int MPI_Comm_spawn(char *command, char *argv[], int maxprocs,
>> MPI_Info info,
>>  int root, MPI_Comm comm, MPI_Comm *intercomm,
>>  int array_of_errcodes[])
>>
>> INPUT PARAMETERS
>>maxprocs
>>   - maximum number of processes to start (integer, significant
>> only at root)
>>
>> Cheers,
>>
>> Gilles
>>
>>
>> On 2014/12/12 12:23, Alex A. Schmidt wrote:
>>
>> Hello Gilles,
>>
>> Thanks for your reply. The "env -i PATH=..." stuff seems to work!!!
>>
>> call system("sh -c 'env -i PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2
>> hello_world' ")
>>
>> did produce the expected result with a simple openmi "hello_world" code I
>> wrote.
>>
>> I might be harder though with the real third party app I have in mind. And
>> I realize
>> getting passed over a job scheduler with this approach might not work at
>> all...
>>
>> I have looked at the MPI_Comm_spawn call but I failed to understand how it
>> could help here. For instance, can I use it to launch an mpi app with the
>> option "-n 5" ?
>>
>> Alex
>>
>> 2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet >
>>  :
>>
>>  Alex,
>>
>> can you try something like
>> call system(sh -c 'env -i /.../mpirun -np 2 /.../app_name')
>>
>> -i start with an empty environment
>> that being said, you might need to set a few environment variables
>> manually :
>> env -i PATH=/bin ...
>>
>> and that being also said, this "trick" could be just a bad idea :
>> you might be using a scheduler, and if you empty the environment, the
>> scheduler
>> will not be aware of the "inside" run.
>>
>> on top of that, invoking system might fail depending on the interconnect
>> you use.
>>
>> Bottom line, i believe Ralph's reply is still valid, even if five years
>> have passed :
>> changing your workflow, or using MPI_Comm_spawn is a much better approach.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2014/12/12 11:22, Alex A. Schmidt wrote:
>>
>> Dear OpenMPI users,
>>
>> Regarding to this previous 
>> post<http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
>> <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
>> <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
>> <http://www.open-mpi.org/community/lists/users/2009/06/9560.php> from 2009,
>> I wonder if the reply
>> from Ralph Castain is still valid. My need is similar but quite simpler:
>> to make a system call from an openmpi fortran application to run a
>> third party openmpi application. I don't need to exchange mpi messages
>> with the application. I just need to read the resulting output file
>> generated
>> by it. I have tried to do the following system call from my fortran openmpi
>> code:
>>
>> call system("sh -c 'mpirun -n 2 app_name")
>>
>> but I get
>>
>> **
>>
>> Open MPI does not support recursive calls of mpirun
>>
>> **
>>
>> Is there a way to make this work?
>>
>> Best regards,
>>
>> Alex
>>
>>
>>
>>
>> ___
>> users mailing listus...@open-mpi.org
>>
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/12/25966.php
>>
>>
>>
>> ___
>> users mailing listus...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this 
>> post:http://www.open-mpi.org/community/lists/users/2014/12/25967.php
>>
>>
>>
>> ___
>> users mailing listus...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/12/25968.php
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/12/25969.php
>>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/25970.php



Re: [OMPI users] OMPI users] MPI inside MPI (still)

2014-12-12 Thread Gilles Gouaillardet
Alex,

You need MPI_Comm_disconnect at least.
I am not sure if this is 100% correct or working.

If you are using third party apps, why don't you do something like
system("env -i qsub ...")
with the right options to make qsub blocking, or manually wait for the end
of the job ?

That looks like a much cleaner and simpler approach to me.

Cheers,

Gilles
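
A minimal C sketch of that disconnect pattern (as George points out later in this thread, it has to be called on both sides of the intercommunicator); it only helps if the spawned program can be modified, and "child" is a placeholder program name:

#include <mpi.h>

/* parent side: after spawning, disconnect the intercommunicator */
void parent_side(void)
{
    MPI_Comm intercomm;
    MPI_Comm_spawn("child", MPI_ARGV_NULL, 2, MPI_INFO_NULL,
                   0, MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);
    /* ... work ... */
    MPI_Comm_disconnect(&intercomm);
}

/* child side: retrieve the parent intercommunicator and disconnect it too */
void child_side(void)
{
    MPI_Comm parent;
    MPI_Comm_get_parent(&parent);
    if (parent != MPI_COMM_NULL)
        MPI_Comm_disconnect(&parent);
}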

"Alex A. Schmidt"  wrote:
>Hello Gilles,
>
>Ok, I believe I have a simple toy app running as I think it should:
>'n' parent processes running under mpi_comm_world, each one
>
>spawning its own 'm' child processes (each child group work 
>together nicely, returning the expected result for an mpi_allreduce call).
>
>Now, as I mentioned before, the apps I want to run in the spawned 
>
>processes are third party mpi apps and I don't think it will be possible 
>to exchange messages with them from my app. So, I do I tell 
>when the spawned processes have finnished running? All I have to work
>
>with is the intercommunicator returned from the mpi_comm_spawn call...
>
>
>Alex
>
>
>
>
>
>2014-12-12 2:42 GMT-02:00 Alex A. Schmidt :
>
>Gilles,
>
>Well, yes, I guess
>
>I'll do tests with the real third party apps and let you know.
>
>These are huge quantum chemistry codes (dftb+, siesta and Gaussian)
>
>which greatly benefits from a parallel environment. My code is just
>a front end to use those, but since we have a lot of data to process
>
>it also benefits from a parallel environment. 
>
>
>Alex
>
> 
>
>
>2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet :
>
>Alex,
>
>just to make sure ...
>this is the behavior you expected, right ?
>
>Cheers,
>
>Gilles
>
>
>
>On 2014/12/12 13:27, Alex A. Schmidt wrote:
>
>Gilles, Ok, very nice! When I excute do rank=1,3 call 
>MPI_Comm_spawn('hello_world',' 
>',5,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status) 
>enddo I do get 15 instances of the 'hello_world' app running: 5 for each 
>parent rank 1, 2 and 3. Thanks a lot, Gilles. Best regargs, Alex 2014-12-12 
>1:32 GMT-02:00 Gilles Gouaillardet 
>: Alex, just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs 
>parameter : int MPI_Comm_spawn(char *command, char *argv[], int maxprocs, 
>MPI_Info info, int root, MPI_Comm comm, MPI_Comm *intercomm, int 
>array_of_errcodes[]) INPUT PARAMETERS maxprocs - maximum number of processes 
>to start (integer, significant only at root) Cheers, Gilles On 2014/12/12 
>12:23, Alex A. Schmidt wrote: Hello Gilles, Thanks for your reply. The "env -i 
>PATH=..." stuff seems to work!!! call system("sh -c 'env -i 
>PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2 hello_world' ") did produce the 
>expected result with a simple openmi "hello_world" code I wrote. I might be 
>harder though with the real third party app I have in mind. And I realize 
>getting passed over a job scheduler with this approach might not work at 
>all... I have looked at the MPI_Comm_spawn call but I failed to understand how 
>it could help here. For instance, can I use it to launch an mpi app with the 
>option "-n 5" ? Alex 2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet 
>
>: Alex, can you try something like call system(sh -c 'env -i /.../mpirun -np 2 
>/.../app_name') -i start with an empty environment that being said, you might 
>need to set a few environment variables manually : env -i PATH=/bin ... and 
>that being also said, this "trick" could be just a bad idea : you might be 
>using a scheduler, and if you empty the environment, the scheduler will not be 
>aware of the "inside" run. on top of that, invoking system might fail 
>depending on the interconnect you use. Bottom line, i believe Ralph's reply is 
>still valid, even if five years have passed : changing your workflow, or using 
>MPI_Comm_spawn is a much better approach. Cheers, Gilles On 2014/12/12 11:22, 
>Alex A. Schmidt wrote: Dear OpenMPI users, Regarding to this previous 
>post<http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
><http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
><http://www.open-mpi.org/community/lists/users/2009/06/9560.php> 
><http://www.open-mpi.org/community/lists/users/2009/06/9560.php> from 2009, I 
>wonder if the reply from Ralph Castain is still valid. My need is similar but 
>quite simpler: to make a system call from an openmpi fortran application to 
>run a third party openmpi application. I don't need to exchange mpi messages 
>with the application. I just need to read the resulting output file generated 
>by it. I have tried to do 

Re: [OMPI users] OMPI users] OMPI users] MPI inside MPI (still)

2014-12-13 Thread Gilles Gouaillardet
George is right about the semantics.

However i am surprised it returns immediately...
That should either work or hang imho

The second point is no longer mpi related, and is batch manager specific.

You will likely find a submit parameter to make the command block until the job 
completes. Or you can write your own wrapper.
Or you can retrieve the jobid and qstat periodically to get the job state.
If an api is available, this is also an option.

Cheers,

Gilles
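
Since "write your own wrapper" comes up here, this is a hedged C sketch of the submit-and-poll idea (error handling omitted; qsub/qstat output and exit codes differ between Torque and PBS flavours, so verify the behaviour on your system before relying on it):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* submit a job script with a clean environment, then poll qstat until the
   batch system no longer knows the job id */
void run_and_wait(const char *jobscript)
{
    char cmd[512], jobid[256];
    FILE *p;

    snprintf(cmd, sizeof(cmd), "env -i qsub %s", jobscript);
    p = popen(cmd, "r");                  /* qsub prints the job id on stdout */
    fgets(jobid, sizeof(jobid), p);
    pclose(p);
    jobid[strcspn(jobid, "\n")] = '\0';

    snprintf(cmd, sizeof(cmd), "qstat %s > /dev/null 2>&1", jobid);
    while (system(cmd) == 0)              /* 0 as long as the job is still known */
        sleep(30);
}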

George Bosilca  wrote:
>You have to call MPI_Comm_disconnect on both sides of the intercommunicator. 
>On the spawner processes you should call it on the intercom, while on the 
>spawnees you should call it on the MPI_Comm_get_parent.
>
>
>  George.
>
>
>On Dec 12, 2014, at 20:43 , Alex A. Schmidt  wrote:
>
>
>Gilles,
>
>MPI_comm_disconnect seem to work but not quite.
>
>The call to it returns almost immediatly while
>
>the spawn processes keep piling up in the background
>
>until they are all done...
>
>I think system('env -i qsub...') to launch the third party apps
>
>would take the execution of every call back to the scheduler 
>queue. How would I track each one for their completion?
>
>Alex
>
>
>2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet :
>
>Alex,
>
>You need MPI_Comm_disconnect at least.
>I am not sure if this is 100% correct nor working.
>
>If you are using third party apps, why dont you do something like
>system("env -i qsub ...")
>with the right options to make qsub blocking or you manually wait for the end 
>of the job ?
>
>That looks like a much cleaner and simpler approach to me.
>
>Cheers,
>
>Gilles
>
>"Alex A. Schmidt"  wrote:
>
>Hello Gilles,
>
>Ok, I believe I have a simple toy app running as I think it should:
>'n' parent processes running under mpi_comm_world, each one
>
>spawning its own 'm' child processes (each child group work 
>together nicely, returning the expected result for an mpi_allreduce call).
>
>Now, as I mentioned before, the apps I want to run in the spawned 
>
>processes are third party mpi apps and I don't think it will be possible 
>to exchange messages with them from my app. So, I do I tell 
>when the spawned processes have finnished running? All I have to work
>
>with is the intercommunicator returned from the mpi_comm_spawn call...
>
>
>Alex
>
>
>
>
>
>2014-12-12 2:42 GMT-02:00 Alex A. Schmidt :
>
>Gilles,
>
>Well, yes, I guess
>
>I'll do tests with the real third party apps and let you know.
>
>These are huge quantum chemistry codes (dftb+, siesta and Gaussian)
>
>which greatly benefits from a parallel environment. My code is just
>a front end to use those, but since we have a lot of data to process
>
>it also benefits from a parallel environment. 
>
>
>Alex
>
> 
>
>
>2014-12-12 2:30 GMT-02:00 Gilles Gouaillardet :
>
>Alex,
>
>just to make sure ...
>this is the behavior you expected, right ?
>
>Cheers,
>
>Gilles
>
>
>
>On 2014/12/12 13:27, Alex A. Schmidt wrote:
>
>Gilles, Ok, very nice! When I excute do rank=1,3 call 
>MPI_Comm_spawn('hello_world',' 
>',5,MPI_INFO_NULL,rank,MPI_COMM_WORLD,my_intercomm,MPI_ERRCODES_IGNORE,status) 
>enddo I do get 15 instances of the 'hello_world' app running: 5 for each 
>parent rank 1, 2 and 3. Thanks a lot, Gilles. Best regargs, Alex 2014-12-12 
>1:32 GMT-02:00 Gilles Gouaillardet 
>: Alex, just ask MPI_Comm_spawn to start (up to) 5 tasks via the maxprocs 
>parameter : int MPI_Comm_spawn(char *command, char *argv[], int maxprocs, 
>MPI_Info info, int root, MPI_Comm comm, MPI_Comm *intercomm, int 
>array_of_errcodes[]) INPUT PARAMETERS maxprocs - maximum number of processes 
>to start (integer, significant only at root) Cheers, Gilles On 2014/12/12 
>12:23, Alex A. Schmidt wrote: Hello Gilles, Thanks for your reply. The "env -i 
>PATH=..." stuff seems to work!!! call system("sh -c 'env -i 
>PATH=/usr/lib64/openmpi/bin:/bin mpirun -n 2 hello_world' ") did produce the 
>expected result with a simple openmi "hello_world" code I wrote. I might be 
>harder though with the real third party app I have in mind. And I realize 
>getting passed over a job scheduler with this approach might not work at 
>all... I have looked at the MPI_Comm_spawn call but I failed to understand how 
>it could help here. For instance, can I use it to launch an mpi app with the 
>option "-n 5" ? Alex 2014-12-12 0:36 GMT-02:00 Gilles Gouaillardet 
>
>: Alex, can you try something like call system(sh -c 'env -i /.../mpirun -np 2 
>/.../app_name') -i start with an empty

Re: [OMPI users] OMPI users] OMPI users] OMPI users] MPI inside MPI (still)

2014-12-13 Thread Gilles Gouaillardet
Alex,

Are you calling MPI_Comm_disconnect in the 3 "master" tasks and with the same 
remote communicator ?

I also read the man page again, and MPI_Comm_disconnect does not ensure the 
remote processes have finished or called MPI_Comm_disconnect, so that might not 
be the thing you need.
George, can you please comment on that ?

Cheers,

Gilles

George Bosilca  wrote:
>MPI_Comm_disconnect should be a local operation, there is no reason for it to 
>deadlock. I looked at the code and everything is local with the exception of a 
>call to PMIX.FENCE. Can you attach to your deadlocked processes and confirm 
>that they are stopped in the pmix.fence?
>
>
>  George.
>
>
>
>On Sat, Dec 13, 2014 at 8:47 AM, Alex A. Schmidt  wrote:
>
>Hi
>
>Sorry, I was calling mpi_comm_disconnect on the group comm handler, not
>on the intercomm handler returned from the spawn call as it should be.
>
>Well, calling the disconnect on the intercomm handler does halt the spwaner
>side but the wait is never completed since, as George points out, there is no
>disconnect call being made on the spawnee side and that brings me back
>to the beginning of the problem since, being a third party app, that call would
>never be there. I guess an mpi wrapper to deal with that could be made for
>the app, but I fell the wrapper itself, at the end, would face the same problem
>we face right now.
>
>My application is a genetic algorithm code that search optimal configuration
>(minimum or maximum energy) of cluster of atoms. The work flow bottleneck
>is the calculation of the cluster energy. For the cases which an analytical
>potential is available the calculation can be made internally and the workload
>is distributed among slaves nodes from a master node. This is also done
>when an analytical potential is not available and the energy calculation must
>be done externally by a quantum chemistry code like dftb+, siesta and Gaussian.
>So far, we have been running these codes in serial mode. No need to say that
>we could do a lot better if they could be executed in parallel.
>
>I am not familiar with DMRAA but it seems to be the right choice to deal with
>job schedulers as it covers the ones I am interested in (pbs/torque and 
>loadlever).
>
>Alex
>
>
>2014-12-13 7:49 GMT-02:00 Gilles Gouaillardet :
>
>George is right about the semantic
>
>However i am surprised it returns immediatly...
>That should either work or hang imho
>
>The second point is no more mpi related, and is batch manager specific.
>
>You will likely find a submit parameter to make the command block until the 
>job completes. Or you can write your own wrapper.
>Or you can retrieve the jobid and qstat periodically to get the job state.
>If an api is available, this is also an option.
>
>Cheers,
>
>Gilles
>
>George Bosilca  wrote:
>You have to call MPI_Comm_disconnect on both sides of the intercommunicator. 
>On the spawner processes you should call it on the intercom, while on the 
>spawnees you should call it on the MPI_Comm_get_parent.
>
>
>  George.
>
>
>On Dec 12, 2014, at 20:43 , Alex A. Schmidt  wrote:
>
>
>Gilles,
>
>MPI_comm_disconnect seem to work but not quite.
>
>The call to it returns almost immediatly while
>
>the spawn processes keep piling up in the background
>
>until they are all done...
>
>I think system('env -i qsub...') to launch the third party apps
>
>would take the execution of every call back to the scheduler 
>queue. How would I track each one for their completion?
>
>Alex
>
>
>2014-12-12 22:35 GMT-02:00 Gilles Gouaillardet :
>
>Alex,
>
>You need MPI_Comm_disconnect at least.
>I am not sure if this is 100% correct nor working.
>
>If you are using third party apps, why dont you do something like
>system("env -i qsub ...")
>with the right options to make qsub blocking or you manually wait for the end 
>of the job ?
>
>That looks like a much cleaner and simpler approach to me.
>
>Cheers,
>
>Gilles
>
>"Alex A. Schmidt"  wrote:
>
>Hello Gilles,
>
>Ok, I believe I have a simple toy app running as I think it should:
>'n' parent processes running under mpi_comm_world, each one
>
>spawning its own 'm' child processes (each child group work 
>together nicely, returning the expected result for an mpi_allreduce call).
>
>Now, as I mentioned before, the apps I want to run in the spawned 
>
>processes are third party mpi apps and I don't think it will be possible 
>to exchange messages with them from my app. So, I do I tell 
>when the spawned processes have finnished running? All I have to work
>
>with is the intercommunicator returned from th

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Gilles Gouaillardet
Eric,

can you make your test case (source + input file + howto) available so i
can try to reproduce and fix this ?

Based on the stack trace, i assume this is a complete end user application.
have you tried/been able to reproduce the same kind of crash with a
trimmed test program ?

BTW, what kind of filesystem is hosting Resultats.Eta1 ? (e.g. ext4 /
nfs / lustre / other)

Cheers,

Gilles

On 2014/12/15 4:06, Eric Chamberland wrote:
> Hi,
>
> I finally (thanks for fixing oversubscribing) tested with 1.8.4rc3 for
> my problem with collective MPI I/O.
>
> A problem still there.  In this 2 processes example, process rank 1
> dies with segfault while process rank 0 wait indefinitely...
>
> Running with valgrind, I found these errors which may gives hints:
>
> *
> Rank 1:
> *
> On process rank 1, without valgrind it ends with either a segmentation
> violation or memory corruption or invalide free without valgrind).
>
> But running with valgrind, it tells:
>
> ==16715== Invalid write of size 2
> ==16715==at 0x4C2E793: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
> ==16715==by 0x1F60AA91: opal_convertor_unpack (opal_convertor.c:321)
> ==16715==by 0x25AA8DD3: mca_pml_ob1_recv_frag_callback_match
> (pml_ob1_recvfrag.c:225)
> ==16715==by 0x2544110C: mca_btl_vader_check_fboxes
> (btl_vader_fbox.h:220)
> ==16715==by 0x25443577: mca_btl_vader_component_progress
> (btl_vader_component.c:695)
> ==16715==by 0x1F5F0F27: opal_progress (opal_progress.c:207)
> ==16715==by 0x1ACB40B3: opal_condition_wait (condition.h:93)
> ==16715==by 0x1ACB4201: ompi_request_wait_completion (request.h:381)
> ==16715==by 0x1ACB4305: ompi_request_default_wait (req_wait.c:39)
> ==16715==by 0x26BA2FFB: ompi_coll_tuned_bcast_intra_generic
> (coll_tuned_bcast.c:254)
> ==16715==by 0x26BA36F7: ompi_coll_tuned_bcast_intra_binomial
> (coll_tuned_bcast.c:385)
> ==16715==by 0x26B94289: ompi_coll_tuned_bcast_intra_dec_fixed
> (coll_tuned_decision_fixed.c:258)
> ==16715==by 0x1ACD55F2: PMPI_Bcast (pbcast.c:110)
> ==16715==by 0x2FE1CC48: ADIOI_Shfp_fname (shfp_fname.c:67)
> ==16715==by 0x2FDEB493: mca_io_romio_dist_MPI_File_open (open.c:177)
> ==16715==by 0x2FDE3B0D: mca_io_romio_file_open
> (io_romio_file_open.c:40)
> ==16715==by 0x1AD52344: module_init (io_base_file_select.c:455)
> ==16715==by 0x1AD51DFA: mca_io_base_file_select
> (io_base_file_select.c:238)
> ==16715==by 0x1ACA582F: ompi_file_open (file.c:130)
> ==16715==by 0x1AD30DA3: PMPI_File_open (pfile_open.c:94)
> ==16715==by 0x13F9B36F:
> PAIO::ouvreFichierMPIIO(PAGroupeProcessus&, std::string const&, int,
> ompi_file_t*&, bool) (PAIO.cc:290)
> ==16715==by 0xCA44252:
> GISLectureEcriture::litGISMPI(std::string,
> GroupeInfoSur&, std::string&) (GISLectureEcriture.icc:411)
> ==16715==by 0xCA23F0D: Champ::importeParallele(std::string const&)
> (Champ.cc:951)
> ==16715==by 0x4D0DEE: main (Test.NormesEtProjectionChamp.cc:789)
> ==16715==  Address 0x32ef3e50 is 0 bytes after a block of size 256
> alloc'd
> ==16715==at 0x4C2C5A4: malloc (vg_replace_malloc.c:296)
> ==16715==by 0x2FE1C78E: ADIOI_Malloc_fn (malloc.c:50)
> ==16715==by 0x2FE1C951: ADIOI_Shfp_fname (shfp_fname.c:25)
> ==16715==by 0x2FDEB493: mca_io_romio_dist_MPI_File_open (open.c:177)
> ==16715==by 0x2FDE3B0D: mca_io_romio_file_open
> (io_romio_file_open.c:40)
> ==16715==by 0x1AD52344: module_init (io_base_file_select.c:455)
> ==16715==by 0x1AD51DFA: mca_io_base_file_select
> (io_base_file_select.c:238)
> ==16715==by 0x1ACA582F: ompi_file_open (file.c:130)
> ==16715==by 0x1AD30DA3: PMPI_File_open (pfile_open.c:94)
> ==16715==by 0x13F9B36F:
> PAIO::ouvreFichierMPIIO(PAGroupeProcessus&, std::string const&, int,
> ompi_file_t*&, bool) (PAIO.cc:290)
> ==16715==by 0xCA44252:
> GISLectureEcriture::litGISMPI(std::string,
> GroupeInfoSur&, std::string&) (GISLectureEcriture.icc:411)
> ==16715==by 0xCA23F0D: Champ::importeParallele(std::string const&)
> (Champ.cc:951)
> ==16715==by 0x4D0DEE: main (Test.NormesEtProjectionChamp.cc:789)
> ...
> ...
> ==16715== Invalid write of size 1
> ==16715==at 0x4C2E7BB: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
> ==16715==by 0x1F60AA91: opal_convertor_unpack (opal_convertor.c:321)
> ==16715==by 0x25AA8DD3: mca_pml_ob1_recv_frag_callback_match
> (pml_ob1_recvfrag.c:225)
> ==16715==by 0x2544110C: mca_btl_vader_check_fboxes
> (btl_vader_fbox.h:220)
> ==16715==by 0x25443577: mca_btl_vader_component_progress
> (btl_vader_compo

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Gilles Gouaillardet
Eric,

i checked the source code (v1.8) and the limit for the shared_fp_fname
is 256 (hard coded).

i am now checking if the overflow is correctly detected (that could
explain the one byte overflow reported by valgrind)

Cheers,

Gilles

On 2014/12/15 11:52, Eric Chamberland wrote:
> Hi again,
>
> some new hints that might help:
>
> 1- With valgrind : If I run the same test case, same data, but
> moved to a shorter path+filename, then *valgrind* does *not*
> complains!!
> 2- Without valgrind: *Sometimes*, the test case with long
> path+filename passes without "segfaulting"!
> 3- It seems to happen at the fourth file I try to open using the
> following described procedure:
>
> Also, I was wondering about this: In this 2 processes test case
> (running in the same node), I :
>
> 1- open the file collectively (which resides on the same ssd drive on
> my computer)
> 2-  MPI_File_read_at_all a long int and 3 chars (11 bytes)
> 3- stop (because I detect I am not reading my MPIIO file format)
> 4- close the file
>
> A guess (FWIW): Can process rank 0, for example close the file too
> quickly, which destroys the string reserved for the filename that is
> used by process rank 1 which could be using shared memory on the same
> node?
>
> Thanks,
>
> Eric
>
> On 12/14/2014 02:06 PM, Eric Chamberland wrote:
>> Hi,
>>
>> I finally (thanks for fixing oversubscribing) tested with 1.8.4rc3 for
>> my problem with collective MPI I/O.
>>
>> A problem still there.  In this 2 processes example, process rank 1
>> dies with segfault while process rank 0 wait indefinitely...
>>
>> Running with valgrind, I found these errors which may gives hints:
>>
>> *
>> Rank 1:
>> *
>> On process rank 1, without valgrind it ends with either a segmentation
>> violation or memory corruption or invalide free without valgrind).
>>
>> But running with valgrind, it tells:
>>
>> ==16715== Invalid write of size 2
>> ==16715==at 0x4C2E793: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
>> ==16715==by 0x1F60AA91: opal_convertor_unpack (opal_convertor.c:321)
>> ==16715==by 0x25AA8DD3: mca_pml_ob1_recv_frag_callback_match
>> (pml_ob1_recvfrag.c:225)
>> ==16715==by 0x2544110C: mca_btl_vader_check_fboxes
>> (btl_vader_fbox.h:220)
>> ==16715==by 0x25443577: mca_btl_vader_component_progress
>> (btl_vader_component.c:695)
>> ==16715==by 0x1F5F0F27: opal_progress (opal_progress.c:207)
>> ==16715==by 0x1ACB40B3: opal_condition_wait (condition.h:93)
>> ==16715==by 0x1ACB4201: ompi_request_wait_completion (request.h:381)
>> ==16715==by 0x1ACB4305: ompi_request_default_wait (req_wait.c:39)
>> ==16715==by 0x26BA2FFB: ompi_coll_tuned_bcast_intra_generic
>> (coll_tuned_bcast.c:254)
>> ==16715==by 0x26BA36F7: ompi_coll_tuned_bcast_intra_binomial
>> (coll_tuned_bcast.c:385)
>> ==16715==by 0x26B94289: ompi_coll_tuned_bcast_intra_dec_fixed
>> (coll_tuned_decision_fixed.c:258)
>> ==16715==by 0x1ACD55F2: PMPI_Bcast (pbcast.c:110)
>> ==16715==by 0x2FE1CC48: ADIOI_Shfp_fname (shfp_fname.c:67)
>> ==16715==by 0x2FDEB493: mca_io_romio_dist_MPI_File_open (open.c:177)
>> ==16715==by 0x2FDE3B0D: mca_io_romio_file_open
>> (io_romio_file_open.c:40)
>> ==16715==by 0x1AD52344: module_init (io_base_file_select.c:455)
>> ==16715==by 0x1AD51DFA: mca_io_base_file_select
>> (io_base_file_select.c:238)
>> ==16715==by 0x1ACA582F: ompi_file_open (file.c:130)
>> ==16715==by 0x1AD30DA3: PMPI_File_open (pfile_open.c:94)
>> ==16715==by 0x13F9B36F:
>> PAIO::ouvreFichierMPIIO(PAGroupeProcessus&, std::string const&, int,
>> ompi_file_t*&, bool) (PAIO.cc:290)
>> ==16715==by 0xCA44252:
>> GISLectureEcriture::litGISMPI(std::string,
>> GroupeInfoSur&, std::string&) (GISLectureEcriture.icc:411)
>> ==16715==by 0xCA23F0D: Champ::importeParallele(std::string const&)
>> (Champ.cc:951)
>> ==16715==by 0x4D0DEE: main (Test.NormesEtProjectionChamp.cc:789)
>> ==16715==  Address 0x32ef3e50 is 0 bytes after a block of size 256
>> alloc'd
>> ==16715==at 0x4C2C5A4: malloc (vg_replace_malloc.c:296)
>> ==16715==by 0x2FE1C78E: ADIOI_Malloc_fn (malloc.c:50)
>> ==16715==by 0x2FE1C951: ADIOI_Shfp_fname (shfp_fname.c:25)
>> ==16715==by 0x2FDEB493: mca_io_romio_dist_MPI_File_open (open.c:177)
>> ==16715==by 0x2FDE3B0D: mca_io_romio_file_open
>> (io_romio_file_open.c:40)
>> ==

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Gilles Gouaillardet
Eric,

here is a patch for the v1.8 series, it fixes a one byte overflow.

valgrind should stop complaining, and assuming this is the root cause of
the memory corruption,
that could also fix your program.

that being said, shared_fp_fname is limited to 255 characters (this is
hard coded) so even if
it gets truncated to 255 characters (instead of 256), the behavior could
be kind of random.

/* from ADIOI_Shfp_fname :
   If the real file is /tmp/thakur/testfile, the shared-file-pointer
   file will be /tmp/thakur/.testfile.shfp.<rand>, where <rand> is

FWIW, <rand> is a random number that takes between 1 and 10 characters

could you please give this patch a try and let us know the results ?

Cheers,

Gilles

On 2014/12/15 11:52, Eric Chamberland wrote:
> Hi again,
>
> some new hints that might help:
>
> 1- With valgrind : If I run the same test case, same data, but
> moved to a shorter path+filename, then *valgrind* does *not*
> complains!!
> 2- Without valgrind: *Sometimes*, the test case with long
> path+filename passes without "segfaulting"!
> 3- It seems to happen at the fourth file I try to open using the
> following described procedure:
>
> Also, I was wondering about this: In this 2 processes test case
> (running in the same node), I :
>
> 1- open the file collectively (which resides on the same ssd drive on
> my computer)
> 2-  MPI_File_read_at_all a long int and 3 chars (11 bytes)
> 3- stop (because I detect I am not reading my MPIIO file format)
> 4- close the file
>
> A guess (FWIW): Can process rank 0, for example close the file too
> quickly, which destroys the string reserved for the filename that is
> used by process rank 1 which could be using shared memory on the same
> node?
>
> Thanks,
>
> Eric
>
> On 12/14/2014 02:06 PM, Eric Chamberland wrote:
>> Hi,
>>
>> I finally (thanks for fixing oversubscribing) tested with 1.8.4rc3 for
>> my problem with collective MPI I/O.
>>
>> A problem still there.  In this 2 processes example, process rank 1
>> dies with segfault while process rank 0 wait indefinitely...
>>
>> Running with valgrind, I found these errors which may gives hints:
>>
>> *
>> Rank 1:
>> *
>> On process rank 1, without valgrind it ends with either a segmentation
>> violation or memory corruption or invalide free without valgrind).
>>
>> But running with valgrind, it tells:
>>
>> ==16715== Invalid write of size 2
>> ==16715==at 0x4C2E793: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
>> ==16715==by 0x1F60AA91: opal_convertor_unpack (opal_convertor.c:321)
>> ==16715==by 0x25AA8DD3: mca_pml_ob1_recv_frag_callback_match
>> (pml_ob1_recvfrag.c:225)
>> ==16715==by 0x2544110C: mca_btl_vader_check_fboxes
>> (btl_vader_fbox.h:220)
>> ==16715==by 0x25443577: mca_btl_vader_component_progress
>> (btl_vader_component.c:695)
>> ==16715==by 0x1F5F0F27: opal_progress (opal_progress.c:207)
>> ==16715==by 0x1ACB40B3: opal_condition_wait (condition.h:93)
>> ==16715==by 0x1ACB4201: ompi_request_wait_completion (request.h:381)
>> ==16715==by 0x1ACB4305: ompi_request_default_wait (req_wait.c:39)
>> ==16715==by 0x26BA2FFB: ompi_coll_tuned_bcast_intra_generic
>> (coll_tuned_bcast.c:254)
>> ==16715==by 0x26BA36F7: ompi_coll_tuned_bcast_intra_binomial
>> (coll_tuned_bcast.c:385)
>> ==16715==by 0x26B94289: ompi_coll_tuned_bcast_intra_dec_fixed
>> (coll_tuned_decision_fixed.c:258)
>> ==16715==by 0x1ACD55F2: PMPI_Bcast (pbcast.c:110)
>> ==16715==by 0x2FE1CC48: ADIOI_Shfp_fname (shfp_fname.c:67)
>> ==16715==by 0x2FDEB493: mca_io_romio_dist_MPI_File_open (open.c:177)
>> ==16715==by 0x2FDE3B0D: mca_io_romio_file_open
>> (io_romio_file_open.c:40)
>> ==16715==by 0x1AD52344: module_init (io_base_file_select.c:455)
>> ==16715==by 0x1AD51DFA: mca_io_base_file_select
>> (io_base_file_select.c:238)
>> ==16715==by 0x1ACA582F: ompi_file_open (file.c:130)
>> ==16715==by 0x1AD30DA3: PMPI_File_open (pfile_open.c:94)
>> ==16715==by 0x13F9B36F:
>> PAIO::ouvreFichierMPIIO(PAGroupeProcessus&, std::string const&, int,
>> ompi_file_t*&, bool) (PAIO.cc:290)
>> ==16715==by 0xCA44252:
>> GISLectureEcriture::litGISMPI(std::string,
>> GroupeInfoSur&, std::string&) (GISLectureEcriture.icc:411)
>> ==16715==by 0xCA23F0D: Champ::importeParallele(std::string const&)
>> (Champ.cc:951)
>> ==16715==by 0x4D0DEE: main (Test.NormesEtProjectionChamp.cc:789)
>> ==16715==  Address 0x32ef3e50 is 0 byte

Re: [OMPI users] ERROR: C_FUNLOC function

2014-12-15 Thread Gilles Gouaillardet
Hi Siegmar,

a similar issue was reported in mpich with xlf compilers :
http://trac.mpich.org/projects/mpich/ticket/2144

They concluded this is a compiler issue (e.g. the compiler does not
implement TS 29113 subclause 8.1)


Jeff,
i made PR 315 https://github.com/open-mpi/ompi/pull/315
f08 binding support is disabled if TS29113 subclause 8.1 is not implemented
could you please review/comment on this ?


Cheers,

Gilles


On 2014/12/12 2:28, Siegmar Gross wrote:
> Hi Jeff,
>
> will you have the time to fix the Fortran problem for the new Oracle
> Solaris Studio 12.4 compiler suite in openmpi-1.8.4?
>
> tyr openmpi-1.8.4rc2-SunOS.sparc.64_cc 103 tail -15 
> log.make.SunOS.sparc.64_cc 
>   PPFC comm_compare_f08.lo
>   PPFC comm_connect_f08.lo
>   PPFC comm_create_errhandler_f08.lo
>
>fn = c_funloc(comm_errhandler_fn)
>  ^   
> "../../../../../openmpi-1.8.4rc2/ompi/mpi/fortran/use-mpi-f08/comm_create_errhan
> dler_f08.F90", Line = 22, Column = 18: ERROR: C_FUNLOC function argument must 
> be 
> a procedure that is interoperable or a procedure pointer associated with an 
> interoperable procedure.
> ...
>
>
> Kind regards
>
> Siegmar
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/25963.php



Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-15 Thread Gilles Gouaillardet
Eric,

thanks for the simple test program.

i think i see what is going wrong and i will make some changes to avoid
the memory overflow.

that being said, there is a hard coded limit of 256 characters, and your
path is bigger than 300 characters.
bottom line: even if there is no more memory overflow, that cannot
work as expected.

i will report this to the mpich folks, since romio is currently imported
from mpich.

Cheers,

Gilles

On 2014/12/16 0:16, Eric Chamberland wrote:
> Hi Gilles,
>
> just created a very simple test case!
>
> with this setup, you will see the bug with valgrind:
>
> export
> too_long=./this/is/a_very/long/path/that/contains/a/not/so/long/filename/but/trying/to/collectively/mpi_file_open/it/you/will/have/a/memory/corruption/resulting/of/invalide/writing/or/reading/past/the/end/of/one/or/some/hidden/strings/in/mpio/Simple/user/would/like/to/have/the/parameter/checked/and/an/error/returned/or/this/limit/removed
>
> mpicc -o bug_MPI_File_open_path_too_long
> bug_MPI_File_open_path_too_long.c
>
> mkdir -p $too_long
> echo "header of a text file" > $too_long/toto.txt
>
> mpirun -np 2 valgrind ./bug_MPI_File_open_path_too_long 
> $too_long/toto.txt
>
> and watch the errors!
>
> unfortunately, the memory corruptions here doesn't seem to segfault
> this simple test case, but in my case, it is fatal and with valgrind,
> it is reported...
>
> OpenMPI 1.6.5, 1.8.3rc3 are affected
>
> MPICH-3.1.3 also have the error!
>
> thanks,
>
> Eric
>



Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-15 Thread Gilles Gouaillardet
Eric and all,

That is clearly a limitation in romio, and this is being tracked at
https://trac.mpich.org/projects/mpich/ticket/2212

in the meantime, what we can do in OpenMPI is update
mca_io_romio_file_open() so it fails with a user friendly error message
if strlen(filename) is larger than 225.

Cheers,

Gilles
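
Until such a check exists inside mca_io_romio_file_open(), a caller-side guard is a cheap stopgap. A sketch (the 225 mirrors the limit mentioned above, and the helper name is made up):

#include <stdio.h>
#include <string.h>
#include <mpi.h>

/* refuse paths that would overflow ROMIO's hard coded shared-file-pointer
   name buffer instead of corrupting memory inside MPI_File_open */
static int checked_file_open(MPI_Comm comm, const char *filename, int amode, MPI_File *fh)
{
    if (strlen(filename) > 225) {
        fprintf(stderr, "refusing to open '%s': path longer than 225 characters\n",
                filename);
        MPI_Abort(comm, 1);
    }
    return MPI_File_open(comm, (char *)filename, amode, MPI_INFO_NULL, fh);
}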

On 2014/12/16 12:43, Gilles Gouaillardet wrote:
> Eric,
>
> thanks for the simple test program.
>
> i think i see what is going wrong and i will make some changes to avoid
> the memory overflow.
>
> that being said, there is a hard coded limit of 256 characters, and your
> path is bigger than 300 characters.
> bottom line, and even if there is no more memory overflow, that cannot
> work as expected.
>
> i will report this to the mpich folks, since romio is currently imported
> from mpich.
>
> Cheers,
>
> Gilles
>
> On 2014/12/16 0:16, Eric Chamberland wrote:
>> Hi Gilles,
>>
>> just created a very simple test case!
>>
>> with this setup, you will see the bug with valgrind:
>>
>> export
>> too_long=./this/is/a_very/long/path/that/contains/a/not/so/long/filename/but/trying/to/collectively/mpi_file_open/it/you/will/have/a/memory/corruption/resulting/of/invalide/writing/or/reading/past/the/end/of/one/or/some/hidden/strings/in/mpio/Simple/user/would/like/to/have/the/parameter/checked/and/an/error/returned/or/this/limit/removed
>>
>> mpicc -o bug_MPI_File_open_path_too_long
>> bug_MPI_File_open_path_too_long.c
>>
>> mkdir -p $too_long
>> echo "header of a text file" > $too_long/toto.txt
>>
>> mpirun -np 2 valgrind ./bug_MPI_File_open_path_too_long 
>> $too_long/toto.txt
>>
>> and watch the errors!
>>
>> unfortunately, the memory corruptions here doesn't seem to segfault
>> this simple test case, but in my case, it is fatal and with valgrind,
>> it is reported...
>>
>> OpenMPI 1.6.5, 1.8.3rc3 are affected
>>
>> MPICH-3.1.3 also have the error!
>>
>> thanks,
>>
>> Eric
>>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/26005.php



Re: [OMPI users] OMPI users] OMPI users] OMPI users] OMPI users] MPI inside MPI (still)

2014-12-17 Thread Gilles Gouaillardet
Alex,

You do not want to spawn mpirun.
Or if this is really what you want, then just use system("env -i ...")

I think what you need is to spawn a shell that does the redirection and then
invokes your app.
This is something like
MPI_Comm_spawn("/bin/sh", "-c", "siesta < infile")

That being said, i strongly recommend you patch siesta so it can be invoked 
like this
siesta -in infile
(plus the MPI_Comm_disconnect call explained by George)
That would make everything so much easier

Cheers,

Gilles
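
In C, the argument vector for that /bin/sh trick would look like this (each shell argument is its own element and the array is NULL terminated; "infile" is a placeholder, and whether the spawned shell plus siesta then behaves as a proper MPI child job is exactly the open question of this thread):

char *spawn_argv[] = { "-c", "siesta < infile", (char *)NULL };
MPI_Comm intercomm;

/* maxprocs = 1 here only to illustrate the argv layout */
MPI_Comm_spawn("/bin/sh", spawn_argv, 1, MPI_INFO_NULL,
               0, MPI_COMM_WORLD, &intercomm, MPI_ERRCODES_IGNORE);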

"Alex A. Schmidt"  wrote:
>Let me rephrase the previous message:
>
>Putting "/bin/sh" in command with info key "ompi_non_mpi"  set to  ".true." 
>(if command is empty, mpi_comm_spawn tries to execute ' ') of 
>mpi_comm_spawn and "-c" "mpirun -n 1 myapp" in args results in 
>this message:
>
>**
>
>Open MPI does not support recursive calls of mpirun
>
>**
>
>Putting a single string in args as "-c mpirun -n 1 myapp" or  "-c 'mpirun -n 1 
>myapp' "
>
>returns
>
>/usr/bin/sh: - : invalid option
>
>Alex
>
>
>
>
>2014-12-17 21:47 GMT-02:00 Alex A. Schmidt :
>
>Putting "/bin/sh" in command with info key "ompi_non_mpi"  set to  ".true." 
>(if command is empty, mpi_comm_spawn tries to execute ' ') of 
>mpi_comm_spawn and "-c" "mpirun -n 1 myapp" in args results in 
>this message:
>
>/usr/bin/sh: -c: option requires an argument
>
>Putting a single string in args as "-c mpirun -n 1 myapp" or  "-c 'mpirun -n 1 
>myapp' "
>
>returns
>
>/usr/bin/sh: - : invalid option
>
>Alex
>
>
>2014-12-17 20:17 GMT-02:00 George Bosilca :
>
>I don't think this has any chance of working. The redirection is something 
>interpreted by the shell, and when Open MPI "fork-exec" a process it does not 
>behave as the shell.
>
>
>Thus a potentially non-portable solution would be to instead of launching the 
>mpirun directly to launch it through a shell. Maybe something like "/bin/sh", 
>"-c", "mpirun -n 1 myapp". 
>
>
>  George.
>
>
>
>On Wed, Dec 17, 2014 at 5:02 PM, Alex A. Schmidt  wrote:
>
>Ralph,
>
>Sorry, "<" as an element of argv to mpi_comm_spawn is interpreted just the
>
>same, as another parameter by the spawnee process.
>
>But I am confused: wouldn't it be redundant to put "mpirun" "-n" "1" "myapp" 
>as elements of argv, considering role of the other parameters of mpi_comm_spawn
>
>like the 1st and 3rd ?
>
>
>Imho, it seems to me that stdin and stdout should be keys to "info" and the
>
>respective filenames the values of those keys, if that is possible to 
>implement...
>
>Alex
>
>
>
>
>2014-12-17 15:16 GMT-02:00 Ralph Castain :
>
>Have you tried putting the "<" as a separate parameter? In other words, since 
>you are specifying the argv, you have to specify each of them separately. So 
>it looks more like:
>
>
>"mpirun", "-n", "1", "myapp", "<", "stdinfile"
>
>
>Does that work?
>
>Ralph
>
>
>
>On Wed, Dec 17, 2014 at 8:07 AM, Alex A. Schmidt  wrote:
>
>Ralph,
>
>I am afraid I will have to insist on i/o redirection matter 
>for the spawnee process. 
>
>I have a "child" mpi code that do just 2 things: read the 3 parameters
>
>passed to it and print them, and then read data from stdin and show it. 
>So, if "stdin_file" is a text file with two lines, say:
>
>
>10
>20
>
>executing "mpirun -n 1 child A B < stdin_file" wiil ouput two lines:
>
>[A]  [B]  []  
>10  20
>
>On the other hand , calling "child" from MPI_Comm_spawn("child",args,...)
>
>where
>
>args(1) = "A"
>
>args(2) = "B"
>
>args(3) ="< stdin_file"
>
>args(4) = " "
>
>will make "child" outputs only 1 line
>
>[A] [B] [< stdin_file]
>
>and then fails because there is not stdin data to read from. 
>
>Please, note that surprisingly the whole string "< stdin_file" is interpreted 
>as a third parameter to "child" and not a stdin...
>
>Alex
>
>
>
>
>
>
>
>
>2014-12-15 17:26 GMT-02:00 Alex A. Schmidt :
>
>Ralph, 
>
>I guess you mean "call mpi_comm_spawn( 'siesta', &

Re: [OMPI users] OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-17 Thread Gilles Gouaillardet
Eric,

As long as the lFileNameWithoutTooLongPath length is less than 226 characters and
you do not run into thread-related race conditions, that should be just fine, and
that roughly covers 99% of cases.

Thanks for sharing this workaround !

Cheers,

Gilles

Eric Chamberland  wrote:
>Hi!
>
>Here is a "poor man's fix" that works for me (the idea is not from me, 
>thanks to Thomas H.):
>
>#1- char* lCwd = getcwd(0,0);
>#2- chdir(lPathToFile);
>#3- MPI_File_open(...,lFileNameWithoutTooLongPath,...);
>#4- chdir(lCwd);
>#5- ...
>
>I think there are some limitations but it works very well for our 
>uses... and until a "real" fix is proposed...
>
>Thanks for helping!
>
>Eric
>
>
>On 12/15/2014 11:42 PM, Gilles Gouaillardet wrote:
>> Eric and all,
>>
>> That is clearly a limitation in romio, and this is being tracked at
>> https://trac.mpich.org/projects/mpich/ticket/2212
>>
>> in the mean time, what we can do in OpenMPI is update
>> mca_io_romio_file_open() and fails with a user friendly error message
>> if strlen(filename) is larger that 225.
>>
>> Cheers,
>>
>> Gilles
>>
>> On 2014/12/16 12:43, Gilles Gouaillardet wrote:
>>> Eric,
>>>
>>> thanks for the simple test program.
>>>
>>> i think i see what is going wrong and i will make some changes to avoid
>>> the memory overflow.
>>>
>>> that being said, there is a hard coded limit of 256 characters, and your
>>> path is bigger than 300 characters.
>>> bottom line, and even if there is no more memory overflow, that cannot
>>> work as expected.
>>>
>>> i will report this to the mpich folks, since romio is currently imported
>>> from mpich.
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> On 2014/12/16 0:16, Eric Chamberland wrote:
>>>> Hi Gilles,
>>>>
>>>> just created a very simple test case!
>>>>
>>>> with this setup, you will see the bug with valgrind:
>>>>
>>>> export
>>>> too_long=./this/is/a_very/long/path/that/contains/a/not/so/long/filename/but/trying/to/collectively/mpi_file_open/it/you/will/have/a/memory/corruption/resulting/of/invalide/writing/or/reading/past/the/end/of/one/or/some/hidden/strings/in/mpio/Simple/user/would/like/to/have/the/parameter/checked/and/an/error/returned/or/this/limit/removed
>>>>
>>>> mpicc -o bug_MPI_File_open_path_too_long
>>>> bug_MPI_File_open_path_too_long.c
>>>>
>>>> mkdir -p $too_long
>>>> echo "header of a text file" > $too_long/toto.txt
>>>>
>>>> mpirun -np 2 valgrind ./bug_MPI_File_open_path_too_long
>>>> $too_long/toto.txt
>>>>
>>>> and watch the errors!
>>>>
>>>> unfortunately, the memory corruptions here doesn't seem to segfault
>>>> this simple test case, but in my case, it is fatal and with valgrind,
>>>> it is reported...
>>>>
>>>> OpenMPI 1.6.5, 1.8.3rc3 are affected
>>>>
>>>> MPICH-3.1.3 also have the error!
>>>>
>>>> thanks,
>>>>
>>>> Eric
>>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2014/12/26005.php
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/12/26006.php
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/12/26022.php
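
Spelled out as compilable C, the workaround might look like the hedged sketch
below; open_with_chdir is an invented helper name and MPI_MODE_RDONLY is only an
example mode, but the flow is the #1-#5 sequence Eric describes.

/* "Poor man's fix": step into the long directory so that only the short file
 * name reaches ROMIO, then step back.  Not thread safe, and every rank that
 * shares the working directory has to do the same dance. */
#include <mpi.h>
#include <stdlib.h>
#include <unistd.h>

int open_with_chdir(const char *lPathToFile,
                    const char *lFileNameWithoutTooLongPath,
                    MPI_File *fh)
{
    char *lCwd = getcwd(NULL, 0);   /* glibc allocates the buffer for us */
    int rc;

    (void) chdir(lPathToFile);
    rc = MPI_File_open(MPI_COMM_WORLD, (char *) lFileNameWithoutTooLongPath,
                       MPI_MODE_RDONLY, MPI_INFO_NULL, fh);
    (void) chdir(lCwd);             /* restore the previous working directory */
    free(lCwd);
    return rc;
}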


Re: [OMPI users] OMPI users] ERROR: C_FUNLOC function

2014-12-18 Thread Gilles Gouaillardet
FWIW

I faced a similar issue on my Linux VirtualBox.
My shared folder is a vboxfs filesystem, but statfs returns the NFS magic id.
That causes some mess and the test fails.
At this stage I cannot tell whether I should blame the glibc, the kernel, a
VirtualBox driver or myself.

Cheers,

Gilles
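
For the curious, the check that gets confused here boils down to comparing the
filesystem magic number returned by statfs(2) against the NFS magic; the sketch
below is a simplified stand-in for what the opal_path_nfs test does, not the
actual Open MPI code.

/* Simplified illustration: ask the kernel for the filesystem type of a path
 * and report whether it claims to be NFS.  On the shared folder described
 * above, f_type comes back as the NFS magic even though the mount is really
 * a virtualbox shared folder, which is what confuses the test. */
#include <sys/vfs.h>
#include <stdio.h>

#ifndef NFS_SUPER_MAGIC
#define NFS_SUPER_MAGIC 0x6969
#endif

int main(int argc, char **argv)
{
    struct statfs buf;
    const char *path = (argc > 1) ? argv[1] : "/";

    if (statfs(path, &buf) != 0) {
        perror("statfs");
        return 1;
    }
    printf("%s: f_type=0x%lx nfs:%s\n", path, (unsigned long) buf.f_type,
           buf.f_type == NFS_SUPER_MAGIC ? "Yes" : "No");
    return 0;
}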

Mike Dubman wrote:
>Hi Siegmar,
>
>Could you please check the /etc/mtab file for real FS type for the following 
>mount points:
>
>
>get_mounts: dirs[16]:/misc fs:autofs nfs:No
>get_mounts: dirs[17]:/net fs:autofs nfs:No
>get_mounts: dirs[18]:/home fs:autofs nfs:No
>
>
>could you please check if mntent.h and paths.h were detected by "configure" in 
>config.log?
>
>
>Thanks
>
>
>
>On Thu, Dec 18, 2014 at 12:39 AM, Jeff Squyres (jsquyres)  
>wrote:
>
>Siegmar --
>
>I filed https://github.com/open-mpi/ompi/issues/317 and 
>https://github.com/open-mpi/ompi/issues/318.
>
>
>
>
>On Dec 17, 2014, at 3:33 PM, Siegmar Gross 
> wrote:
>
>> Hi Jeff,
>>
>>> This fix was just pushed to the OMPI master.  A new master tarball
>>> should be available shortly (probably within an hour or so -- look
>>> for a tarball dated Dec 17 at http://www.open-mpi.org/nightly/master/).
>>
>> Yes, I could build it now. Thank you very much to everybody who helped
>> to fix the problem. I get an error for "make check" on Solaris 10 Sparc,
>> Solaris 10 x86_64, and OpenSUSE Linux with both gcc-4.9.2 and Sun C 5.13.
>> Hopefully I have some time tomorrow to to test this version with some
>> simple programs.
>>
>> Linux, Sun C 5.13:
>> ==
>> ...
>> PASS: opal_bit_ops
>> Failure : Mismatch: input "/home", expected:0 got:1
>>
>> Failure : Mismatch: input "/net", expected:0 got:1
>>
>> Failure : Mismatch: input "/misc", expected:0 got:1
>>
>> SUPPORT: OMPI Test failed: opal_path_nfs() (3 of 20 failed)
>> Test usage: ./opal_path_nfs [DIR]
>> On Linux interprets output from mount(8) to check for nfs and verify 
>> opal_path_nfs()
>> Additionally, you may specify multiple DIR on the cmd-line, of which you the 
>> output
>> get_mounts: dirs[0]:/dev fs:devtmpfs nfs:No
>> get_mounts: dirs[1]:/dev/shm fs:tmpfs nfs:No
>> get_mounts: dirs[2]:/run fs:tmpfs nfs:No
>> get_mounts: dirs[3]:/dev/pts fs:devpts nfs:No
>> get_mounts: dirs[4]:/ fs:ext4 nfs:No
>> get_mounts: dirs[5]:/proc fs:proc nfs:No
>> get_mounts: dirs[6]:/sys fs:sysfs nfs:No
>> get_mounts: dirs[7]:/sys/kernel/debug fs:debugfs nfs:No
>> get_mounts: dirs[8]:/sys/kernel/security fs:securityfs nfs:No
>> get_mounts: dirs[9]:/local fs:ext4 nfs:No
>> get_mounts: dirs[10]:/var/lock fs:tmpfs nfs:No
>> get_mounts: dirs[11]:/var/run fs:tmpfs nfs:No
>> get_mounts: dirs[12]:/media fs:tmpfs nfs:No
>> get_mounts: dirs[13]:/usr/local fs:nfs nfs:Yes
>> get_mounts: dirs[14]:/opt/global fs:nfs nfs:Yes
>> get_mounts: already know dir[13]:/usr/local
>> get_mounts: dirs[13]:/usr/local fs:nfs nfs:Yes
>> get_mounts: dirs[15]:/export2 fs:nfs nfs:Yes
>> get_mounts: already know dir[14]:/opt/global
>> get_mounts: dirs[14]:/opt/global fs:nfs nfs:Yes
>> get_mounts: dirs[16]:/misc fs:autofs nfs:No
>> get_mounts: dirs[17]:/net fs:autofs nfs:No
>> get_mounts: dirs[18]:/home fs:autofs nfs:No
>> get_mounts: dirs[19]:/home/fd1026 fs:nfs nfs:Yes
>> test(): file:/home/fd1026 bool:1
>> test(): file:/home bool:0
>> test(): file:/net bool:0
>> test(): file:/misc bool:0
>> test(): file:/export2 bool:1
>> test(): file:/opt/global bool:1
>> test(): file:/usr/local bool:1
>> test(): file:/media bool:0
>> test(): file:/var/run bool:0
>> test(): file:/var/lock bool:0
>> test(): file:/local bool:0
>> test(): file:/sys/kernel/security bool:0
>> test(): file:/sys/kernel/debug bool:0
>> test(): file:/sys bool:0
>> test(): file:/proc bool:0
>> test(): file:/ bool:0
>> test(): file:/dev/pts bool:0
>> test(): file:/run bool:0
>> test(): file:/dev/shm bool:0
>> test(): file:/dev bool:0
>> FAIL: opal_path_nfs
>> 
>> 1 of 2 tests failed
>> Please report to http://www.open-mpi.org/community/help/
>> 
>> make[3]: *** [check-TESTS] Error 1
>> make[3]: Leaving directory
>> `/export2/src/openmpi-1.9/openmpi-dev-557-g01a24c4-Linux.x86_64.64_cc/test/util'
>> make[2]: *** [check-am] Error 2
>> make[2]: Leaving directory
>> `/export2/src/openmpi-1.9/openmpi-dev-557-g01a24c4-Linu

Re: [OMPI users] processes hang with openmpi-dev-602-g82c02b4

2014-12-24 Thread Gilles Gouaillardet
Siegmar,

could you please give the attached patch a try?
/* and keep in mind this is just a workaround that happens to work */

Cheers,

Gilles

On 2014/12/22 22:48, Siegmar Gross wrote:
> Hi,
>
> today I installed openmpi-dev-602-g82c02b4 on my machines (Solaris 10 Sparc,
> Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-4.9.2 and the
> new Solaris Studio 12.4 compilers. All build processes finished without
> errors, but I have a problem running a very small program. It works for
> three processes but hangs for six processes. I have the same behaviour
> for both compilers.
>
> tyr small_prog 139 time; mpiexec -np 3 --host sunpc1,linpc1,tyr 
> init_finalize; time
> 827.161u 210.126s 30:51.08 56.0%0+0k 4151+20io 2898pf+0w
> Hello!
> Hello!
> Hello!
> 827.886u 210.335s 30:54.68 55.9%0+0k 4151+20io 2898pf+0w
> tyr small_prog 140 time; mpiexec -np 6 --host sunpc1,linpc1,tyr 
> init_finalize; time
> 827.946u 210.370s 31:15.02 55.3%0+0k 4151+20io 2898pf+0w
> ^CKilled by signal 2.
> Killed by signal 2.
> 869.242u 221.644s 33:40.54 53.9%0+0k 4151+20io 2898pf+0w
> tyr small_prog 141 
>
> tyr small_prog 145 ompi_info | grep -e "Open MPI repo revision:" -e "C 
> compiler:"
>   Open MPI repo revision: dev-602-g82c02b4
>   C compiler: cc
> tyr small_prog 146 
>
>
> tyr small_prog 146 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
> GNU gdb (GDB) 7.6.1
> ...
> (gdb) run -np 3 --host sunpc1,linpc1,tyr init_finalize
> Starting program: /usr/local/openmpi-1.9.0_64_cc/bin/mpiexec -np 3 --host 
> sunpc1,linpc1,tyr 
> init_finalize
> [Thread debugging using libthread_db enabled]
> [New Thread 1 (LWP 1)]
> [New LWP2]
> Hello!
> Hello!
> Hello!
> [LWP2 exited]
> [New Thread 2]
> [Switching to Thread 1 (LWP 1)]
> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to 
> satisfy query
> (gdb) run -np 6 --host sunpc1,linpc1,tyr init_finalize
> The program being debugged has been started already.
> Start it from the beginning? (y or n) y
>
> Starting program: /usr/local/openmpi-1.9.0_64_cc/bin/mpiexec -np 6 --host 
> sunpc1,linpc1,tyr 
> init_finalize
> [Thread debugging using libthread_db enabled]
> [New Thread 1 (LWP 1)]
> [New LWP2]
> ^CKilled by signal 2.
> Killed by signal 2.
>
> Program received signal SIGINT, Interrupt.
> [Switching to Thread 1 (LWP 1)]
> 0x7d1dc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
> (gdb) bt
> #0  0x7d1dc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
> #1  0x7d1cb468 in _pollsys () from /lib/sparcv9/libc.so.1
> #2  0x7d170ed8 in poll () from /lib/sparcv9/libc.so.1
> #3  0x7e69a630 in poll_dispatch ()
>from /usr/local/openmpi-1.9.0_64_cc/lib64/libopen-pal.so.0
> #4  0x7e6894ec in opal_libevent2021_event_base_loop ()
>from /usr/local/openmpi-1.9.0_64_cc/lib64/libopen-pal.so.0
> #5  0x0001eb14 in orterun (argc=1757447168, argv=0xff7ed8550cff)
> at ../../../../openmpi-dev-602-g82c02b4/orte/tools/orterun/orterun.c:1090
> #6  0x00014e2c in main (argc=256, argv=0xff7ed8af5c00)
> at ../../../../openmpi-dev-602-g82c02b4/orte/tools/orterun/main.c:13
> (gdb) 
>
> Any ideas? Unfortunately I'm leaving for vaccation so that I cannot test
> any patches until the end of the year. Neverthess I wanted to report the
> problem. At the moment I cannot test if I have the same behaviour in a
> homogeneous environment with three machines because the new version isn't
> available before tomorrow on the other machines. I used the following
> configure command.
>
> ../openmpi-dev-602-g82c02b4/configure --prefix=/usr/local/openmpi-1.9.0_64_cc 
> \
>   --libdir=/usr/local/openmpi-1.9.0_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.8.0/bin \
>   --with-jdk-headers=/usr/local/jdk1.8.0/include \
>   JAVA_HOME=/usr/local/jdk1.8.0 \
>   LDFLAGS="-m64 -mt" \
>   CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64 -mt" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   CPPFLAGS="" CXXCPPFLAGS="" \
>   --enable-mpi-cxx \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-mpi-thread-multiple \
>   --with-threads=posix \
>   --with-hwloc=internal \
>   --without-verbs \
>   --with-wrapper-cflags="-m64 -mt" \
>   --with-wrapper-cxxflags="-m64 -library=stlport4" \
>   --with-wrapper-ldflags="-mt" \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_EN

Re: [OMPI users] processes hang with openmpi-dev-602-g82c02b4

2014-12-24 Thread Gilles Gouaillardet
Kawashima-san,

I'd rather consider this a bug in the README (!)

Heterogeneous support has been broken for some time, but it was
eventually fixed.

Truth is there are *very* limited resources (both human and hardware)
maintaining heterogeneous support, but that does not mean heterogeneous
support should not be used, nor that bug reports will be ignored.

Cheers,

Gilles

On 2014/12/24 9:26, Kawashima, Takahiro wrote:
> Hi Siegmar,
>
> Heterogeneous environment is not supported officially.
>
> README of Open MPI master says:
>
> --enable-heterogeneous
>   Enable support for running on heterogeneous clusters (e.g., machines
>   with different endian representations).  Heterogeneous support is
>   disabled by default because it imposes a minor performance penalty.
>
>   *** THIS FUNCTIONALITY IS CURRENTLY BROKEN - DO NOT USE ***
>
>> Hi,
>>
>> today I installed openmpi-dev-602-g82c02b4 on my machines (Solaris 10 Sparc,
>> Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-4.9.2 and the
>> new Solaris Studio 12.4 compilers. All build processes finished without
>> errors, but I have a problem running a very small program. It works for
>> three processes but hangs for six processes. I have the same behaviour
>> for both compilers.
>>
>> tyr small_prog 139 time; mpiexec -np 3 --host sunpc1,linpc1,tyr 
>> init_finalize; time
>> 827.161u 210.126s 30:51.08 56.0%0+0k 4151+20io 2898pf+0w
>> Hello!
>> Hello!
>> Hello!
>> 827.886u 210.335s 30:54.68 55.9%0+0k 4151+20io 2898pf+0w
>> tyr small_prog 140 time; mpiexec -np 6 --host sunpc1,linpc1,tyr 
>> init_finalize; time
>> 827.946u 210.370s 31:15.02 55.3%0+0k 4151+20io 2898pf+0w
>> ^CKilled by signal 2.
>> Killed by signal 2.
>> 869.242u 221.644s 33:40.54 53.9%0+0k 4151+20io 2898pf+0w
>> tyr small_prog 141 
>>
>> tyr small_prog 145 ompi_info | grep -e "Open MPI repo revision:" -e "C 
>> compiler:"
>>   Open MPI repo revision: dev-602-g82c02b4
>>   C compiler: cc
>> tyr small_prog 146 
>>
>>
>> tyr small_prog 146 /usr/local/gdb-7.6.1_64_gcc/bin/gdb mpiexec
>> GNU gdb (GDB) 7.6.1
>> ...
>> (gdb) run -np 3 --host sunpc1,linpc1,tyr init_finalize
>> Starting program: /usr/local/openmpi-1.9.0_64_cc/bin/mpiexec -np 3 --host 
>> sunpc1,linpc1,tyr 
>> init_finalize
>> [Thread debugging using libthread_db enabled]
>> [New Thread 1 (LWP 1)]
>> [New LWP2]
>> Hello!
>> Hello!
>> Hello!
>> [LWP2 exited]
>> [New Thread 2]
>> [Switching to Thread 1 (LWP 1)]
>> sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to 
>> satisfy query
>> (gdb) run -np 6 --host sunpc1,linpc1,tyr init_finalize
>> The program being debugged has been started already.
>> Start it from the beginning? (y or n) y
>>
>> Starting program: /usr/local/openmpi-1.9.0_64_cc/bin/mpiexec -np 6 --host 
>> sunpc1,linpc1,tyr 
>> init_finalize
>> [Thread debugging using libthread_db enabled]
>> [New Thread 1 (LWP 1)]
>> [New LWP2]
>> ^CKilled by signal 2.
>> Killed by signal 2.
>>
>> Program received signal SIGINT, Interrupt.
>> [Switching to Thread 1 (LWP 1)]
>> 0x7d1dc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
>> (gdb) bt
>> #0  0x7d1dc6b0 in __pollsys () from /lib/sparcv9/libc.so.1
>> #1  0x7d1cb468 in _pollsys () from /lib/sparcv9/libc.so.1
>> #2  0x7d170ed8 in poll () from /lib/sparcv9/libc.so.1
>> #3  0x7e69a630 in poll_dispatch ()
>>from /usr/local/openmpi-1.9.0_64_cc/lib64/libopen-pal.so.0
>> #4  0x7e6894ec in opal_libevent2021_event_base_loop ()
>>from /usr/local/openmpi-1.9.0_64_cc/lib64/libopen-pal.so.0
>> #5  0x0001eb14 in orterun (argc=1757447168, argv=0xff7ed8550cff)
>> at ../../../../openmpi-dev-602-g82c02b4/orte/tools/orterun/orterun.c:1090
>> #6  0x00014e2c in main (argc=256, argv=0xff7ed8af5c00)
>> at ../../../../openmpi-dev-602-g82c02b4/orte/tools/orterun/main.c:13
>> (gdb) 
>>
>> Any ideas? Unfortunately I'm leaving for vaccation so that I cannot test
>> any patches until the end of the year. Neverthess I wanted to report the
>> problem. At the moment I cannot test if I have the same behaviour in a
>> homogeneous environment with three machines because the new version isn't
>> available before tomorrow on the other machines. I used the following
>> configure command.
>>
>> ../openmpi-dev-602-g82c02b4/configure 
>

Re: [OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-28 Thread Gilles Gouaillardet
Where does the error occur?
MPI_Init?
MPI_Finalize?
In between?

In the first case, the bug is likely a mishandled error case,
which means OpenMPI is unlikely to be the root cause of the crash.

Did you check that InfiniBand is up and running on your cluster?

Cheers,

Gilles 

Saliya Ekanayake wrote:
>It's been a while on this, but we are still having trouble getting OpenMPI to 
>work with Infiniband on this cluster. We tried with latest 1.8.4 as well, but 
>it's still the same.
>
>
>To recap, we get the following error when MPI initializes (in the simple Hello 
>world C example) with Infiniband. Everything works fine if we explicitly turn 
>off openib with --mca btl ^openib
>
>
>This is the error I got after debugging with gdb as you suggested.
>
>
>hello_c: connect/btl_openib_connect_udcm.c:736: udcm_module_finalize: 
>Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) 
>(&m->cm_recv_msg_queue))->obj_magic_id' failed.
>
>
>Thank you,
>
>Saliya
>
>
>On Mon, Nov 10, 2014 at 10:01 AM, Saliya Ekanayake  wrote:
>
>Thank you Jeff, I'll try this and  let you know. 
>
>Saliya 
>
>On Nov 10, 2014 6:42 AM, "Jeff Squyres (jsquyres)"  wrote:
>
>I am sorry for the delay; I've been caught up in SC deadlines.  :-(
>
>I don't see anything blatantly wrong in this output.
>
>Two things:
>
>1. Can you try a nightly v1.8.4 snapshot tarball?  This will check to see if 
>whatever the bug is has been fixed for the upcoming release:
>
>    http://www.open-mpi.org/nightly/v1.8/
>
>2. Build Open MPI with the --enable-debug option (note that this adds a 
>slight-but-noticeable performance penalty).  When you run, it should dump a 
>core file.  Load that core file in a debugger and see where it is failing 
>(i.e., file and line in the OMPI source).
>
>We don't usually have to resort to asking users to perform #2, but there's no 
>additional information to give a clue as to what is happening.  :-(
>
>
>
>On Nov 9, 2014, at 11:43 AM, Saliya Ekanayake  wrote:
>
>> Hi Jeff,
>>
>> You are probably busy, but just checking if you had a chance to look at this.
>>
>> Thanks,
>> Saliya
>>
>> On Thu, Nov 6, 2014 at 9:19 AM, Saliya Ekanayake  wrote:
>> Hi Jeff,
>>
>> I've attached a tar file with information.
>>
>> Thank you,
>> Saliya
>>
>> On Tue, Nov 4, 2014 at 4:18 PM, Jeff Squyres (jsquyres)  
>> wrote:
>> Looks like it's failing in the openib BTL setup.
>>
>> Can you send the info listed here?
>>
>>     http://www.open-mpi.org/community/help/
>>
>>
>>
>> On Nov 4, 2014, at 1:10 PM, Saliya Ekanayake  wrote:
>>
>> > Hi,
>> >
>> > I am using OpenMPI 1.8.1 in a Linux cluster that we recently setup. It 
>> > builds fine, but when I try to run even the simplest hello.c program it'll 
>> > cause a segfault. Any suggestions on how to correct this?
>> >
>> > The steps I did and error message are below.
>> >
>> > 1. Built OpenMPI 1.8.1 on the cluster. The ompi_info is attached.
>> > 2. cd to examples directory and mpicc hello_c.c
>> > 3. mpirun -np 2 ./a.out
>> > 4. Error text is attached.
>> >
>> > Please let me know if you need more info.
>> >
>> > Thank you,
>> > Saliya
>> >
>> >
>> > --
>> > Saliya Ekanayake esal...@gmail.com
>> > Cell 812-391-4914 Home 812-961-6383
>> > http://saliya.org
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> > Link to this post: 
>> > http://www.open-mpi.org/community/lists/users/2014/11/25668.php
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25672.php
>>
>>
>>
>> --
>> Saliya Ekanayake esal...@gmail.com
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>>
>>
>>
>> --
>> Saliya Ekanayake esal...@gmail.com
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2014/11/25717.php
>
>
>--
>Jeff Squyres
>jsquy...@cisco.com
>For corporate legal information go to: 
>http://www.cisco.com/web/about/doing_business/legal/cri/
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/11/25723.php
>
>
>
>
>-- 
>
>Saliya Ekanayake
>
>Ph.D. Candidate | Research Assistant
>
>School of Informatics and Computing | Digital Science Center
>
>Indiana University, Bloomington
>Cell 812-391-4914
>http://saliya.org
>


Re: [OMPI users] OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-29 Thread Gilles Gouaillardet
Do you mean OMPI did not issue such a warning?

If there was no such warning, could you please detail:
- how much RAM is on your system
- the value returned by ulimit -l
- the OFED stack you are running
- the kernel modules you loaded
- the model of your IB card

Cheers

Gilles
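
Besides running "ulimit -l" interactively, it can help to check the locked-memory
limit the MPI processes themselves inherit, since batch managers and ssh often
reset it. A small hedged sketch:

/* Print the RLIMIT_MEMLOCK soft/hard limits seen by this process; the
 * "registering part of your physical memory" warning usually means the soft
 * limit is far below the node's RAM instead of unlimited. */
#include <sys/resource.h>
#include <stdio.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("RLIMIT_MEMLOCK soft=%llu hard=%llu (RLIM_INFINITY=%llu)\n",
           (unsigned long long) rl.rlim_cur,
           (unsigned long long) rl.rlim_max,
           (unsigned long long) RLIM_INFINITY);
    return 0;
}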

Saliya Ekanayake wrote:
>I meant it works now, sorry for the confusion.
>
>
>Running the test revealed a warning on memory registration, which we fixed by 
>setting unlimited in ulimit -l. Then running OMPI sample worked too.
>
>
>Thank you,
>
>saliya
>
>
>
>
>On Sun, Dec 28, 2014 at 11:18 PM, Ralph Castain  wrote:
>
>So you are saying the test worked, but you are still encountering an error 
>when executing an MPI job? Or are you saying things now work?
>
>
>
>On Dec 28, 2014, at 5:58 PM, Saliya Ekanayake  wrote:
>
>
>Thank you Ralph. This produced the warning on memory limits similar to [1] and 
>setting ulimit -l unlimited worked.
>
>
>[1] http://lists.openfabrics.org/pipermail/general/2007-June/036941.html
>
>
>Saliya
>
>
>On Sun, Dec 28, 2014 at 5:57 PM, Ralph Castain  wrote:
>
>Have the admin try running the ibv_ud_pingpong test - that will exercise the 
>portion of the system under discussion.
>
>
>
>On Dec 28, 2014, at 2:31 PM, Saliya Ekanayake  wrote:
>
>
>What I heard from the administrator is that, 
>
>"The tests that work are the simple utilities ib_read_lat and ib_read_bw
>that measures latency and bandwith between two nodes. They are part of
>the "perftest" repo package."
>
>On Dec 28, 2014 10:20 AM, "Saliya Ekanayake"  wrote:
>
>This happens at MPI_Init. I've attached the full error message.
>
>
>The sys admin mentioned Infiniband utility tests ran OK. I'll contact him for 
>more details and let you know.
>
>
>Thank you,
>Saliya
>
>
>On Sun, Dec 28, 2014 at 3:18 AM, Gilles Gouaillardet 
> wrote:
>
>Where does the error occurs ?
>MPI_Init ?
>MPI_Finalize ?
>In between ?
>
>In the first case, the bug is likely a mishandled error case,
>which means OpenMPI is unlikely the root cause of the crash.
>
>Did you check infniband is up and running on your cluster ?
>
>Cheers,
>
>Gilles 
>
>Saliya Ekanayake wrote:
>
>It's been a while on this, but we are still having trouble getting OpenMPI to 
>work with Infiniband on this cluster. We tried with latest 1.8.4 as well, but 
>it's still the same.
>
>
>To recap, we get the following error when MPI initializes (in the simple Hello 
>world C example) with Infiniband. Everything works fine if we explicitly turn 
>off openib with --mca btl ^openib
>
>
>This is the error I got after debugging with gdb as you suggested.
>
>
>hello_c: connect/btl_openib_connect_udcm.c:736: udcm_module_finalize: 
>Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) 
>(&m->cm_recv_msg_queue))->obj_magic_id' failed.
>
>
>Thank you,
>
>Saliya
>
>
>On Mon, Nov 10, 2014 at 10:01 AM, Saliya Ekanayake  wrote:
>
>Thank you Jeff, I'll try this and  let you know. 
>
>Saliya 
>
>On Nov 10, 2014 6:42 AM, "Jeff Squyres (jsquyres)"  wrote:
>
>I am sorry for the delay; I've been caught up in SC deadlines.  :-(
>
>I don't see anything blatantly wrong in this output.
>
>Two things:
>
>1. Can you try a nightly v1.8.4 snapshot tarball?  This will check to see if 
>whatever the bug is has been fixed for the upcoming release:
>
>    http://www.open-mpi.org/nightly/v1.8/
>
>2. Build Open MPI with the --enable-debug option (note that this adds a 
>slight-but-noticeable performance penalty).  When you run, it should dump a 
>core file.  Load that core file in a debugger and see where it is failing 
>(i.e., file and line in the OMPI source).
>
>We don't usually have to resort to asking users to perform #2, but there's no 
>additional information to give a clue as to what is happening.  :-(
>
>
>
>On Nov 9, 2014, at 11:43 AM, Saliya Ekanayake  wrote:
>
>> Hi Jeff,
>>
>> You are probably busy, but just checking if you had a chance to look at this.
>>
>> Thanks,
>> Saliya
>>
>> On Thu, Nov 6, 2014 at 9:19 AM, Saliya Ekanayake  wrote:
>> Hi Jeff,
>>
>> I've attached a tar file with information.
>>
>> Thank you,
>> Saliya
>>
>> On Tue, Nov 4, 2014 at 4:18 PM, Jeff Squyres (jsquyres)  
>> wrote:
>> Looks like it's failing in the openib BTL setup.
>>
>> Can you send the info listed here?
>>
>>     http://www.open-mpi.org/community/help/
>>
>>
>>
>> On Nov 4

Re: [OMPI users] OMPI users] Icreasing OFED registerable memory

2014-12-30 Thread Gilles Gouaillardet
FWIW, OMPI does not yet support XRC with OFED 3.12.

Cheers,

Gilles

Deva wrote:
>Hi Waleed,
>
>
>It is highly recommended to upgrade to latest OFED.  Meanwhile, Can you try 
>latest OMPI release (v1.8.4), where this warning is ignored on older OFEDs
>
>
>-Devendar 
>
>
>On Sun, Dec 28, 2014 at 6:03 AM, Waleed Lotfy  wrote:
>
>I have a bunch of 8 GB memory nodes in a cluster who were lately
>upgraded to 16 GB. When I run any jobs I get the following warning:
>--
>WARNING: It appears that your OpenFabrics subsystem is configured to
>only
>allow registering part of your physical memory.  This can cause MPI jobs
>to
>run with erratic performance, hang, and/or crash.
>
>This may be caused by your OpenFabrics vendor limiting the amount of
>physical memory that can be registered.  You should investigate the
>relevant Linux kernel module parameters that control how much physical
>memory can be registered, and increase them to allow registering all
>physical memory on your machine.
>
>See this Open MPI FAQ item for more information on these Linux kernel
>module
>parameters:
>
>    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
>  Local host:              comp022.local
>  Registerable memory:     8192 MiB
>  Total memory:            16036 MiB
>
>Your MPI job will continue, but may be behave poorly and/or hang.
>--
>
>Searching for a fix to this issue, I found that I have to set
>log_num_mtt within the kernel module, so I added this line to
>modprobe.conf:
>
>options mlx4_core log_num_mtt=21
>
>But then ib0 interface fails to start showing this error:
>ib_ipoib device ib0 does not seem to be present, delaying
>initialization.
>
>Reducing the value of log_num_mtt to 20, allows ib0 to start but shows
>the registerable memory of 8 GB warning.
>
>I am using OFED 1.3.1, I know it is pretty old and we are planning to
>upgrade soon.
>
>Output on all nodes for 'ompi_info  -v ompi full --parsable':
>
>ompi:version:full:1.2.7
>ompi:version:svn:r19401
>orte:version:full:1.2.7
>orte:version:svn:r19401
>opal:version:full:1.2.7
>opal:version:svn:r19401
>
>Any help would be appreciated.
>
>Waleed Lotfy
>Bibliotheca Alexandrina
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2014/12/26076.php
>
>
>
>
>-- 
>
>
>
>-Devendar
>
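
For reference, the FAQ page cited in the warning gives the registerable-memory
formula max_reg_mem = (2^log_num_mtt) * (2^log_mtts_per_seg) * page_size. The
sketch below just evaluates it for one plausible combination; the 4 KiB page
size and the log_mtts_per_seg value of 3 are assumptions, so check your own
module parameters.

/* Evaluate the registered-memory formula from the Open MPI FAQ for a given
 * set of mlx4_core parameters.  With the assumed values this yields 32 GiB,
 * comfortably above the 16 GiB of RAM in the upgraded nodes. */
#include <stdio.h>

int main(void)
{
    const unsigned long long page_size        = 4096ULL; /* assumed 4 KiB pages   */
    const unsigned long long log_num_mtt      = 20;      /* value from the thread */
    const unsigned long long log_mtts_per_seg = 3;       /* assumed driver default */

    unsigned long long bytes =
        (1ULL << log_num_mtt) * (1ULL << log_mtts_per_seg) * page_size;
    printf("max registerable memory = %llu GiB\n", bytes >> 30);
    return 0;
}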


Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-02 Thread Gilles Gouaillardet
Diego,

First, I recommend you redefine tParticle and add a padding integer so
everything is aligned.

Before invoking MPI_Type_create_struct, you need to
call MPI_Get_address(dummy, base, MPI%err)
displacements = displacements - base

MPI_Type_create_resized might be unnecessary if tParticle is aligned,
and the lower bound should be zero.

BTW, which compiler are you using?
Is the tParticle object a common?
IIRC, the Intel compiler aligns types automatically, but not commons, and that
means the type built by MPI_Type_create_struct is not aligned as it should be most of the time.

Cheers,

Gilles 
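
In C, the full recipe (take the member addresses, subtract the base, then resize
the extent using two consecutive array elements) looks roughly like the sketch
below. It mirrors the tParticle layout from this thread but is not Diego's code,
and it uses plain int/double rather than the Fortran kinds discussed here.

/* Build an MPI datatype for  struct { int ip; double rp[2]; double qq[2]; }
 * using address differences, so any padding the compiler inserts is honoured. */
#include <mpi.h>

typedef struct { int ip; double rp[2]; double qq[2]; } particle_t;

MPI_Datatype make_particle_type(void)
{
    particle_t   dummy[2];            /* two elements to measure the true extent */
    MPI_Datatype tmp, ptype;
    int          lengths[3] = { 1, 2, 2 };
    MPI_Datatype types[3]   = { MPI_INT, MPI_DOUBLE, MPI_DOUBLE };
    MPI_Aint     disp[3], base, extent;
    int          i;

    MPI_Get_address(&dummy[0],       &base);
    MPI_Get_address(&dummy[0].ip,    &disp[0]);
    MPI_Get_address(&dummy[0].rp[0], &disp[1]);
    MPI_Get_address(&dummy[0].qq[0], &disp[2]);
    for (i = 0; i < 3; i++) {
        disp[i] -= base;              /* displacements relative to the struct start */
    }

    MPI_Get_address(&dummy[1], &extent);
    extent -= base;                   /* distance between consecutive array elements */

    MPI_Type_create_struct(3, lengths, disp, types, &tmp);
    MPI_Type_create_resized(tmp, 0, extent, &ptype);
    MPI_Type_commit(&ptype);
    MPI_Type_free(&tmp);
    return ptype;
}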

Diego Avesani wrote:
>dear all,
>
>
>I have a problem with MPI_Type_Create_Struct and MPI_TYPE_CREATE_RESIZED.
>
>
>I have this variable type:
>
>
>  TYPE tParticle
>
>     INTEGER  :: ip
>
>     REAL     :: RP(2)
>
>     REAL     :: QQ(2)
>
>  ENDTYPE tParticle
>
>
>Then I define:
>
>
>Nstruct=3
>
>ALLOCATE(TYPES(Nstruct))
>
>ALLOCATE(LENGTHS(Nstruct))
>
>ALLOCATE(DISPLACEMENTS(Nstruct))
>
>!set the types
>
>TYPES(1) = MPI_INTEGER
>
>TYPES(2) = MPI_DOUBLE_PRECISION
>
>TYPES(3) = MPI_DOUBLE_PRECISION
>
>!set the lengths
>
>LENGTHS(1) = 1
>
>LENGTHS(2) = 2
>
>LENGTHS(3) = 2
>
>
>As gently suggested by Nick Papior Andersen and George Bosilca some months 
>ago, I checked the variable adress to resize my struct variable to avoid empty 
>space and
>
>to have a more general definition.
>
>
> !
>
> CALL MPI_GET_ADDRESS(dummy%ip,    DISPLACEMENTS(1), MPI%iErr)
>
> CALL MPI_GET_ADDRESS(dummy%RP(1), DISPLACEMENTS(2), MPI%iErr)
>
> CALL MPI_GET_ADDRESS(dummy%QQ(1), DISPLACEMENTS(3), MPI%iErr)
>
> !
>
> CALL 
>MPI_Type_Create_Struct(Nstruct,LENGTHS,DISPLACEMENTS,TYPES,MPI_PARTICLE_TYPE_OLD,MPI%iErr)
>
> CALL MPI_Type_Commit(MPI_PARTICLE_TYPE_OLD,MPI%iErr)
>
> !
>
> CALL MPI_TYPE_CREATE_RESIZED(MPI_PARTICLE_TYPE_OLD, 
>DISPLACEMENTS(1),DISPLACEMENTS(2) - DISPLACEMENTS(1), MPI_PARTICLE_TYPE)
>
>
>
>This does not work. When my program run, I get an error:
>
>
>forrtl: severe (174): SIGSEGV, segmentation fault occurred.
>
>
>I have read the manual but probably I am not able to understand  
>MPI_TYPE_CREATE_RESIZED. 
>
>
>Someone could help me?
>
>                                                                               
>                                                              
>
>Thanks a lot
>
>Diego
>
>
>
>Diego
>


Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-02 Thread Gilles Gouaillardet
Diego,

George gave you the solution,

The snippet you posted has two mistakes:
- You did not remove mpi_get_address(dummy) from all displacements
  (see my previous reply).
- You pass incorrect values to mpi_type_create_resized.

Can you post a trimmed version of your program instead of a snippet?

Gus is right about using double precision vs real and -r8.

Cheers,

Gilles

Diego Avesani wrote:
>Dear Gilles Dear all,
>
>
>I have done all that to avoid to pedding an integer, as suggested by George.
>
>I define tParticle as a common object. 
>
>I am using Intel fortran compiler. 
>
>
>George suggests:
>
>
>"" The displacements are relative to the benign of your particle type. Thus 
>the first one is not 0 but the displacement of “integer :: ip” due to the fact 
>that the compiler is allowed to introduce gaps in order to better align.
>
>  DISPLACEMENTS(1)=MPI_GET_ADDRESS(dummy%ip)
>
>  DISPLACEMENTS(2)=MPI_GET_ADDRESS(dummy%RP[1])
>
>  DISPLACEMENTS(3)=MPI_GET_ADDRESS(dummy%QQ[1])
>
>and then remove the MPI_GET_ADDRESS(dummy) from all of them.
>
>
>3. After creating the structure type you need to resize it in order to 
>correctly determine the span of the entire structure, and how an array of such 
>structures lays in memory. Something like:
>
>MPI_TYPE_CREATE_RESIZED(old type, DISPLACEMENT(1),
>
>   MPI_GET_ADDRESS(dummy[2]) - MPI_GET_ADDRESS(dummy[1]), newt) ""
>
>
>What do you think?
>
>George, Did i miss something?
>
>
>Thanks a lot
>
>
>
>
>Diego
>
>
>On 2 January 2015 at 12:51, Gilles Gouaillardet 
> wrote:
>
>Diego,
>
>First, i recommend you redefine tParticle and add a padding integer so 
>everything is aligned.
>
>
>Before invoking MPI_Type_create_struct, you need to 
>call MPI_Get_address(dummy, base, MPI%err)
>displacements = displacements - base
>
>MPI_Type_create_resized might be unnecessary if tParticle is aligned 
>And the lower bound should be zero.
>
>BTW, which compiler are you using ?
>Is tParticle object a common ?
>iirc, intel compiler aligns types automatically, but not commons, and that 
>means MPI_Type_create_struct is not aligned as it should most of the time.
>
>Cheers,
>
>Gilles 
>
>Diego Avesani wrote:
>
>
>dear all,
>
>
>I have a problem with MPI_Type_Create_Struct and MPI_TYPE_CREATE_RESIZED.
>
>
>I have this variable type:
>
>
>  TYPE tParticle
>
>     INTEGER  :: ip
>
>     REAL     :: RP(2)
>
>     REAL     :: QQ(2)
>
>  ENDTYPE tParticle
>
>
>Then I define:
>
>
>Nstruct=3
>
>ALLOCATE(TYPES(Nstruct))
>
>ALLOCATE(LENGTHS(Nstruct))
>
>ALLOCATE(DISPLACEMENTS(Nstruct))
>
>!set the types
>
>TYPES(1) = MPI_INTEGER
>
>TYPES(2) = MPI_DOUBLE_PRECISION
>
>TYPES(3) = MPI_DOUBLE_PRECISION
>
>!set the lengths
>
>LENGTHS(1) = 1
>
>LENGTHS(2) = 2
>
>LENGTHS(3) = 2
>
>
>As gently suggested by Nick Papior Andersen and George Bosilca some months 
>ago, I checked the variable adress to resize my struct variable to avoid empty 
>space and
>
>to have a more general definition.
>
>
> !
>
> CALL MPI_GET_ADDRESS(dummy%ip,    DISPLACEMENTS(1), MPI%iErr)
>
> CALL MPI_GET_ADDRESS(dummy%RP(1), DISPLACEMENTS(2), MPI%iErr)
>
> CALL MPI_GET_ADDRESS(dummy%QQ(1), DISPLACEMENTS(3), MPI%iErr)
>
> !
>
> CALL 
>MPI_Type_Create_Struct(Nstruct,LENGTHS,DISPLACEMENTS,TYPES,MPI_PARTICLE_TYPE_OLD,MPI%iErr)
>
> CALL MPI_Type_Commit(MPI_PARTICLE_TYPE_OLD,MPI%iErr)
>
> !
>
> CALL MPI_TYPE_CREATE_RESIZED(MPI_PARTICLE_TYPE_OLD, 
>DISPLACEMENTS(1),DISPLACEMENTS(2) - DISPLACEMENTS(1), MPI_PARTICLE_TYPE)
>
>
>
>This does not work. When my program run, I get an error:
>
>
>forrtl: severe (174): SIGSEGV, segmentation fault occurred.
>
>
>I have read the manual but probably I am not able to understand  
>MPI_TYPE_CREATE_RESIZED. 
>
>
>Someone could help me?
>
>                                                                               
>                                                              
>
>Thanks a lot
>
>Diego
>
>
>
>Diego
>
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: 
>http://www.open-mpi.org/community/lists/users/2015/01/26092.php
>
>


Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-04 Thread Gilles Gouaillardet
Diego,

here is an updated revision I will double-check tomorrow
/* I did not test it yet, so forgive me if it does not compile/work */

Cheers,

Gilles

On Sun, Jan 4, 2015 at 6:48 PM, Diego Avesani 
wrote:

> Dear Gilles, Dear all,
>
> in the attachment you can find the program.
>
> What do you meam "remove mpi_get_address(dummy) from all displacements".
>
> Thanks for all your help
>
> Diego
>
>
>
> Diego
>
>
> On 3 January 2015 at 00:45, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> Diego,
>>
>> George gave you the solution,
>>
>> The snippet you posted has two mistakes
>> You did not remove mpi_get_address(dummy) from all displacements
>> (See my previous reply)
>> You pass incorrect values to mpi_type_create_resized
>>
>> Can you post a trimmed version of your program instead of a snippet ?
>>
>> Gus is right about using double precision vs real and -r8
>>
>> Cheers,
>>
>> Gilles
>>
>> Diego Avesani wrote:
>> Dear Gilles Dear all,
>>
>> I have done all that to avoid to pedding an integer, as suggested by
>> George.
>> I define tParticle as a common object.
>> I am using Intel fortran compiler.
>>
>> George suggests:
>>
>> *"" The displacements are relative to the benign of your particle type.
>> Thus the first one is not 0 but the displacement of “integer :: ip” due to
>> the fact that the compiler is allowed to introduce gaps in order to better
>> align.*
>>
>> *  DISPLACEMENTS(1)=MPI_GET_ADDRESS(dummy%ip)*
>> *  DISPLACEMENTS(2)=**MPI_GET_ADDRESS(dummy%RP[1])*
>>
>> *  DISPLACEMENTS(3)=**MPI_GET_ADDRESS(dummy%QQ[1])*
>>
>> *and then remove the MPI_GET_ADDRESS(dummy) from all of them.*
>>
>> *3. After creating the structure type you need to resize it in order to
>> correctly determine the span of the entire structure, and how an array of
>> such structures lays in memory. Something like:*
>> *MPI_TYPE_CREATE_RESIZED(old type, DISPLACEMENT(1),*
>> *   MPI_GET_ADDRESS(dummy[2]) - MPI_GET_ADDRESS(dummy[1]), newt) ""*
>>
>> What do you think?
>> George, Did i miss something?
>>
>> Thanks a lot
>>
>>
>>
>> Diego
>>
>>
>> On 2 January 2015 at 12:51, Gilles Gouaillardet <
>> gilles.gouaillar...@gmail.com> wrote:
>>
>>> Diego,
>>>
>>> First, i recommend you redefine tParticle and add a padding integer so
>>> everything is aligned.
>>>
>>>
>>> Before invoking MPI_Type_create_struct, you need to
>>> call MPI_Get_address(dummy, base, MPI%err)
>>> displacements = displacements - base
>>>
>>> MPI_Type_create_resized might be unnecessary if tParticle is aligned
>>> And the lower bound should be zero.
>>>
>>> BTW, which compiler are you using ?
>>> Is tParticle object a common ?
>>> iirc, intel compiler aligns types automatically, but not commons, and
>>> that means MPI_Type_create_struct is not aligned as it should most of the
>>> time.
>>>
>>> Cheers,
>>>
>>> Gilles
>>>
>>> Diego Avesani さんのメール:
>>>
>>> dear all,
>>>
>>> I have a problem with MPI_Type_Create_Struct and MPI_TYPE_CREATE_RESIZED.
>>>
>>> I have this variable type:
>>>
>>> *  TYPE tParticle*
>>> * INTEGER  :: ip*
>>> * REAL :: RP(2)*
>>> * REAL :: QQ(2)*
>>> *  ENDTYPE tParticle*
>>>
>>> Then I define:
>>>
>>> Nstruct=3
>>> *ALLOCATE(TYPES(Nstruct))*
>>> *ALLOCATE(LENGTHS(Nstruct))*
>>> *ALLOCATE(DISPLACEMENTS(Nstruct))*
>>> *!set the types*
>>> *TYPES(1) = MPI_INTEGER*
>>> *TYPES(2) = MPI_DOUBLE_PRECISION*
>>> *TYPES(3) = MPI_DOUBLE_PRECISION*
>>> *!set the lengths*
>>> *LENGTHS(1) = 1*
>>> *LENGTHS(2) = 2*
>>> *LENGTHS(3) = 2*
>>>
>>> As gently suggested by Nick Papior Andersen and George Bosilca some
>>> months ago, I checked the variable adress to resize my struct variable to
>>> avoid empty space and
>>> to have a more general definition.
>>>
>>> * !*
>>> * CALL MPI_GET_ADDRESS(dummy%ip,DISPLACEMENTS(1), MPI%iErr)*
>>> * CALL MPI_GET_ADDRESS(dummy%RP(1), DISPLACEMENTS(2), MPI%iErr)*
>>> * CALL MPI_GET_ADDRESS(dummy%QQ(1), DISPLACEMENTS(3), MPI%iErr)*
>>> * !*
>>> * CALL
>>> MPI_Type_Create_Struct(

Re: [OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-04 Thread Gilles Gouaillardet
Diego,

MPI_Get_address was invoked with its parameters in the wrong order.

Attached is a fixed version.

Cheers,

Gilles

On 2015/01/05 2:32, Diego Avesani wrote:
> Dear Gilles, Dear all,
>
> It works. The only thing that is missed is:
>
> *CALL MPI_Finalize(MPI%iErr)*
>
> at the end of the program.
>
> Now, I have to test it sending some data from a processor to another.
> I would like to ask you if you could explain me what you have done.
> I wrote in the program:
>
> *   IF(MPI%myrank==1)THEN*
> *  WRITE(*,*) DISPLACEMENTS*
> *   ENDIF*
>
> and the results is:
>
>*139835891001320  -139835852218120  -139835852213832*
> *  -139835852195016   8030673735967299609*
>
> I am not able to understand it.
>
> Thanks a lot.
>
> In the attachment you can find the program
>
>
>
>
>
>
>
>
> Diego
>
>
> On 4 January 2015 at 12:10, Gilles Gouaillardet <
> gilles.gouaillar...@gmail.com> wrote:
>
>> Diego,
>>
>> here is an updated revision i will double check tomorrow
>> /* i dit not test it yet, so forgive me it it does not compile/work */
>>
>> Cheers,
>>
>> Gilles
>>
>> On Sun, Jan 4, 2015 at 6:48 PM, Diego Avesani 
>> wrote:
>>
>>> Dear Gilles, Dear all,
>>>
>>> in the attachment you can find the program.
>>>
>>> What do you meam "remove mpi_get_address(dummy) from all displacements".
>>>
>>> Thanks for all your help
>>>
>>> Diego
>>>
>>>
>>>
>>> Diego
>>>
>>>
>>> On 3 January 2015 at 00:45, Gilles Gouaillardet <
>>> gilles.gouaillar...@gmail.com> wrote:
>>>
>>>> Diego,
>>>>
>>>> George gave you the solution,
>>>>
>>>> The snippet you posted has two mistakes
>>>> You did not remove mpi_get_address(dummy) from all displacements
>>>> (See my previous reply)
>>>> You pass incorrect values to mpi_type_create_resized
>>>>
>>>> Can you post a trimmed version of your program instead of a snippet ?
>>>>
>>>> Gus is right about using double precision vs real and -r8
>>>>
>>>> Cheers,
>>>>
>>>> Gilles
>>>>
>>>> Diego Avesani wrote:
>>>> Dear Gilles Dear all,
>>>>
>>>> I have done all that to avoid to pedding an integer, as suggested by
>>>> George.
>>>> I define tParticle as a common object.
>>>> I am using Intel fortran compiler.
>>>>
>>>> George suggests:
>>>>
>>>> *"" The displacements are relative to the benign of your particle type.
>>>> Thus the first one is not 0 but the displacement of "integer :: ip" due to
>>>> the fact that the compiler is allowed to introduce gaps in order to better
>>>> align.*
>>>>
>>>> *  DISPLACEMENTS(1)=MPI_GET_ADDRESS(dummy%ip)*
>>>> *  DISPLACEMENTS(2)=**MPI_GET_ADDRESS(dummy%RP[1])*
>>>>
>>>> *  DISPLACEMENTS(3)=**MPI_GET_ADDRESS(dummy%QQ[1])*
>>>>
>>>> *and then remove the MPI_GET_ADDRESS(dummy) from all of them.*
>>>>
>>>> *3. After creating the structure type you need to resize it in order to
>>>> correctly determine the span of the entire structure, and how an array of
>>>> such structures lays in memory. Something like:*
>>>> *MPI_TYPE_CREATE_RESIZED(old type, DISPLACEMENT(1),*
>>>> *   MPI_GET_ADDRESS(dummy[2]) - MPI_GET_ADDRESS(dummy[1]), newt) ""*
>>>>
>>>> What do you think?
>>>> George, Did i miss something?
>>>>
>>>> Thanks a lot
>>>>
>>>>
>>>>
>>>> Diego
>>>>
>>>>
>>>> On 2 January 2015 at 12:51, Gilles Gouaillardet <
>>>> gilles.gouaillar...@gmail.com> wrote:
>>>>
>>>>> Diego,
>>>>>
>>>>> First, i recommend you redefine tParticle and add a padding integer so
>>>>> everything is aligned.
>>>>>
>>>>>
>>>>> Before invoking MPI_Type_create_struct, you need to
>>>>> call MPI_Get_address(dummy, base, MPI%err)
>>>>> displacements = displacements - base
>>>>>
>>>>> MPI_Type_create_resized might be unnecessary if tParticle is aligned
>>>>> And the lower bound should be zero.
>>&

Re: [OMPI users] OMPI users] OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-05 Thread Gilles Gouaillardet
Diego,

The compiler likely added some padding after %ip to have data aligned on 128 
bits.

You need two dummies in case the compiler adds some padding at the end of the 
type.

Cheers,

Gilles
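
A quick way to see the padding being talked about is to print the member offsets
and the total size of the structure. The C sketch below is only an analogy for
the Fortran type (a Fortran compiler may pad differently), but it shows where the
gaps that the address-difference trick accounts for come from.

/* Show where the compiler actually places each member and how large the whole
 * struct is; any gap after ip and any tail padding are exactly what the
 * MPI_Get_address / MPI_Type_create_resized recipe above takes care of. */
#include <stddef.h>
#include <stdio.h>

typedef struct { int ip; double rp[2]; double qq[2]; } particle_t;

int main(void)
{
    printf("offset of ip = %zu\n", offsetof(particle_t, ip));
    printf("offset of rp = %zu\n", offsetof(particle_t, rp));
    printf("offset of qq = %zu\n", offsetof(particle_t, qq));
    printf("sizeof       = %zu\n", sizeof(particle_t));
    return 0;
}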

Diego Avesani wrote:
>Dear Gilles, Dear all,
>
>thanks, thanks a lot.
>
>
>Could you explain it to me, please? 
>
>
>I mean, when I print displacements I get:
>
>
>displacements(0)= 6922656
>
>displacements(1)= 0             
>
>displacements(2)= 16
>
>displacements(3)= 48
>
>displacements(4)= 112
>
> 
>
>Why do I have 16 spaces in displacements(2), I have only an integer in 
>dummy%ip?
>
>Why do you use dummy(1) and dummy(2)?
>
>
>Thanks a lot    
>
>
>
>Diego
>
>
>On 5 January 2015 at 02:44, Gilles Gouaillardet 
> wrote:
>
>Diego,
>
>MPI_Get_address was invoked with parameters in the wrong order
>
>here is attached a fixed version
>
>Cheers,
>
>Gilles
>
>On 2015/01/05 2:32, Diego Avesani wrote:
>
>Dear Gilles, Dear all, It works. The only thing that is missed is: *CALL 
>MPI_Finalize(MPI%iErr)* at the end of the program. Now, I have to test it 
>sending some data from a processor to another. I would like to ask you if you 
>could explain me what you have done. I wrote in the program: * 
>IF(MPI%myrank==1)THEN* * WRITE(*,*) DISPLACEMENTS* * ENDIF* and the results 
>is: *139835891001320 -139835852218120 -139835852213832* * -139835852195016 
>8030673735967299609* I am not able to understand it. Thanks a lot. In the 
>attachment you can find the program Diego On 4 January 2015 at 12:10, Gilles 
>Gouaillardet < gilles.gouaillar...@gmail.com> wrote: 
>
>Diego, here is an updated revision i will double check tomorrow /* i dit not 
>test it yet, so forgive me it it does not compile/work */ Cheers, Gilles On 
>Sun, Jan 4, 2015 at 6:48 PM, Diego Avesani  wrote: 
>
>Dear Gilles, Dear all, in the attachment you can find the program. What do you 
>meam "remove mpi_get_address(dummy) from all displacements". Thanks for all 
>your help Diego Diego On 3 January 2015 at 00:45, Gilles Gouaillardet < 
>gilles.gouaillar...@gmail.com> wrote: 
>
>Diego, George gave you the solution, The snippet you posted has two mistakes 
>You did not remove mpi_get_address(dummy) from all displacements (See my 
>previous reply) You pass incorrect values to mpi_type_create_resized Can you 
>post a trimmed version of your program instead of a snippet ? Gus is right 
>about using double precision vs real and -r8 Cheers, Gilles Diego Avesani 
>wrote:
>Dear Gilles Dear all, I have done all that to avoid to pedding an integer, 
>as suggested by George. I define tParticle as a common object. I am using 
>Intel fortran compiler. George suggests: *"" The displacements are relative to 
>the benign of your particle type. Thus the first one is not 0 but the 
>displacement of “integer :: ip” due to the fact that the compiler is allowed 
>to introduce gaps in order to better align.* * 
>DISPLACEMENTS(1)=MPI_GET_ADDRESS(dummy%ip)* * 
>DISPLACEMENTS(2)=**MPI_GET_ADDRESS(dummy%RP[1])* * 
>DISPLACEMENTS(3)=**MPI_GET_ADDRESS(dummy%QQ[1])* *and then remove the 
>MPI_GET_ADDRESS(dummy) from all of them.* *3. After creating the structure 
>type you need to resize it in order to correctly determine the span of the 
>entire structure, and how an array of such structures lays in memory. 
>Something like:* *MPI_TYPE_CREATE_RESIZED(old type, DISPLACEMENT(1),* * 
>MPI_GET_ADDRESS(dummy[2]) - MPI_GET_ADDRESS(dummy[1]), newt) ""* What do you 
>think? George, Did i miss something? Thanks a lot Diego On 2 January 2015 at 
>12:51, Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote: 
>
>Diego, First, i recommend you redefine tParticle and add a padding integer so 
>everything is aligned. Before invoking MPI_Type_create_struct, you need to 
>call MPI_Get_address(dummy, base, MPI%err) displacements = displacements - 
>base MPI_Type_create_resized might be unnecessary if tParticle is aligned And 
>the lower bound should be zero. BTW, which compiler are you using ? Is 
>tParticle object a common ? iirc, intel compiler aligns types automatically, 
>but not commons, and that means MPI_Type_create_struct is not aligned as it 
>should most of the time. Cheers, Gilles Diego Avesani 
>wrote:
>dear all, I have a problem with MPI_Type_Create_Struct and 
>MPI_TYPE_CREATE_RESIZED. I have this variable type: * TYPE tParticle* * 
>INTEGER :: ip* * REAL :: RP(2)* * REAL :: QQ(2)* * ENDTYPE tParticle* Then I 
>define: Nstruct=3 *ALLOCATE(TYPES(Nstruct))* *ALLOCATE(LENGTHS(Nstruct))* 
>*ALLOCATE(DISPLACEMENTS(Nstruct))* *!set the types* *TYPES(1) = MPI_INTEGER* 
>*TYPES(2) = MPI_DOUBLE_PRECISION* *TYPES(3
