[OMPI users] Ubuntu and MPI

2015-11-19 Thread dave
Hello -  I have an Ubuntu 12.04 distro running on a 32-bit platform. I 
installed http://www.open-mpi.org/software/ompi/v1.10/downloads/openm . 
I have hello_c.c in the examples subdirectory, and I installed a C compiler.


When I run mpicc hello_c.c, the screen shows:

dave@ubuntu-desk:~/Desktop/openmpi-1.10.1$ mpicc hello_c.c
The program 'mpicc' can be found in the following packages:
 * lam4-dev
 * libmpich-mpd1.0-dev
 * libmpich-shmem1.0-dev
 * libmpich1.0-dev
 * libmpich2-dev
 * libopenmpi-dev
 * libopenmpi1.5-dev
Try: sudo apt-get install 
dave@ubuntu-desk:~/Desktop/openmpi-1.10.1$

This code helloworld.c works:

/* Hello World C Program */

#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");

    return 0;
}



I am at a stop point and was hoping for some assistance from the group. What 
info/log file can I send that will help?


Newbie here
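For what it's worth, a minimal sketch of building the downloaded tarball so that mpicc ends up on the PATH (the install prefix below is just an assumption; adjust to taste):

  # build and install Open MPI from the unpacked tarball
  cd ~/Desktop/openmpi-1.10.1
  ./configure --prefix=$HOME/openmpi-install
  make all install

  # make the freshly built wrappers visible to this shell
  export PATH=$HOME/openmpi-install/bin:$PATH
  export LD_LIBRARY_PATH=$HOME/openmpi-install/lib:$LD_LIBRARY_PATH

  # compile and run the example
  mpicc examples/hello_c.c -o hello_c
  mpirun -np 2 ./hello_c

The apt-get hint in the output above just means Ubuntu could not find an mpicc anywhere on the PATH.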


Re: [OMPI users] [OMPI devel] Slides from the Open MPI SC'15 State of the Union BOF

2015-11-19 Thread Jeff Squyres (jsquyres)
It appears that the PDF that was originally posted was corrupted.  Doh!

The file has been fixed -- you should be able to download and open it correctly 
now:

http://www.open-mpi.org/papers/sc-2015/

Sorry about that, folks!


> On Nov 19, 2015, at 9:03 AM, Jeff Squyres (jsquyres)  
> wrote:
> 
> Thanks to the over 100 people who came to the Open MPI State of the Union BOF 
> yesterday.  George Bosilca from U. Tennessee, Nathan Hjelm from Los Alamos 
> National Lab, and I presented where we are with Open MPI development, and 
> where we're going.
> 
> If you weren't able to join us, feel free to read through the slides:
> 
>http://www.open-mpi.org/papers/sc-2015/
> 
> Thank you!
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/11/18374.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
On Thu, Nov 19, 2015 at 4:11 PM, Howard Pritchard 
wrote:

> Hi Jeff H.
>
> Why don't you just try configuring with
>
> ./configure --prefix=my_favorite_install_dir
> --with-libfabric=install_dir_for_libfabric
> make -j 8 install
>
> and see what happens?
>
>
That was the first thing I tried.  However, it seemed to give me a
Verbs-oriented build, and Verbs is the Sith lord to us JedOFIs :-)

From the aforementioned wiki:

../configure \
 --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
 --disable-shared \
 --prefix=$HOME/MPI/install-ompi-ofi-gcc-gni-cori

Unfortunately, this (above) leads to an mpicc that indicates support for IB
Verbs, not OFI.
I will try again, though, just in case.
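A quick way to double-check what actually got built (a sketch; ompi_info ships with Open MPI, and the grep patterns are just guesses at the component names):

  # list the MTL/BTL components present in the install
  ompi_info | grep -i ' mtl'
  ompi_info | grep -i ' btl'
  # the ofi mtl should be listed if the libfabric support was really built in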


> Make sure before you configure that you have PrgEnv-gnu or PrgEnv-intel
> module loaded.
>
>
Yeah, I know better than to use the Cray compilers for such things (e.g.
https://github.com/jeffhammond/OpenPA/commit/965ca014ea3148ee5349e16d2cec1024271a7415
)


> Those were the configure/compiler options I used to do testing of ofi mtl
> on cori.
>
> Jeff S. - this thread has gotten intermingled with mpich setup as well,
> hence
> the suggestion for the mpich shm mechanism.
>
>
The first OSS implementation of MPI that I can use on Cray XC using OFI
gets a prize at the December MPI Forum.

Best,

Jeff



> Howard
>
>
>
> 2015-11-19 16:59 GMT-07:00 Jeff Hammond :
>
>>
>>> How did you configure for Cori?  You need to be using the slurm plm
>>> component for that system.  I know this sounds like gibberish.
>>>
>>>
>> ../configure --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
>>  --enable-mca-static=mtl-ofi \
>>  --enable-mca-no-build=btl-openib,btl-vader,btl-ugni,btl-tcp \
>>  --enable-static --disable-shared --disable-dlopen \
>>  --prefix=$HOME/MPI/install-ompi-ofi-gcc-gni-xpmem-cori \
>>  --with-cray-pmi --with-alps --with-cray-xpmem --with-slurm \
>>  --without-verbs --without-fca --without-mxm --without-ucx \
>>  --without-portals4 --without-psm --without-psm2 \
>>  --without-udreg --without-ugni --without-munge \
>>  --without-sge --without-loadleveler --without-tm --without-lsf \
>>  --without-pvfs2 --without-plfs \
>>  --without-cuda --disable-oshmem \
>>  --disable-mpi-fortran --disable-oshmem-fortran \
>>  LDFLAGS="-L/opt/cray/ugni/default/lib64 -lugni \
>>   -L/opt/cray/alps/default/lib64 -lalps -lalpslli -lalpsutil \   
>>-ldl -lrt"
>>
>>
>> This is copied from
>> https://github.com/jeffhammond/HPCInfo/blob/master/ofi/README.md#open-mpi,
>> which I note in case you want to see what changes I've made at any point in
>> the future.
>>
>>
>>> There should be a with-slurm configure option to pick up this component.
>>>
>>> Indeed there is.
>>
>>
>>> Doesn't mpich have the option to use sysv memory?  You may want to try
>>> that
>>>
>>>
>> MPICH?  Look, I may have earned my way onto Santa's naughty list more
>> than a few times, but at least I have the decency not to post MPICH
>> questions to the Open-MPI list ;-)
>>
>> If there is a way to tell Open-MPI to use shm_open without filesystem
>> backing (if that is even possible) at configure time, I'd love to do that.
>>
>>
>>> Oh for tuning params you can use env variables.  For example lets say
>>> rather than using the gni provider in ofi mtl you want to try sockets. Then
>>> do
>>>
>>> Export OMPI_MCA_mtl_ofi_provider_include=sockets
>>>
>>>
>> Thanks.  I'm glad that there is an option to set them this way.
>>
>>
>>> In the spirit OMPI - may the force be with you.
>>>
>>>
>> All I will say here is that Open-MPI has a Vader BTL :-)
>>
>>>
>>> > On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
>>> > > I have no idea what this is trying to tell me. Help?
>>> > >
>>> > > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
>>> > > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
>>> > > ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418
>>> > >
>>> > > I can run the same job with srun without incident:
>>> > >
>>> > > jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
>>> > > MPI was initialized.
>>> > >
>>> > > This is on the NERSC Cori Cray XC40 system. I build Open-MPI git
>>> head from
>>> > > source for OFI libfabric.
>>> > >
>>> > > I have many other issues, which I will report later. As a spoiler,
>>> if I
>>> > > cannot use your mpirun, I cannot set any of the MCA options there. Is
>>> > > there a method to set MCA options with environment variables? I
>>> could not
>>> > > find this documented anywhere.
>>> > >
>>> > > In particular, is there a way to cause shm to not use the global
>>> > > filesystem? I see this issue comes up a lot and I read the list
>>> archives,
>>> > > but the warning message (
>>> > >
>>> 

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard Pritchard
Hi Jeff,

I finally got an allocation on Cori - it's one busy machine.

Anyway, using the ompi I'd built on Edison with the above recommended
configure options,
I was able to run using either srun or mpirun on Cori, provided that in the
latter case I used

mpirun -np X -N Y --mca plm slurm ./my_favorite_app

I will make an adjustment to the alps plm launcher so that it disqualifies itself
if the wlm_detect facility on the Cray reports that srun is the launcher.
That's a minor fix and should make it into v2.x in a week or so.  It will be a
runtime selection, so you only have to build ompi once for use on either
Edison or Cori.
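Putting the pieces together, the working recipe looks roughly like this (a sketch; the prefixes and paths are just the ones used earlier in this thread):

  # configure against slurm and libfabric, then build
  ./configure --prefix=$HOME/MPI/install-ompi-ofi-gcc-gni-cori \
              --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
              --with-slurm
  make -j 8 install

  # launch with the slurm plm forced until the alps plm fix lands
  mpirun -np 2 --mca plm slurm ./driver.x 64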

Howard


2015-11-19 17:11 GMT-07:00 Howard Pritchard :

> Hi Jeff H.
>
> Why don't you just try configuring with
>
> ./configure --prefix=my_favorite_install_dir
> --with-libfabric=install_dir_for_libfabric
> make -j 8 install
>
> and see what happens?
>
> Make sure before you configure that you have PrgEnv-gnu or PrgEnv-intel
> module loaded.
>
> Those were the configure/compiler options I used to do testing of ofi mtl
> on cori.
>
> Jeff S. - this thread has gotten intermingled with mpich setup as well,
> hence
> the suggestion for the mpich shm mechanism.
>
>
> Howard
>
>
>
> 2015-11-19 16:59 GMT-07:00 Jeff Hammond :
>
>>
>>> How did you configure for Cori?  You need to be using the slurm plm
>>> component for that system.  I know this sounds like gibberish.
>>>
>>>
>> ../configure --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
>>  --enable-mca-static=mtl-ofi \
>>  --enable-mca-no-build=btl-openib,btl-vader,btl-ugni,btl-tcp \
>>  --enable-static --disable-shared --disable-dlopen \
>>  --prefix=$HOME/MPI/install-ompi-ofi-gcc-gni-xpmem-cori \
>>  --with-cray-pmi --with-alps --with-cray-xpmem --with-slurm \
>>  --without-verbs --without-fca --without-mxm --without-ucx \
>>  --without-portals4 --without-psm --without-psm2 \
>>  --without-udreg --without-ugni --without-munge \
>>  --without-sge --without-loadleveler --without-tm --without-lsf \
>>  --without-pvfs2 --without-plfs \
>>  --without-cuda --disable-oshmem \
>>  --disable-mpi-fortran --disable-oshmem-fortran \
>>  LDFLAGS="-L/opt/cray/ugni/default/lib64 -lugni \
>>   -L/opt/cray/alps/default/lib64 -lalps -lalpslli -lalpsutil \   
>>-ldl -lrt"
>>
>>
>> This is copied from
>> https://github.com/jeffhammond/HPCInfo/blob/master/ofi/README.md#open-mpi,
>> which I note in case you want to see what changes I've made at any point in
>> the future.
>>
>>
>>> There should be a with-slurm configure option to pick up this component.
>>>
>>> Indeed there is.
>>
>>
>>> Doesn't mpich have the option to use sysv memory?  You may want to try
>>> that
>>>
>>>
>> MPICH?  Look, I may have earned my way onto Santa's naughty list more
>> than a few times, but at least I have the decency not to post MPICH
>> questions to the Open-MPI list ;-)
>>
>> If there is a way to tell Open-MPI to use shm_open without filesystem
>> backing (if that is even possible) at configure time, I'd love to do that.
>>
>>
>>> Oh for tuning params you can use env variables.  For example lets say
>>> rather than using the gni provider in ofi mtl you want to try sockets. Then
>>> do
>>>
>>> Export OMPI_MCA_mtl_ofi_provider_include=sockets
>>>
>>>
>> Thanks.  I'm glad that there is an option to set them this way.
>>
>>
>>> In the spirit OMPI - may the force be with you.
>>>
>>>
>> All I will say here is that Open-MPI has a Vader BTL :-)
>>
>>>
>>> > On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
>>> > > I have no idea what this is trying to tell me. Help?
>>> > >
>>> > > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
>>> > > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
>>> > > ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418
>>> > >
>>> > > I can run the same job with srun without incident:
>>> > >
>>> > > jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
>>> > > MPI was initialized.
>>> > >
>>> > > This is on the NERSC Cori Cray XC40 system. I build Open-MPI git
>>> head from
>>> > > source for OFI libfabric.
>>> > >
>>> > > I have many other issues, which I will report later. As a spoiler,
>>> if I
>>> > > cannot use your mpirun, I cannot set any of the MCA options there. Is
>>> > > there a method to set MCA options with environment variables? I
>>> could not
>>> > > find this documented anywhere.
>>> > >
>>> > > In particular, is there a way to cause shm to not use the global
>>> > > filesystem? I see this issue comes up a lot and I read the list
>>> archives,
>>> > > but the warning message (
>>> > >
>>> https://github.com/hpc/cce-mpi-openmpi-1.6.4/blob/master/ompi/mca/common/sm/
>>> > > help-mpi-common-sm.txt) suggested that I could override it by
>>> setting TMP,
>>> 

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
>
>
> How did you configure for Cori?  You need to be using the slurm plm
> component for that system.  I know this sounds like gibberish.
>
>
../configure --with-libfabric=$HOME/OFI/install-ofi-gcc-gni-cori \
 --enable-mca-static=mtl-ofi \
 --enable-mca-no-build=btl-openib,btl-vader,btl-ugni,btl-tcp \
 --enable-static --disable-shared --disable-dlopen \
 --prefix=$HOME/MPI/install-ompi-ofi-gcc-gni-xpmem-cori \
 --with-cray-pmi --with-alps --with-cray-xpmem --with-slurm \
 --without-verbs --without-fca --without-mxm --without-ucx \
 --without-portals4 --without-psm --without-psm2 \
 --without-udreg --without-ugni --without-munge \
 --without-sge --without-loadleveler --without-tm --without-lsf \
 --without-pvfs2 --without-plfs \
 --without-cuda --disable-oshmem \
 --disable-mpi-fortran --disable-oshmem-fortran \
 LDFLAGS="-L/opt/cray/ugni/default/lib64 -lugni \
-L/opt/cray/alps/default/lib64 -lalps -lalpslli -lalpsutil
\  -ldl -lrt"


This is copied from
https://github.com/jeffhammond/HPCInfo/blob/master/ofi/README.md#open-mpi,
which I note in case you want to see what changes I've made at any point in
the future.


> There should be a with-slurm configure option to pick up this component.
>
> Indeed there is.


> Doesn't mpich have the option to use sysv memory?  You may want to try that
>
>
MPICH?  Look, I may have earned my way onto Santa's naughty list more than
a few times, but at least I have the decency not to post MPICH questions to
the Open-MPI list ;-)

If there is a way to tell Open-MPI to use shm_open without filesystem
backing (if that is even possible) at configure time, I'd love to do that.


> Oh for tuning params you can use env variables.  For example lets say
> rather than using the gni provider in ofi mtl you want to try sockets. Then
> do
>
> Export OMPI_MCA_mtl_ofi_provider_include=sockets
>
>
Thanks.  I'm glad that there is an option to set them this way.


> In the spirit OMPI - may the force be with you.
>
>
All I will say here is that Open-MPI has a Vader BTL :-)

>
> > On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
> > > I have no idea what this is trying to tell me. Help?
> > >
> > > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
> > > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
> > > ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418
> > >
> > > I can run the same job with srun without incident:
> > >
> > > jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
> > > MPI was initialized.
> > >
> > > This is on the NERSC Cori Cray XC40 system. I build Open-MPI git head
> from
> > > source for OFI libfabric.
> > >
> > > I have many other issues, which I will report later. As a spoiler, if I
> > > cannot use your mpirun, I cannot set any of the MCA options there. Is
> > > there a method to set MCA options with environment variables? I could
> not
> > > find this documented anywhere.
> > >
> > > In particular, is there a way to cause shm to not use the global
> > > filesystem? I see this issue comes up a lot and I read the list
> archives,
> > > but the warning message (
> > >
> https://github.com/hpc/cce-mpi-openmpi-1.6.4/blob/master/ompi/mca/common/sm/
> > > help-mpi-common-sm.txt) suggested that I could override it by setting
> TMP,
> > > TEMP or TEMPDIR, which I did to no avail.
> >
> > From my experience on edison: the one environment variable that does
> works is TMPDIR - the one that is not listed in the error message :-)
>

That's great.  I will try that now.  Is there a Github issue open already
to fix that documentation?  If not...
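In the meantime, the environment-variable route looks roughly like this (a sketch; the node-local path and the sockets provider are assumptions taken from earlier in this thread):

  # point the shared-memory backing files at node-local storage instead of
  # the global filesystem, and set MCA parameters without mpirun
  export TMPDIR=/tmp
  export OMPI_MCA_mtl_ofi_provider_include=sockets
  srun -n 2 ./driver.x 64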


> > Can't help you with your mpirun problem though ...
>
> No worries.  I appreciate all the help I can get.

Thanks,

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with Intel-Ftn-compiler

2015-11-19 Thread Gilles Gouaillardet

Michael,

in the meantime, you can use 'use mpi_f08' instead of 'use mpi';
this is really an f90-bindings issue, and the f08 bindings are safe

Cheers,

Gilles

On 11/19/2015 10:21 PM, michael.rach...@dlr.de wrote:


Thank You,  Nick and Gilles,

I hope the administrators of the cluster will be so kind as to 
update OpenMPI for me (and others) soon.


Greetings

Michael

From: users [mailto:users-boun...@open-mpi.org] On behalf of 
Gilles Gouaillardet

Sent: Thursday, 19 November 2015 12:59
To: Open MPI Users
Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 
1.10.0 with Intel-Ftn-compiler


Thanks Nick for the pointer !

Michael,

good news is you do not have to upgrade ifort,

but you have to update to 1.10.1

(intel 16 changed the way gcc pragmas are handled, and ompi has been 
made aware in 1.10.1)


1.10.1 fixes many bugs from 1.10.0, so I strongly encourage anyone to 
use 1.10.1


Cheers,

Gilles

On Thursday, November 19, 2015, Nick Papior > wrote:


Maybe I can chip in,

We use OpenMPI 1.10.1 with Intel /2016.1.0.423501 without problems.

I could not get 1.10.0 to work, one reason is: 
http://www.open-mpi.org/community/lists/users/2015/09/27655.php


On a side-note, please note that if you require scalapack you may need 
to follow this approach:


https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302

2015-11-19 11:24 GMT+01:00 >:


Sorry, Gilles,

I cannot  update to more recent versions, because what I used is the 
newest combination of OpenMPI and Intel-Ftn  available on that cluster.


When looking at the list of improvements  on the OpenMPI website for 
 OpenMPI 1.10.1 compared to 1.10.0, I do not remember having seen this 
item to be corrected.


Greeting

Michael Rachner

From: users [mailto:users-boun...@open-mpi.org] On behalf of 
Gilles Gouaillardet

Sent: Thursday, 19 November 2015 10:21
To: Open MPI Users
Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 
1.10.0 with Intel-Ftn-compiler


Michael,

I remember i saw similar reports.

Could you give a try to the latest v1.10.1 ?
And if that still does not work, can you upgrade icc suite and give it 
an other try ?


I cannot remember whether this is an ifort bug or the way ompi uses 
fortran...


Btw, any reason why you do not
Use mpi_f08 ?

HTH

Gilles

michael.rach...@dlr.de 
 wrote:


Dear developers of OpenMPI,

I am trying to run our parallelized Ftn-95 code on a Linux cluster 
with OpenMPI-1-10.0 and Intel-16.0.0 Fortran compiler.


In the code I use the  module MPI  (“use MPI”-stmts).

However I am not able to compile the code, because of compiler error 
messages like this:


/src_SPRAY/mpi_wrapper.f90(2065): error #6285: There is no matching 
specific subroutine for this generic subroutine call.   [MPI_REDUCE]


The problem seems for me to be this one:

The interfaces in the module MPI for the MPI-routines do not accept a 
send or receive buffer array, which is


actually a variable, an array element or a constant (like MPI_IN_PLACE).

Example 1:

 This does not work (gives the compiler error message: error #6285:
There is no matching specific subroutine for this generic subroutine call):

  ivar=123   ! <-- ivar is an integer variable, not an array
  call MPI_BCAST( ivar, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )   ! <-- this should work, but is not accepted by the compiler

 only this cumbersome workaround works:

  ivar=123
  allocate( iarr(1) )
  iarr(1) = ivar
  call MPI_BCAST( iarr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )   ! <-- this workaround works
  ivar = iarr(1)
  deallocate( iarr )

Example 2:

 Any call of an MPI-routine with MPI_IN_PLACE does not work, like that coding:

  if(lmaster) then
    call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &   ! <-- this should work, but is not accepted by the compiler
                    ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  else  ! slaves
    call MPI_REDUCE( rbuffarr, rdummyarr, nelem, MPI_REAL8, MPI_MAX &
                    ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  endif

This results in this compiler error message:

/src_SPRAY/mpi_wrapper.f90(2122): error #6285: There is no matching 
specific subroutine for this generic subroutine call.   [MPI_REDUCE]


call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, 
MPI_MAX &


-^

In our code I observed the bug with MPI_BCAST, MPI_REDUCE, MPI_ALLREDUCE,

but probably there may be other MPI-routines with the same kind of bug.

This bug occurred for:   OpenMPI-1.10.0 
 with Intel-16.0.0


In contrast, this bug did NOT occur for: OpenMPI-1.8.8

Re: [OMPI users] Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072

2015-11-19 Thread Saurabh T
I apologize, I have the wrong lines from strace for the initial file there (of 
course). The file with fd = 11 which causes the problem is called 
shared_mem_pool.[host], and ftruncate(11, 134217736) is called on it. (The 
ulimit of 131072 is in 1K blocks, i.e. 134217728 bytes, and the ftruncate size 
is just beyond that, which is why the limit is exceeded.)


From: saur...@hotmail.com
To: us...@open-mpi.org
Subject: RE: Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072
List-Post: users@lists.open-mpi.org
Date: Thu, 19 Nov 2015 17:08:22 -0500




> Could you please provide a little more info regarding the environment you
> are running under (which resource mgr or not, etc), how many nodes you had
> in the allocation, etc?
>
> There is no reason why something should behave that way. So it would help
> if we could understand the setup.
>
> Ralph


To answer Ralph's above question on the other thread, all nodes are on the 
same machine orterun was run on. It's a Red Hat 7 64-bit gcc 4.8 install of 
openmpi 1.10.1. The only atypical thing is that
btl_tcp_if_exclude = virbr0
has been added to openmpi-mca-params.conf based on some failures I was seeing 
before.
(And now of course I've added btl = ^sm as well to fix this issue; see my other 
response.)
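For reference, the exclusion can be set either in the params file or per run (a sketch; the config-file location depends on the install prefix):

  # in <prefix>/etc/openmpi-mca-params.conf
  btl = ^sm

  # or equivalently on the command line for a single run
  orterun --mca btl ^sm -np 3 hello_cxx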

Relevant output from strace (without the btl = ^sm) is below. Stuff in square 
brackets are my minor edits and snips.

open("/tmp/openmpi-sessions-[user]@[host]_0/40072/1/1/vader_segment.[host].1", 
O_RDWR|O_CREAT, 0600) = 12
ftruncate(12, 4194312)  = 0
mmap(NULL, 4194312, PROT_READ|PROT_WRITE, MAP_SHARED, 12, 0) = 0x7fe506c8a000
close(12)   = 0
write(9, "\1\0\0\0\0\0\0\0", 8) = 8
[...]
poll([{fd=5, events=POLLIN}, {fd=11, events=POLLIN}], 2, 0)= -1 
EFBIG (File too large)
--- SIGXFSZ {si_signo=SIGXFSZ, si_code=SI_USER, si_pid=12329, si_uid=1005} ---
--

From: saur...@hotmail.com
To: us...@open-mpi.org
Subject: Openmpi 1.10.1 fails with SIGXFSZ on file limit <= 131072
List-Post: users@lists.open-mpi.org
Date: Thu, 19 Nov 2015 15:24:08 -0500




Hi,

Sorry my previous email was garbled, sending it again.

> cd examples
> make hello_cxx

> ulimit -f 131073
> orterun -np 3 hello_cxx
Hello, world
(etc)

> ulimit -f 131072
> orterun -np 3 hello_cxx
--
orterun noticed that process rank 0 with PID 4473 on node sim16 exited on 
signal 25 (File size limit exceeded).
--

Any thoughts? 



  

Re: [OMPI users] OpenMPI 1.10.1 crashes with file size limit <= 131072

2015-11-19 Thread Ralph Castain
Could you please provide a little more info regarding the environment you
are running under (which resource mgr or not, etc), how many nodes you had
in the allocation, etc?

There is no reason why something should behave that way. So it would help
if we could understand the setup.
Ralph


On Thu, Nov 19, 2015 at 2:20 PM, Saurabh T  wrote:

> Here's what I find:
>
> > cd examples
> > make hello_cxx
> > ulimit -f 131073
>
> > orterun -np 3 hello_cxx
> Hello, world!
> [Etc]
>
> > ulimit -f 131072
>
> > orterun -np 3 hello_cxx
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/11/28065.php
>


[OMPI users] OpenMPI 1.10.1 crashes with file size limit <= 131072

2015-11-19 Thread Saurabh T
Here's what I find:

> cd examples
> make hello_cxx
> ulimit -f 131073

> orterun -np 3 hello_cxx
Hello, world!
[Etc]

> ulimit -f 131072

> orterun -np 3 hello_cxx

  

Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Howard
Hi Jeff

How did you configure for Cori?  You need to be using the slurm plm component 
for that system.  I know this sounds like gibberish.  

There should be a with-slurm configure option to pick up this component. 

Doesn't mpich have the option to use sysv memory?  You may want to try that.

Oh, for tuning params you can use env variables.  For example, let's say rather 
than using the gni provider in the ofi mtl you want to try sockets. Then do

export OMPI_MCA_mtl_ofi_provider_include=sockets

In the spirit of OMPI - may the force be with you.

Howard 

Sent from my iPhone

> Am 19.11.2015 um 11:51 schrieb Martin Siegert :
> 
> Hi Jeff,
>  
> On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
> > I have no idea what this is trying to tell me. Help?
> >
> > jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
> > [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
> > ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418
> >
> > I can run the same job with srun without incident:
> >
> > jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
> > MPI was initialized.
> >
> > This is on the NERSC Cori Cray XC40 system. I build Open-MPI git head from
> > source for OFI libfabric.
> >
> > I have many other issues, which I will report later. As a spoiler, if I
> > cannot use your mpirun, I cannot set any of the MCA options there. Is
> > there a method to set MCA options with environment variables? I could not
> > find this documented anywhere.
> >
> > In particular, is there a way to cause shm to not use the global
> > filesystem? I see this issue comes up a lot and I read the list archives,
> > but the warning message (
> > https://github.com/hpc/cce-mpi-openmpi-1.6.4/blob/master/ompi/mca/common/sm/
> > help-mpi-common-sm.txt) suggested that I could override it by setting TMP,
> > TEMP or TEMPDIR, which I did to no avail.
>  
> From my experience on edison: the one environment variable that does works is 
> TMPDIR - the one that is not listed in the error message :-)
>  
> Can't help you with your mpirun problem though ...
>  
> Cheers,
> Martin
>  
> --
> Martin Siegert
> Head, Research Computing
> WestGrid/ComputeCanada Site Lead
> Simon Fraser University
> Burnaby, British Columbia
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/11/28063.php


Re: [OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Martin Siegert
Hi Jeff,

On Thu 19.11.2015 09:44:20 Jeff Hammond wrote:
> I have no idea what this is trying to tell me.  Help?
> 
> jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
> [nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
> ../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418
> 
> I can run the same job with srun without incident:
> 
> jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
> MPI was initialized.
> 
> This is on the NERSC Cori Cray XC40 system.  I build Open-MPI git head 
from
> source for OFI libfabric.
> 
> I have many other issues, which I will report later.  As a spoiler, if I
> cannot use your mpirun, I cannot set any of the MCA options there.  Is
> there a method to set MCA options with environment variables?  I could 
not
> find this documented anywhere.
> 
> In particular, is there a way to cause shm to not use the global
> filesystem?  I see this issue comes up a lot and I read the list archives,
> but the warning message (
> https://github.com/hpc/cce-mpi-openmpi-1.6.4/blob/master/ompi/mca/common/sm/
> help-mpi-common-sm.txt) suggested that I could override it by setting 
TMP,
> TEMP or TEMPDIR, which I did to no avail.

From my experience on Edison: the one environment variable that does 
work is TMPDIR - the one that is not listed in the error message :-)

Can't help you with your mpirun problem though ...

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
Simon Fraser University
Burnaby, British Columbia


[OMPI users] help understand unhelpful ORTE error message

2015-11-19 Thread Jeff Hammond
I have no idea what this is trying to tell me.  Help?

jhammond@nid00024:~/MPI/qoit/collectives> mpirun -n 2 ./driver.x 64
[nid00024:00482] [[46168,0],0] ORTE_ERROR_LOG: Not found in file
../../../../../orte/mca/plm/alps/plm_alps_module.c at line 418

I can run the same job with srun without incident:

jhammond@nid00024:~/MPI/qoit/collectives> srun -n 2 ./driver.x 64
MPI was initialized.

This is on the NERSC Cori Cray XC40 system.  I build Open-MPI git head from
source for OFI libfabric.

I have many other issues, which I will report later.  As a spoiler, if I
cannot use your mpirun, I cannot set any of the MCA options there.  Is
there a method to set MCA options with environment variables?  I could not
find this documented anywhere.

In particular, is there a way to cause shm to not use the global
filesystem?  I see this issue comes up a lot and I read the list archives,
but the warning message (
https://github.com/hpc/cce-mpi-openmpi-1.6.4/blob/master/ompi/mca/common/sm/help-mpi-common-sm.txt)
suggested that I could override it by setting TMP, TEMP or TEMPDIR, which I
did to no avail.

Thanks,

Jeff

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] Slides from the Open MPI SC'15 State of the Union BOF

2015-11-19 Thread Marco Atzeri



On 19/11/2015 16:15, Lev Givon wrote:

Received from Jeff Squyres (jsquyres) on Thu, Nov 19, 2015 at 10:03:33AM EST:

Thanks to the over 100 people who came to the Open MPI State of the Union BOF
yesterday.  George Bosilca from U. Tennessee, Nathan Hjelm from Los Alamos
National Lab, and I presented where we are with Open MPI development, and
where we're going.

If you weren't able to join us, feel free to read through the slides:

 http://www.open-mpi.org/papers/sc-2015/

Thank you!


FYI, there seems to be some problem with the posted PDF file - when I tried to
view it in Firefox 42 and 3 other PDF viewers (on Linux, at least), all of the
programs claimed that the file is either corrupted or misformatted.


Same on Windows.
The file seems incomplete; it is missing the PDF closing marker

Regards
Marco



Re: [OMPI users] Slides from the Open MPI SC'15 State of the Union BOF

2015-11-19 Thread Lev Givon
Received from Jeff Squyres (jsquyres) on Thu, Nov 19, 2015 at 10:03:33AM EST:
> Thanks to the over 100 people who came to the Open MPI State of the Union BOF
> yesterday.  George Bosilca from U. Tennessee, Nathan Hjelm from Los Alamos
> National Lab, and I presented where we are with Open MPI development, and
> where we're going.
> 
> If you weren't able to join us, feel free to read through the slides:
> 
> http://www.open-mpi.org/papers/sc-2015/
> 
> Thank you!

FYI, there seems to be some problem with the posted PDF file - when I tried to
view it in Firefox 42 and 3 other PDF viewers (on Linux, at least), all of the
programs claimed that the file is either corrupted or misformatted.
-- 
Lev Givon
Bionet Group | Neurokernel Project
http://lebedov.github.io/
http://neurokernel.github.io/



[OMPI users] Slides from the Open MPI SC'15 State of the Union BOF

2015-11-19 Thread Jeff Squyres (jsquyres)
Thanks to the over 100 people who came to the Open MPI State of the Union BOF 
yesterday.  George Bosilca from U. Tennessee, Nathan Hjelm from Los Alamos 
National Lab, and I presented where we are with Open MPI development, and where 
we're going.

If you weren't able to join us, feel free to read through the slides:

http://www.open-mpi.org/papers/sc-2015/

Thank you!

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



Re: [OMPI users] mpijavac doesn't compile any thing

2015-11-19 Thread Howard Pritchard
Hi Ibrahim,

If you just try to compile with javac directly, do you at least see an "error:
package mpi... does not exist"?
Adding the "-verbose" option may also help with diagnosing the problem.

If javac doesn't get that far, then your problem is with the Java
install.
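What that direct check could look like (a sketch; the install prefix is an assumption - mpijavac normally expands to something like this anyway):

  # compile directly with javac, pointing at mpi.jar from the Open MPI install
  javac -verbose -cp $HOME/ompi_install/lib/mpi.jar Hello.java
  # a "package mpi does not exist" error -> classpath/jar problem
  # javac failing before that            -> Java installation problem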

Howard



2015-11-19 6:45 GMT-07:00 Ibrahim Ikhlawi :

>
> Hello,
>
> thank you for answering.
>
> the command mpijavac --verbose Hello.java gives me the same result as
> yours.
> JAVA_HOME is set correctly for me, but I have neither JAVA_BINDIR nor
> JAVA_ROOT.
> I don't think those two variables are the problem, because I was
> able to compile Hello.java three days ago without any problem, but now I
> can't.
>
> Ibrahim
>
>
> --
> Date: Wed, 18 Nov 2015 20:16:31 -0700
> From: hpprit...@gmail.com
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] mpijavac doesn't compile any thing
>
>
> Hello Ibrahim
>
> As a sanity check, could you try to compile the Hello.java in examples?
> mpijavac --verbose Hello.java
>
> you should see something like:
> /usr/bin/javac -cp
> /global/homes/h/hpp/ompi_install/lib/mpi.jar:/global/homes/h/hpp/ompi_install/lib/shmem.jar
> Hello.java
>
> You may also want to double check what your java env. variables, e.g.
> JAVA_HOME, JAVA_ROOT, and JAVA_BINDIR
> are set to.
> Howard
>
>
>
>
> --
>
> sent from my smart phonr so no good type.
>
> Howard
> On Nov 18, 2015 7:26 AM, "Ibrahim Ikhlawi" 
> wrote:
>
>
>
> Hello,
>
> I am trying to compile Java classes with mpijavac, but it doesn't compile
> any class. For example:
> Usually when I write the following line (mpijavac MyClass.java) in the
> console, it compiles, reports any errors (e.g. a missing
> semicolon), and the .class file is created.
>
> But now when I compile any class with the same command (mpijavac
> AnyClass.java), it doesn't give me any error and the file AnyClass.class
> is not created.
>
> What could be the problem?
>
> Thanks in advance
> Ibrahim
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/11/28047.php
>
>
> ___ users mailing list
> us...@open-mpi.org Subscription:
> http://www.open-mpi.org/mailman/listinfo.cgi/users Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/11/28049.php
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/11/28057.php
>


Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with Intel-Ftn-compiler

2015-11-19 Thread Michael.Rachner
Thank You,  Nick and Gilles,

I hope the administrators of the cluster will be so kind as to update 
OpenMPI for me (and others) soon.

Greetings
Michael

From: users [mailto:users-boun...@open-mpi.org] On behalf of Gilles 
Gouaillardet
Sent: Thursday, 19 November 2015 12:59
To: Open MPI Users
Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with 
Intel-Ftn-compiler

Thanks Nick for the pointer !

Michael,

good news is you do not have to upgrade ifort,
but you have to update to 1.10.1
(intel 16 changed the way gcc pragmas are handled, and ompi has been made aware 
in 1.10.1)
1.10.1 fixes many bugs from 1.10.0, so I strongly encourage anyone to use 1.10.1

Cheers,

Gilles

On Thursday, November 19, 2015, Nick Papior 
> wrote:
Maybe I can chip in,

We use OpenMPI 1.10.1 with Intel /2016.1.0.423501 without problems.

I could not get 1.10.0 to work, one reason is: 
http://www.open-mpi.org/community/lists/users/2015/09/27655.php

On a side-note, please note that if you require scalapack you may need to 
follow this approach:
https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302

2015-11-19 11:24 GMT+01:00 
>:
Sorry, Gilles,

I cannot  update to more recent versions, because what I used is the newest 
combination of OpenMPI and Intel-Ftn  available on that cluster.

When looking at the list of improvements  on the OpenMPI website for  OpenMPI 
1.10.1 compared to 1.10.0, I do not remember having seen this item to be 
corrected.

Greeting
Michael Rachner


From: users [mailto:users-boun...@open-mpi.org] On behalf of Gilles Gouaillardet
Sent: Thursday, 19 November 2015 10:21
To: Open MPI Users
Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with 
Intel-Ftn-compiler

Michael,

I remember i saw similar reports.

Could you give a try to the latest v1.10.1 ?
And if that still does not work, can you upgrade icc suite and give it an other 
try ?

I cannot remember whether this is an ifort bug or the way ompi uses fortran...

Btw, any reason why you do not
Use mpi_f08 ?

HTH

Gilles

michael.rach...@dlr.de 
wrote:
Dear developers of OpenMPI,

I am trying to run our parallelized Ftn-95 code on a Linux cluster with 
OpenMPI-1-10.0 and Intel-16.0.0 Fortran compiler.
In the code I use the  module MPI  (“use MPI”-stmts).

However I am not able to compile the code, because of compiler error messages 
like this:

/src_SPRAY/mpi_wrapper.f90(2065): error #6285: There is no matching specific 
subroutine for this generic subroutine call.   [MPI_REDUCE]


The problem seems for me to be this one:

The interfaces in the module MPI for the MPI-routines do not accept a send or 
receive buffer array, which is
actually a variable, an array element or a constant (like MPI_IN_PLACE).

Example 1:
 This does not work (gives the compiler error message:  error #6285: 
There is no matching specific subroutine for this generic subroutine call  )
 ivar=123! <-- ivar is an integer variable, not an array
  call MPI_BCAST( ivar, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )  
  ! <--- this should work, but is not accepted by the compiler

  only this cumbersome workaround works:
  ivar=123
allocate( iarr(1) )
iarr(1) = ivar
 call MPI_BCAST( iarr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )
! <--- this workaround works
ivar = iarr(1)
deallocate( iarr )

Example 2:
 Any call of an MPI-routine with MPI_IN_PLACE does not work, like that 
coding:

  if(lmaster) then
call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &
! <--- this should work, but is not accepted by the compiler
 ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  else  ! slaves
call MPI_REDUCE( rbuffarr, rdummyarr, nelem, MPI_REAL8, MPI_MAX &
,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  endif

This results in this compiler error message:

  /src_SPRAY/mpi_wrapper.f90(2122): error #6285: There is no matching 
specific subroutine for this generic subroutine call.   [MPI_REDUCE]
call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &
-^


In our code I observed the bug with MPI_BCAST, MPI_REDUCE, MPI_ALLREDUCE,
but probably there may be other MPI-routines with the same kind of bug.

This bug occurred for:                   OpenMPI-1.10.0  with Intel-16.0.0
In contrast, this bug did NOT occur for: OpenMPI-1.8.8   with Intel-16.0.0
                                         OpenMPI-1.8.8   with Intel-15.0.3
  

Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with Intel-Ftn-compiler

2015-11-19 Thread Gilles Gouaillardet
Thanks Nick for the pointer !

Michael,

the good news is you do not have to upgrade ifort,
but you do have to update to 1.10.1
(Intel 16 changed the way gcc pragmas are handled, and ompi was made
aware of this in 1.10.1)
1.10.1 fixes many bugs from 1.10.0, so I strongly encourage everyone to use
1.10.1

Cheers,

Gilles

On Thursday, November 19, 2015, Nick Papior  wrote:

> Maybe I can chip in,
>
> We use OpenMPI 1.10.1 with Intel /2016.1.0.423501 without problems.
>
> I could not get 1.10.0 to work, one reason is:
> http://www.open-mpi.org/community/lists/users/2015/09/27655.php
>
> On a side-note, please note that if you require scalapack you may need to
> follow this approach:
>
> https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302
>
> 2015-11-19 11:24 GMT+01:00  >:
>
>> Sorry, Gilles,
>>
>>
>>
>> I cannot  update to more recent versions, because what I used is the
>> newest combination of OpenMPI and Intel-Ftn  available on that cluster.
>>
>>
>>
>> When looking at the list of improvements  on the OpenMPI website for
>>  OpenMPI 1.10.1 compared to 1.10.0, I do not remember having seen this item
>> to be corrected.
>>
>>
>>
>> Greeting
>>
>> Michael Rachner
>>
>>
>>
>>
>>
>> From: users [mailto:users-boun...@open-mpi.org] On behalf of Gilles Gouaillardet
>> Sent: Thursday, 19 November 2015 10:21
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0
>> with Intel-Ftn-compiler
>>
>>
>>
>> Michael,
>>
>> I remember i saw similar reports.
>>
>> Could you give a try to the latest v1.10.1 ?
>> And if that still does not work, can you upgrade icc suite and give it an
>> other try ?
>>
>> I cannot remember whether this is an ifort bug or the way ompi uses
>> fortran...
>>
>> Btw, any reason why you do not
>> Use mpi_f08 ?
>>
>> HTH
>>
>> Gilles
>>
>> michael.rach...@dlr.de
>>  wrote:
>>
>> Dear developers of OpenMPI,
>>
>>
>>
>> I am trying to run our parallelized Ftn-95 code on a Linux cluster with
>> OpenMPI-1-10.0 and Intel-16.0.0 Fortran compiler.
>>
>> In the code I use the  module MPI  (“use MPI”-stmts).
>>
>>
>>
>> However I am not able to compile the code, because of compiler error
>> messages like this:
>>
>>
>>
>> /src_SPRAY/mpi_wrapper.f90(2065): error #6285: There is no matching
>> specific subroutine for this generic subroutine call.   [MPI_REDUCE]
>>
>>
>>
>>
>>
>> The problem seems for me to be this one:
>>
>>
>>
>> The interfaces in the module MPI for the MPI-routines do not accept a
>> send or receive buffer array, which is
>>
>> actually a variable, an array element or a constant (like MPI_IN_PLACE).
>>
>>
>>
>> Example 1:
>>
>>  This does not work (gives the compiler error message: error #6285:
>> There is no matching specific subroutine for this generic subroutine call):
>>
>>   ivar=123   ! <-- ivar is an integer variable, not an array
>>   call MPI_BCAST( ivar, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )   ! <-- this should work, but is not accepted by the compiler
>>
>>  only this cumbersome workaround works:
>>
>>   ivar=123
>>   allocate( iarr(1) )
>>   iarr(1) = ivar
>>   call MPI_BCAST( iarr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )   ! <-- this workaround works
>>   ivar = iarr(1)
>>   deallocate( iarr )
>>
>> Example 2:
>>
>>  Any call of an MPI-routine with MPI_IN_PLACE does not work, like that coding:
>>
>>   if(lmaster) then
>>     call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &   ! <-- this should work, but is not accepted by the compiler
>>                     ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
>>   else  ! slaves
>>     call MPI_REDUCE( rbuffarr, rdummyarr, nelem, MPI_REAL8, MPI_MAX &
>>                     ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
>>   endif
>>
>>
>>
>> This results in this compiler error message:
>>
>>
>>
>>   /src_SPRAY/mpi_wrapper.f90(2122): error #6285: There is no matching
>> specific subroutine for this generic subroutine call.   [MPI_REDUCE]
>>
>> call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8,
>> MPI_MAX &
>>
>> -^
>>
>>
>>
>>
>>
>> In our code I observed the bug with MPI_BCAST, MPI_REDUCE, MPI_ALLREDUCE,
>>
>> but probably there may be other MPI-routines with the same kind of bug.
>>
>>
>>
>> This bug occurred for:                   OpenMPI-1.10.0  with Intel-16.0.0
>>
>> In contrast, this bug did NOT occur for: OpenMPI-1.8.8   with Intel-16.0.0
>>                                          OpenMPI-1.8.8   with Intel-15.0.3
>>
>>
>>  

Re: [OMPI users] Strange problem with SSH

2015-11-19 Thread Federico Reghenzani
Thank you for the fix.
I could only try it today; I confirm it works both with the patch and with
the mca option.


Cheers,
Federico Reghenzani
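For anyone hitting the same thing before the fix lands, the workaround command looks roughly like this (a sketch based on this thread; the hosts are the same example addresses as above):

  mpirun -np 8 --mca orte_keep_fqdn_hostnames true \
         --host openmpi@10.10.1.1,openmpi@10.10.1.2,openmpi@10.10.1.3,openmpi@10.10.1.4 \
         --mca oob_tcp_if_exclude lo,wlp2s0 ompi_info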

2015-11-18 6:15 GMT+01:00 Gilles Gouaillardet :

> Federico,
>
> i made PR #772 https://github.com/open-mpi/ompi-release/pull/772
>
> feel free to manually patch your ompi install or use the workaround i
> previously described
>
> Cheers,
>
> Gilles
>
>
> On 11/18/2015 11:31 AM, Gilles Gouaillardet wrote:
>
> Federico,
>
> thanks for the report, i will push a fix shortly
>
> meanwhile, and as a workaround, you can add the
> --mca orte_keep_fqdn_hostnames true
> to your mpirun command line when using --host user@ip
>
> Cheers,
>
> Gilles
>
> On 11/17/2015 7:19 PM, Federico Reghenzani wrote:
>
> I'm trying to execute this command:
>
>
> *mpirun -np 8 --host openmpi@10.10.1.1 ,
> openmpi@10.10.1.2 ,
> openmpi@10.10.1.3 ,
> openmpi@10.10.1.4  --mca
> oob_tcp_if_exclude lo,wlp2s0 ompi_info *
>
> Everything goes fine if I execute the same command with only 2 nodes
> (independently of which nodes).
>
> With 3 or more nodes I obtain:
> *ssh: connect to host 10 port 22: Invalid argument*
> followed by "ORTE was unable to reliably start one or more daemons." error.
>
> Searching with plm_base_verbose, I found:
>
> ...
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm add new daemon
> [[53718,0],1]
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm assigning new daemon
> [[53718,0],1] to node openmpi@10.10.1.1
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm add new daemon
> [[53718,0],2]
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm assigning new daemon
> [[53718,0],2] to node openmpi@10.10.1.2
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm add new daemon
> [[53718,0],3]
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm assigning new daemon
> [[53718,0],3] to node openmpi@10.10.1.3
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm add new daemon
> [[53718,0],4]
> [Neptune:22627] [[53718,0],0] plm:base:setup_vm assigning new daemon
> [[53718,0],4] to node openmpi@10.10.1.4
> ...
> [Neptune:22627] [[53718,0],0] plm:rsh:launch daemon 0 not a child of mine
> [Neptune:22627] [[53718,0],0] plm:rsh: adding node 
> openmpi@10.10.1.1 to launch list
> [Neptune:22627] [[53718,0],0] plm:rsh: adding node 
> openmpi@10.10.1.2 to launch list
> [Neptune:22627] [[53718,0],0] plm:rsh:launch daemon 3 not a child of mine
> [Neptune:22627] [[53718,0],0] plm:rsh: adding node 
> openmpi@10.10.1.4 to launch list
> ...
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: remote spawn called
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: local shell: 0 (bash)
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: assuming same remote shell as
> local shell
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: remote shell: 0 (bash)
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: final template argv:
> /usr/bin/ssh   orted --hnp-topo-sig
> 0N:1S:0L3:1L2:2L1:2C:2H:x86_64 -mca ess "env" -mca orte_ess_jobid
> "3520462848" -mca orte_ess_vpid "" -mca orte_ess_num_procs "5"
> -mca orte_parent_uri "3520462848.1;tcp://10.10.1.1:35489" -mca
> orte_hnp_uri "3520462848.0;tcp://10.10.10.2:43771" --mca
> oob_tcp_if_exclude "lo,wlp2s0" --mca plm_base_verbose "100" -mca plm "rsh"
> --tree-spawn
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: activating launch event
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: recording launch of daemon
> [[53718,0],3]
> [roaster-vm1:00593] [[53718,0],1] plm:rsh: executing: (/usr/bin/ssh) 
> [*/usr/bin/ssh
> openmpi@10  orted* --hnp-topo-sig 0N:1S:0L3:1L2:2L1:2C:2H:x86_64 -mca ess
> "env" -mca orte_ess_jobid "3520462848" -mca orte_ess_vpid 3 -mca
> orte_ess_num_procs "5" -mca orte_parent_uri "3520462848.1;tcp://
> 10.10.1.1:35489" -mca orte_hnp_uri "3520462848.0;tcp://10.10.10.2:43771"
> --mca oob_tcp_if_exclude "lo,wlp2s0" --mca plm_base_verbose "100" -mca plm
> "rsh" --tree-spawn]
> *ssh: connect to host 10 port 22: Invalid argument*
>
> It seems it corrupts the ip address during remote spawn. Any idea?
>
> (I'm using 1.10.0rc7 version)
>
>
> Cheers,
> Federico
>
> __
> Federico Reghenzani
> M.Eng. Student @ Politecnico di Milano
> Computer Science and Engineering
>
>
>
>
> ___
> users mailing listus...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/11/28042.php
>
>
>
>
> ___
> users mailing listus...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/11/28044.php
>
>
>
> ___
> users 

Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with Intel-Ftn-compiler

2015-11-19 Thread Nick Papior
Maybe I can chip in,

We use OpenMPI 1.10.1 with Intel/2016.1.0.423501 without problems.

I could not get 1.10.0 to work, one reason is:
http://www.open-mpi.org/community/lists/users/2015/09/27655.php

On a side-note, please note that if you require scalapack you may need to
follow this approach:
https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/590302

2015-11-19 11:24 GMT+01:00 :

> Sorry, Gilles,
>
>
>
> I cannot  update to more recent versions, because what I used is the
> newest combination of OpenMPI and Intel-Ftn  available on that cluster.
>
>
>
> When looking at the list of improvements  on the OpenMPI website for
>  OpenMPI 1.10.1 compared to 1.10.0, I do not remember having seen this item
> to be corrected.
>
>
>
> Greeting
>
> Michael Rachner
>
>
>
>
>
> From: users [mailto:users-boun...@open-mpi.org] On behalf of Gilles
> Gouaillardet
> Sent: Thursday, 19 November 2015 10:21
> To: Open MPI Users
> Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0
> with Intel-Ftn-compiler
>
>
>
> Michael,
>
> I remember i saw similar reports.
>
> Could you give a try to the latest v1.10.1 ?
> And if that still does not work, can you upgrade icc suite and give it an
> other try ?
>
> I cannot remember whether this is an ifort bug or the way ompi uses
> fortran...
>
> Btw, any reason why you do not
> Use mpi_f08 ?
>
> HTH
>
> Gilles
>
> michael.rach...@dlr.de wrote:
>
> Dear developers of OpenMPI,
>
>
>
> I am trying to run our parallelized Ftn-95 code on a Linux cluster with
> OpenMPI-1-10.0 and Intel-16.0.0 Fortran compiler.
>
> In the code I use the  module MPI  (“use MPI”-stmts).
>
>
>
> However I am not able to compile the code, because of compiler error
> messages like this:
>
>
>
> /src_SPRAY/mpi_wrapper.f90(2065): error #6285: There is no matching
> specific subroutine for this generic subroutine call.   [MPI_REDUCE]
>
>
>
>
>
> The problem seems for me to be this one:
>
>
>
> The interfaces in the module MPI for the MPI-routines do not accept a send
> or receive buffer array, which is
>
> actually a variable, an array element or a constant (like MPI_IN_PLACE).
>
>
>
> Example 1:
>
>  This does not work (gives the compiler error message: error #6285:
> There is no matching specific subroutine for this generic subroutine call):
>
>   ivar=123   ! <-- ivar is an integer variable, not an array
>   call MPI_BCAST( ivar, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )   ! <-- this should work, but is not accepted by the compiler
>
>  only this cumbersome workaround works:
>
>   ivar=123
>   allocate( iarr(1) )
>   iarr(1) = ivar
>   call MPI_BCAST( iarr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )   ! <-- this workaround works
>   ivar = iarr(1)
>   deallocate( iarr )
>
> Example 2:
>
>  Any call of an MPI-routine with MPI_IN_PLACE does not work, like that coding:
>
>   if(lmaster) then
>     call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &   ! <-- this should work, but is not accepted by the compiler
>                     ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
>   else  ! slaves
>     call MPI_REDUCE( rbuffarr, rdummyarr, nelem, MPI_REAL8, MPI_MAX &
>                     ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
>   endif
>
>
>
> This results in this compiler error message:
>
>
>
>   /src_SPRAY/mpi_wrapper.f90(2122): error #6285: There is no matching
> specific subroutine for this generic subroutine call.   [MPI_REDUCE]
>
> call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8,
> MPI_MAX &
>
> -^
>
>
>
>
>
> In our code I observed the bug with MPI_BCAST, MPI_REDUCE, MPI_ALLREDUCE,
>
> but probably there may be other MPI-routines with the same kind of bug.
>
>
>
> This bug occurred for:                   OpenMPI-1.10.0  with Intel-16.0.0
>
> In contrast, this bug did NOT occur for: OpenMPI-1.8.8   with Intel-16.0.0
>                                          OpenMPI-1.8.8   with Intel-15.0.3
>                                          OpenMPI-1.10.0  with gfortran-5.2.0
>
>
>
> Greetings
>
> Michael Rachner
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/11/28052.php
>



-- 
Kind regards Nick


Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with Intel-Ftn-compiler

2015-11-19 Thread Michael.Rachner
Sorry, Gilles,

I cannot  update to more recent versions, because what I used is the newest 
combination of OpenMPI and Intel-Ftn  available on that cluster.

When looking at the list of improvements  on the OpenMPI website for  OpenMPI 
1.10.1 compared to 1.10.0, I do not remember having seen this item to be 
corrected.

Greetings
Michael Rachner


From: users [mailto:users-boun...@open-mpi.org] On behalf of Gilles 
Gouaillardet
Sent: Thursday, 19 November 2015 10:21
To: Open MPI Users
Subject: Re: [OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with 
Intel-Ftn-compiler

Michael,

I remember I saw similar reports.

Could you give the latest v1.10.1 a try?
And if that still does not work, can you upgrade the icc suite and give it another 
try?

I cannot remember whether this is an ifort bug or the way ompi uses fortran...

Btw, any reason why you do not
use mpi_f08 ?

HTH

Gilles

michael.rach...@dlr.de wrote:
Dear developers of OpenMPI,

I am trying to run our parallelized Ftn-95 code on a Linux cluster with 
OpenMPI-1-10.0 and Intel-16.0.0 Fortran compiler.
In the code I use the  module MPI  (“use MPI”-stmts).

However I am not able to compile the code, because of compiler error messages 
like this:

/src_SPRAY/mpi_wrapper.f90(2065): error #6285: There is no matching specific 
subroutine for this generic subroutine call.   [MPI_REDUCE]


The problem seems for me to be this one:

The interfaces in the module MPI for the MPI-routines do not accept a send or 
receive buffer array, which is
actually a variable, an array element or a constant (like MPI_IN_PLACE).

Example 1:
 This does not work (gives the compiler error message:  error #6285: 
There is no matching specific subroutine for this generic subroutine call  )
 ivar=123! <-- ivar is an integer variable, not an array
  call MPI_BCAST( ivar, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )  
  ! <--- this should work, but is not accepted by the compiler

  only this cumbersome workaround works:
  ivar=123
allocate( iarr(1) )
iarr(1) = ivar
 call MPI_BCAST( iarr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )
! <--- this workaround works
ivar = iarr(1)
deallocate( iarr )

Example 2:
 Any call of an MPI-routine with MPI_IN_PLACE does not work, like that 
coding:

  if(lmaster) then
call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &
! <--- this should work, but is not accepted by the compiler
 ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  else  ! slaves
call MPI_REDUCE( rbuffarr, rdummyarr, nelem, MPI_REAL8, MPI_MAX &
,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  endif

This results in this compiler error message:

  /src_SPRAY/mpi_wrapper.f90(2122): error #6285: There is no matching 
specific subroutine for this generic subroutine call.   [MPI_REDUCE]
call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &
-^


In our code I observed the bug with MPI_BCAST, MPI_REDUCE, MPI_ALLREDUCE,
but probably there may be other MPI-routines with the same kind of bug.

This bug occurred for:                   OpenMPI-1.10.0  with Intel-16.0.0
In contrast, this bug did NOT occur for: OpenMPI-1.8.8   with Intel-16.0.0
                                         OpenMPI-1.8.8   with Intel-15.0.3
                                         OpenMPI-1.10.0  with gfortran-5.2.0

Greetings
Michael Rachner


[OMPI users] Bug in Fortran-module MPI of OpenMPI 1.10.0 with Intel-Ftn-compiler

2015-11-19 Thread Michael.Rachner
Dear developers of OpenMPI,

I am trying to run our parallelized Ftn-95 code on a Linux cluster with 
OpenMPI-1-10.0 and Intel-16.0.0 Fortran compiler.
In the code I use the  module MPI  ("use MPI"-stmts).

However I am not able to compile the code, because of compiler error messages 
like this:

/src_SPRAY/mpi_wrapper.f90(2065): error #6285: There is no matching specific 
subroutine for this generic subroutine call.   [MPI_REDUCE]


The problem seems for me to be this one:

The interfaces in the module MPI for the MPI-routines do not accept a send or 
receive buffer array, which is
actually a variable, an array element or a constant (like MPI_IN_PLACE).

Example 1:
 This does not work (gives the compiler error message:  error #6285: 
There is no matching specific subroutine for this generic subroutine call  )
 ivar=123! <-- ivar is an integer variable, not an array
  call MPI_BCAST( ivar, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )  
  ! <--- this should work, but is not accepted by the compiler

  only this cumbersome workaround works:
  ivar=123
allocate( iarr(1) )
iarr(1) = ivar
 call MPI_BCAST( iarr, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr_mpi )
! <--- this workaround works
ivar = iarr(1)
deallocate( iarr )

Example 2:
 Any call of an MPI-routine with MPI_IN_PLACE does not work, like that 
coding:

  if(lmaster) then
call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &
! <--- this should work, but is not accepted by the compiler
 ,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  else  ! slaves
call MPI_REDUCE( rbuffarr, rdummyarr, nelem, MPI_REAL8, MPI_MAX &
,0_INT4, MPI_COMM_WORLD, ierr_mpi )
  endif

This results in this compiler error message:

  /src_SPRAY/mpi_wrapper.f90(2122): error #6285: There is no matching 
specific subroutine for this generic subroutine call.   [MPI_REDUCE]
call MPI_REDUCE( MPI_IN_PLACE, rbuffarr, nelem, MPI_REAL8, MPI_MAX &
-^


In our code I observed the bug with MPI_BCAST, MPI_REDUCE, MPI_ALLREDUCE,
but probably there may be other MPI-routines with the same kind of bug.

This bug occurred for:                   OpenMPI-1.10.0  with Intel-16.0.0
In contrast, this bug did NOT occur for: OpenMPI-1.8.8   with Intel-16.0.0
                                         OpenMPI-1.8.8   with Intel-15.0.3
                                         OpenMPI-1.10.0  with gfortran-5.2.0

Greetings
Michael Rachner