Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-10 Thread Rodrigo Gómez Vázquez
In fact, we do have restrictive firewall settings, as far as I
remember. I will check the rules again tomorrow morning. That's very
interesting: I would have expected this kind of problem if I were working
with a cluster, but I hadn't thought that it could also affect the
internal communication within a single machine.
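
My first pass will probably be something like the following (standard
iptables commands; I won't know our actual rules until I look):

sudo iptables -L -n -v
sudo iptables -S INPUT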


Thanks, Ralph. I'll let you know whether this was the actual cause of the
problem.

Rodrigo

On 04/10/2013 09:46 PM, Ralph Castain wrote:

Best guess is that there is some issue with getting TCP sockets on the system - once the 
procs are launched, they need to open a TCP socket and communicate back to mpirun. If the 
socket is "stuck" waiting to complete the open, things will hang.

You might check to ensure there isn't some security setting in place that 
protects sockets - something like iptables, for example.





Re: [OMPI users] mpirun hangs: "hello" test in single machine

2013-04-10 Thread Ralph Castain
Best guess is that there is some issue with getting TCP sockets on the system - 
once the procs are launched, they need to open a TCP socket and communicate 
back to mpirun. If the socket is "stuck" waiting to complete the open, things 
will hang.

You might check to ensure there isn't some security setting in place that 
protects sockets - something like iptables, for example.
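
For example, something along these lines would show whether loopback
traffic is being filtered, and would open it up if it is (the exact rule
position depends on your existing configuration):

iptables -L INPUT -n -v
iptables -I INPUT 1 -i lo -j ACCEPT

FWIW, the fact that "mpirun -np 4 hostname" works would fit this picture:
hostname isn't an MPI program, so it never opens a socket back to mpirun.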






[OMPI users] mpirun hangs: "hello" test in single machine

2013-04-10 Thread Rodrigo Gómez Vázquez

Hi,

I am having trouble with a program on a simulation server.
The system consists of several processors, all in the same node (more
information on the specs is in the attachments).
The system is quite new (a few months old), and a user reported to me
that it was not possible to run simulations on multiple processors in
parallel.
We are using it for CFD simulations with OpenFOAM, which ships with its
own version of Open MPI (1.5.3; for more details you can look inside the
"ThirdParty" software folder via this link:
http://www.openfoam.org/archive/2.1.1/download/source.php). The OS is an
Ubuntu 12.04 Server distro (see uname.out in the attachments).

He tried to start a simulation in parallel using the following command:

~: mpirun -np 4 

As a result, the simulation does not start and there is no error message.
It looks like the program is just waiting for something. We can see the 4
processes with their PIDs in the "top" process list, but only for a few
tenths of a second, and with 0% CPU use and 0.0% memory use. To recover
the command prompt we have to kill the process.


The same happens with the "hello" programs that come with the
Open MPI sources:


:~$mpicc hello_c.c -o hello
:~$mpirun -np 4 hello
... and here it hangs again.
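
For reference, hello_c.c is essentially the canonical MPI "hello world",
roughly like this (paraphrased from memory, not the exact file):

#include <stdio.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
    int rank, size;

    /* our runs appear to hang right here, inside MPI_Init */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello, world, I am %d of %d\n", rank, size);
    MPI_Finalize();

    return 0;
}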

I tried to execute other, simpler programs, as recommended for checking
the installation. Let's see:


:~$mpirun -np 4 hostname
simserver
simserver
simserver
simserver
:~$

That works, and so does "ompi_info".

Since we use the same OpenFOAM version without problems on several
computers running Ubuntu-based distros, I suspected some kind of
incompatibility problem due to the hardware, but...


Anyway, I repeated the tests with the Open MPI version from the Ubuntu
repositories (1.4.3) and got the same result.


It would be wonderful if anyone could give me a hint.

I am afraid it may turn out to be a complicated issue, so please let me
know whatever relevant information is missing.


Thanks in advance, guys

Rodrigo (Europe, GMT+2:00)


openmpi1.4.3_ompi_info.out.bz2
Description: application/bzip
Linux simserver 3.2.0-23-generic #36-Ubuntu SMP Tue Apr 10 20:39:51 UTC 2012 
x86_64 x86_64 x86_64 GNU/Linux


cat_-proc-cpuinfo.out.bz2
Description: application/bzip


Re: [OMPI users] Segmentation fault with HPCC benchmark

2013-04-10 Thread Reza Bakhshayeshi
Dear Gus Correa,

Thank you for your detailed answer.
I have been busy working through your steps, but unfortunately I still
have the problem.

1) Yes, I have sudo access to the server; when I run the test, only my
two instances are active.

2) There is no problem running the hello program simultaneously on the
two instances, but someone told me these programs do not exercise some
factors.

The instances are clean installations of Ubuntu Server 12.04; by the way,
I have disabled "ufw". Two notes here: Open MPI uses ssh, and I can
connect with no password from master to slave. One more odd thing is that
the order in the myhosts file matters, i.e., it is always the second
machine that aborts the process; even when I am on the master and the
master is second in the file, it reports that the master aborted.

3,4) I checked it; actually, I redid everything from the first step, just
installing ATLAS and Open MPI from packages, with the 64-bit switch
passed to configure.

5) I used -np 4 with hello; is this sufficient?

6) Yes, I checked auto-tuning (without an input file) too.

One thing I noticed is that a "vnet" interface is created for each
instance on the main server. I ran these two commands:

mpirun -np 2 --hostfile myhosts --mca btl_tcp_if_include eth0,lo hpcc
mpirun -np 2 --hostfile myhosts --mca btl_tcp_if_exclude vnet0,vnet1 hpcc

In this case I didn't get anything, i.e., no error and nothing in the
output file; I waited for hours but nothing happened. Can these vnets
cause the problem?
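
For reference, the interface names on each instance and the relevant TCP
parameters can be double-checked with something like this (the output
will of course depend on the instances):

ip addr show
ompi_info --param btl tcp | grep if_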

Thank you for your consideration,
Best Regards,
Reza


Re: [OMPI users] Is Open MPI 1.6.4 viable on Mac OS X 10.6.8 ?

2013-04-10 Thread Ralph Castain
Hi Gus

I feel your pain - that's a pretty old system!

I obviously don't have any way to test it, but try configuring OMPI 
--without-memory-manager and see if that helps.
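
Something like this (going from memory, and your prefix will differ):

./configure --prefix=$HOME/openmpi-1.6.4 --without-memory-manager
make all install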






[OMPI users] Greenplum's MR+

2013-04-10 Thread Jakub Nowacki
Hi,

I am currently testing Open MPI applications running on SLURM, and I
wanted to test the MapReduce integration, but I am having trouble finding
the actual MR+ package. Namely, both your site and SLURM's carry some
information about MR+, some of it relatively old, but I could not find
any actual software package available. I have been to Greenplum's web
page, but their MR product is some remix of MapR, not the MR+ on
Open MPI. Could anyone tell me whether MR+ has been released already,
and if so, where it is? If not, how can I get an early adopter's version?

Many thanks for the help.

Cheers,

Jakub


[OMPI users] Is Open MPI 1.6.4 viable on Mac OS X 10.6.8 ?

2013-04-10 Thread Gustavo Correa
Dear Open MPI Pros

Somehow I am stuck offsite and I have to test/develop an MPI program on a super 
duper 
2006 vintage Mac PowerBookPro with Mac OS X 10.6.8 (Snow Leopard).
This is a 32-bit machine with dual core Intel Core Duo processors and 2GB RAM.

Well, my under-development program using FFTW3 and OMPI 1.6.4 runs
flawlessly on Linux, but I am offsite and I have to use the darn Mac,
where I get all sorts of weird errors out of the blue, which are 
very likely to be associated to the Mac OS X underlying memory management
system.

I say so because the OMPI test programs (connectivity_c.c, etc.), which do
NOT allocate memory (other than the MPI internal buffers, if any), run
correctly, but once I start using dynamically allocated arrays - boom -
it breaks (but only on the Mac).

I enclose below one of the error messages, FYI.
[It shows up as a segfault, but the array and buffer boundaries are correct,
and the program runs perfectly on Linux.  RAM is OK also, my batch of test
data is small. No automatic arrays on the code either.]

I read the OMPI FAQ on runtime issues, and a couple of them mention trouble for 
OMPI 
with the Mac OS X memory management scheme.  However, those FAQ are quite old,
refer to OMPI 1.2 and 1.3 series only, recommend linking to an OMPI library 
that seems to have been phased out (-lopenmpi-malloc), and didn't shed the light
I was hoping for.

So, before I give this effort up as not viable, here are a few questions:

Are there specific recommendations on how to build OMPI 1.6.4 on Mac OS X 10.6.8?
Are there any additional linker flags that should be used to build OMPI 
applications under OS X?
Are there any runtime options that should be added to mpiexec to make
OMPI programs that allocate memory dynamically run correctly on
Mac OS X?

Thank you,
Gus Correa
 Error message 
*
[1,0]:[Macintosh-72:36578] *** Process received signal ***
[1,0]:[Macintosh-72:36578] Signal: Segmentation fault (11)
[1,0]:[Macintosh-72:36578] Signal code: Address not mapped (1)
[1,0]:[Macintosh-72:36578] Failing at address: 0x6648000
[1,0]:[Macintosh-72:36578] [ 0] 2   libSystem.B.dylib   
0x9728c05b _sigtramp + 43
[1,0]:[Macintosh-72:36578] [ 1] 3   ??? 
0x 0x0 + 4294967295
[1,0]:[Macintosh-72:36578] [ 2] 4   wcdp3d  
0x0001be49 main + 1864
[1,0]:[Macintosh-72:36578] [ 3] 5   wcdp3d  
0x27ad start + 53
[1,0]:[Macintosh-72:36578] [ 4] 6   ??? 
0x0002 0x0 + 2
[1,0]:[Macintosh-72:36578] *** End of error message ***




Re: [OMPI users] mpirun error

2013-04-10 Thread Pradeep Jha
Hello,

Thanks for the responses, but I have no idea how to do that. Which
environment variables should I look at? How do I find out where Open MPI
is installed and make mpif90 use it? (A basic check is sketched at the
end of this message.)

Thanks,
Pradeep


2013/4/2 Elken, Tom 

> > The Intel Fortran 2013 compiler comes with support for Intel's MPI
> runtime and
> > you are getting that instead of OpenMPI.   You need to fix your path for
> all the
> > shells you use.
> [Tom]
> Agree with Michael, but thought I would note something additional.
> If you are using OFED's mpi-selector to select Open MPI, it will set up
> the path to Open MPI before a startup script like  .bashrc gets processed.
> So if you source the Intel Compiler's compilervars.sh, you will get
> Intel's mpirt in your path before Open MPI's bin directory.
>
> One workaround is to source the following _after_ you source the Intel
> Compiler's compilervars.sh in your start-up scripts:
> . /var/mpi-selector/data/openmpi_...sh
>
> -Tom
>
> >
> > On Apr 1, 2013, at 5:12 AM, Pradeep Jha wrote:
> >
> > > /opt/intel/composer_xe_2013.1.117/mpirt/bin/intel64/mpirun: line 96:
> > > /opt/intel/composer_xe_2013.1.117/mpirt/bin/intel64/mpivars.sh: No such
> > > file or directory
> >
> >
>
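
A minimal sketch of the kind of path check being discussed above,
assuming Open MPI's wrapper compilers (the install prefix below is only
illustrative):

which mpif90
mpif90 --showme
export PATH=/usr/lib/openmpi/bin:$PATH

Open MPI's mpif90 understands --showme and prints the underlying compile
line; if --showme is rejected, the mpif90 found first in the PATH is not
Open MPI's wrapper.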