Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-07-06 Thread Ralph Castain
On Jul 3, 2013, at 1:00 PM, Riccardo Murri wrote: > Hi Jeff, Ralph, > > first of all: thanks for your work on this! > > On 3 July 2013 21:09, Jeff Squyres (jsquyres) wrote: >> 1. The root cause of the issue is that you are assigning a >>

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-07-03 Thread Riccardo Murri
Hi Jeff, Ralph, first of all: thanks for your work on this! On 3 July 2013 21:09, Jeff Squyres (jsquyres) wrote: > 1. The root cause of the issue is that you are assigning a > non-existent IP address to a name. I.e., maps to 127.0.1.1, > but that IP address does not exist

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-07-03 Thread Jeff Squyres (jsquyres)
Ralph and I talked some more about this. Here's what we think: 1. The root cause of the issue is that you are assigning a non-existent IP address to a name. I.e., maps to 127.0.1.1, but that IP address does not exist anywhere. Hence, OMPI will never conclude that that is "local". If you
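To make the point above concrete, here is a minimal Python sketch (not Open MPI code) that separates two questions: whether an address falls inside the loopback block 127.0.0.0/8, and whether it is actually one of the addresses assigned to an interface. The interface list is hypothetical, chosen only for illustration; a real host would report its own.

```python
import ipaddress

# The IPv4 loopback block; 127.0.1.1 falls inside it.
LOOPBACK_NET = ipaddress.ip_network("127.0.0.0/8")


def classify(addr_text, interface_addrs):
    """Return (in_loopback_block, assigned_to_an_interface) for addr_text.

    interface_addrs is the list of addresses configured on this host.
    It is passed in explicitly so the sketch stays deterministic.
    """
    addr = ipaddress.ip_address(addr_text)
    in_loopback_block = addr in LOOPBACK_NET
    assigned = addr in {ipaddress.ip_address(a) for a in interface_addrs}
    return in_loopback_block, assigned


# Hypothetical interface addresses: the usual loopback plus one private address.
print(classify("127.0.1.1", ["127.0.0.1", "10.1.1.5"]))  # (True, False)
```

Under these assumptions 127.0.1.1 is a loopback-range address, yet an exact comparison against the configured interface addresses does not match it, which is consistent with the name never being judged "local".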

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-07-02 Thread Riccardo Murri
Hi, sorry for the delay in replying -- pretty busy week :-( On 28 June 2013 21:54, Jeff Squyres (jsquyres) wrote: > Here's what we think we know (I'm using the name "foo" instead of > your actual hostname because it's easier to type): > > 1. When you run "hostname", you get

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-28 Thread Jeff Squyres (jsquyres)
Ralph and I talked about this issue this afternoon. We're still struggling to understand the details of your configuration, in part because this thread was hijacked twice with issues unrelated to this 127.0.1.1 issue. Here's what we think we know (I'm using the name "foo" instead of your

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-26 Thread Ralph Castain
The root cause of the problem is that you are assigning your host name to the loopback device. This is rather unusual, but not forbidden. Normally, people would name that interface something like "localhost" since it cannot be used to communicate off-node. Doing it the way you have could cause

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-26 Thread Riccardo Murri
Hello, On 26 June 2013 03:11, Ralph Castain wrote: > I've been reviewing the code, and I think I'm getting a handle on > the issue. > > Just to be clear - your hostname resolves to the 127 address? And you are on > a Linux (not one of the BSD flavors out there)? Yes (but

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-25 Thread Ralph Castain
I'll ignore the rest of this thread as it kinda diverged from your original question. I've been reviewing the code, and I think I'm getting a handle on the issue. Just to be clear - your hostname resolves to the 127 address? And you are on a Linux (not one of the BSD flavors out there)? If the

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-22 Thread Riccardo Murri
On 20 June 2013 11:29, Riccardo Murri wrote: > However, I cannot reproduce the issue now Just to be clear: the "issue" in that mail refers to the OpenMPI SGE ras plugin not working with our version of SGE. The issue with 127.0.1.1 addresses is reproducible at will.

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-20 Thread Jeff Squyres (jsquyres)
Er... are you having problems with host IP addresses 127.0.1.1, or did you reply to the wrong thread? I thought you were asking about problems with multiple mpif90's in your PATH, etc. -- not 127.0.1.1 IP address issues. IIRC, there were a bunch of suggestions over on that thread about how

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-20 Thread Lorenzo Donà
Dear all, thank you to everyone who helped me. I compiled Open MPI following all the advice you posted, but the error is always the same. I am also able to run the examples shipped with the package, but I really don't know what to do to solve the problem. I trust you can help me. Sincerely, Lorenzo. On

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-20 Thread Riccardo Murri
On 19 June 2013 23:52, Reuti wrote: > Am 19.06.2013 um 22:14 schrieb Riccardo Murri: > >> On 19 June 2013 20:42, Reuti wrote: >>> Am 19.06.2013 um 19:43 schrieb Riccardo Murri : >>> On 19 June 2013 16:01, Ralph

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-20 Thread Riccardo Murri
On 20 June 2013 06:33, Ralph Castain wrote: > Been trying to decipher this problem, and think maybe I'm beginning to > understand it. Just to clarify: > > * when you execute "hostname", you get the .local response? Yes: [rmurri@nh64-2-11 ~]$ hostname nh64-2-11.local

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-20 Thread Ralph Castain
Been trying to decipher this problem, and think maybe I'm beginning to understand it. Just to clarify: * when you execute "hostname", you get the .local response? * you somewhere have it set up so that 10.x.x.x resolves to , with no ".local" extension? Correct? On Wed, Jun 19, 2013 at 1:17

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Reuti
Am 19.06.2013 um 22:14 schrieb Riccardo Murri: > On 19 June 2013 20:42, Reuti wrote: >> Am 19.06.2013 um 19:43 schrieb Riccardo Murri : >> >>> On 19 June 2013 16:01, Ralph Castain wrote: How is OMPI picking up this

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Riccardo Murri
On 19 June 2013 20:42, Reuti wrote: > Am 19.06.2013 um 19:43 schrieb Riccardo Murri : > >> On 19 June 2013 16:01, Ralph Castain wrote: >>> How is OMPI picking up this hostfile? It isn't being specified on the cmd >>> line -

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Reuti
Am 19.06.2013 um 19:43 schrieb Riccardo Murri : > On 19 June 2013 16:01, Ralph Castain wrote: >> How is OMPI picking up this hostfile? It isn't being specified on the cmd >> line - are you running under some resource manager? > > Via the environment

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Ralph Castain
Hmmm... certainly sounds like a bug. It should pick up that the node is local. It checks the hostname (as returned by gethostname), but it also checks to see if host resolves to a local address. I'm assuming that the offending host has some other address besides just 127.0.1.1 as otherwise it
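The two checks described above can be sketched in Python. This is only an illustration of the logic as stated in the message, not the actual Open MPI implementation. The function name `is_local_host`, the host name "foo", and the address set are all hypothetical; the resolver is injectable so the example does not depend on the machine it runs on.

```python
import socket


def is_local_host(name, local_hostname, local_addrs, resolve=socket.gethostbyname):
    """Return True if `name` appears to refer to this machine.

    local_hostname: the name reported by gethostname() on this machine.
    local_addrs:    set of IPv4 address strings assigned to local interfaces.
    resolve:        name -> address function (defaults to the system resolver).
    """
    # Check 1: direct comparison with the name returned by gethostname().
    if name == local_hostname:
        return True
    # Check 2: resolve the name and test whether the result is a local address.
    try:
        addr = resolve(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return addr in local_addrs


# Hypothetical setup: gethostname() gives "foo.local", the hostfile lists the
# unqualified name "foo", and "foo" resolves to 127.0.1.1.
local_addrs = {"127.0.0.1", "10.1.1.5"}
print(is_local_host("foo", "foo.local", local_addrs, resolve=lambda n: "127.0.1.1"))  # False
```

With these assumed values both checks fail: the names differ, and 127.0.1.1 is not among the local interface addresses, so the host is not recognized as local.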

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Riccardo Murri
On 19 June 2013 16:01, Ralph Castain wrote: > How is OMPI picking up this hostfile? It isn't being specified on the cmd > line - are you running under some resource manager? Via the environment variable `OMPI_MCA_orte_default_hostfile`. We're running under SGE, but disable

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Ralph Castain
How is OMPI picking up this hostfile? It isn't being specified on the cmd line - are you running under some resource manager? I haven't seen that confusion elsewhere, so I'm trying to understand which code path is involved - hence the questions. On Jun 19, 2013, at 6:26 AM, Riccardo Murri

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Riccardo Murri
Hi, (colleague of OP here) On 19 June 2013 15:09, Ralph Castain wrote: > I don't see a hostfile on your command line - so I assume you are using a > default hostfile? What is in it? The hostfile comes from the batch system; it just contains the unqualified host names: $

Re: [OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Ralph Castain
I don't see a hostfile on your command line - so I assume you are using a default hostfile? What is in it? On Jun 19, 2013, at 1:49 AM, Sergio Maffioletti wrote: > Hello, > > we have been observing a strange behavior with OpenMPI 1.6.3 > > strace -f

[OMPI users] openmpi 1.6.3 fails to identify local host if its IP is 127.0.1.1

2013-06-19 Thread Sergio Maffioletti
Hello, we have been observing a strange behavior with OpenMPI 1.6.3 strace -f /share/apps/openmpi/1.6.3/bin/mpiexec -n 2 --nooversubscribe --display-allocation --display-map --tag-output /share/apps/gamess/2011R1/gamess.2011R1.x /state/partition1/rmurri/29515/exam01.F05 -scr