Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Bogdan Costescu

On Fri, 31 Mar 2006, Christian Kauhaus wrote:

> So the resolver already does the complicated work for us, since it
> returns all addresses associated to a given target (hostname or
> IP-addr notation) in the order of decreasing preference.
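
For illustration, a minimal Python sketch of asking the resolver for every address of a target and keeping the order it returns. The helper name `resolve_candidates` is made up for this example; only `socket.getaddrinfo` is a real call, and OpenMPI itself would use the C equivalent.

```python
import socket

def resolve_candidates(host, port):
    """Return (family, address) pairs for host in resolver order, without duplicates."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    seen = []
    # Asking for SOCK_STREAM only avoids one extra entry per socket type.
    for family, _type, _proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        entry = (names.get(family, str(family)), sockaddr[0])
        if entry not in seen:
            seen.append(entry)
    return seen
```

Which family comes first for a dual-stack hostname depends on the resolver configuration of the machine running the lookup, so the result can differ from host to host.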


What you propose here should work for the case of a single BTL that 
handles both IPv4 and IPv6. How about the case of 2 BTLs ? (as it's 
not clear to me from the rest of the discussion if one solution is 
better than the other)



> Now we also have two different protocols on each interface.


This could theoretically happen in other situations as well. For 
example, it's possible to set up TCP/IPv4 (and I guess even v6) over 
Myrinet at the same time with GM over Myrinet, which also brings it to 
2 (or even 3) protocols over the same physical connection. So how 
should these situations be handled ? (this is more of a general 
question, not related to IPv6 implementation).



> - We introduce another parameter, which allows an IP version selection
>   both globally and on a per-interface basis. Something like:
>   IPv4-only / prefer IPv4 / auto (resolver) / prefer IPv6 / IPv6-only
>
> The third approach would possibly be the cleanest one.


I also like it, with emphasis on "both globally and per-interface".
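
To make the five settings concrete, here is a minimal Python sketch of how such a selection could act on the resolver's address list. Everything in it is hypothetical: the policy strings, `apply_policy` and `policy_for` are names invented for this example and are not existing OpenMPI parameters.

```python
import socket

# Hypothetical policy names mirroring the five settings proposed above.
POLICIES = ("ipv4-only", "prefer-ipv4", "auto", "prefer-ipv6", "ipv6-only")

def apply_policy(candidates, policy="auto"):
    """Filter or reorder (family, address) pairs according to an IP version policy."""
    if policy not in POLICIES:
        raise ValueError("unknown policy: %r" % (policy,))
    v4 = [c for c in candidates if c[0] == socket.AF_INET]
    v6 = [c for c in candidates if c[0] == socket.AF_INET6]
    if policy == "ipv4-only":
        return v4
    if policy == "ipv6-only":
        return v6
    if policy == "prefer-ipv4":
        return v4 + v6
    if policy == "prefer-ipv6":
        return v6 + v4
    return list(candidates)  # "auto": keep the resolver's order untouched

def policy_for(interface, global_policy="auto", per_interface=None):
    """A per-interface setting, if present, overrides the global one."""
    return (per_interface or {}).get(interface, global_policy)
```

With this shape, the "prefer" settings only reorder the list, so a fallback to the other family is still possible, while the "only" settings remove that fallback entirely.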

> Since it is standard behaviour for every IPv6 app to try all known
> addresses for the target host until any one succeeds, we are also
> able to connect to an IPv6-enabled host where the target daemon does
> not listen on an IPv6 interface.


Err, it's not OpenMPI, but the rsh/ssh client that tries the 
connections. My point however is a bit different as it also relates to 
the authentication behind the connection, where the IP (and therefore 
its flavour) which is used for making the connection counts:


- if you pass an IPv6 address to a non-IPv6 aware rsh/ssh client, the 
connection will fail. So the upper level which executes the rsh/ssh 
client would need to handle the fallback to the different addresses. 
OTOH, if the rsh/ssh client is IPv6 aware, it might already try them,
which will increase the time needed to make (or to decide that it is
not possible to make) the connection.


- if you try to connect over IPv6 to an ssh daemon that only has
hostbased auth. configured for IPv4 addresses (or vice versa), the
hostbased auth. will fail (and ssh will most likely ask for a password).
This is quite likely to happen unknowingly if the distribution (like
RHEL and Fedora) turns on IPv6 by default and assigns link-local
addresses, resulting in a working IPv6 configuration on the local
network - which would apply to most clusters - without the admin
having to do anything about it (guess how I found this out ;-))


> For example, we ran several weeks without an IPv6-enabled rsh, which
> is used to handle MPI job startup on the cluster, without any
> problems.


What do you mean by "IPv6-enabled rsh" ? Was it the daemon, client or 
both ?


--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: bogdan.coste...@iwr.uni-heidelberg.de


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Christian Kauhaus
Bogdan Costescu :
>What you propose here should work for the case of a single BTL that 
>handles both IPv4 and IPv6. How about the case of 2 BTLs ? (as it's 
>not clear to me from the rest of the discussion if one solution is 
>better than the other)

The introduction of a new 'tcp6' BTL was mentioned several times in the
discussion. This would result in an enormous amount of duplicated code,
since the IPv4->IPv6 transition would only affect a small fraction of
the total tcp BTL codebase. This is clearly a violation of the DRY
principle (don't repeat yourself).
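
As a rough illustration of how little differs between the two families, here is a Python sketch of a listener setup where only the wildcard address and one socket option depend on the family. `open_listener` is a name invented for this example; it is not taken from the tcp BTL code.

```python
import socket

def open_listener(family, port=0, backlog=16):
    """Create a listening TCP socket; the code path is shared by both families."""
    wildcard = {socket.AF_INET: "0.0.0.0", socket.AF_INET6: "::"}[family]
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        if family == socket.AF_INET6:
            # Keep the IPv6 listener from also claiming IPv4 traffic on this port.
            sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 1)
        sock.bind((wildcard, port))
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock
```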

>Myrinet at the same time with GM over Myrinet, which also brings it to 
>2 (or even 3) protocols over the same physical connection. So how 
>should these situations be handled ? (this is more of a general 
>question, not related to IPv6 implementation).

Very good point. Since we do not have Myrinet or IB in our institute
clusters, I cannot tell. Experiences and suggestions from others?

>Err, it's not OpenMPI, but the rsh/ssh client that tries the 
>connections. My point however is a bit different as it also relates to 
>the authentication behind the connection, where the IP (and therefore 
>its flavour) which is used for making the connection counts:

When using rsh for startup, any interference between IPv6 and rsh/ssh is
beyond the scope of OpenMPI. I really don't know which part of the
OpenMPI code base is able to influence how rsh will authenticate some
addresses. If rsh/ssh cannot handle or authenticate IPv6 connections,
the admin must keep the IPv6 addresses out of the resolver, so that
getaddrinfo() never returns an IPv6 address. That's it.

>> For example, we ran several weeks without an IPv6-enabled rsh, which 
>What do you mean by "IPv6-enabled rsh" ? Was it the daemon, client or 
>both ?

We had both client and server IPv6-enabled and IPv6 addresses in the
resolver, but the server did not listen on the IPv6 socket, resulting in
a connection failure for IPv6. The subsequent connection attempt with the
IPv4 address did succeed though. 
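
A minimal Python sketch of that try-each-address behaviour, assuming the candidates come from a `getaddrinfo()`-style lookup. This is not the rsh client's actual code, and `connect_first` is a made-up helper name.

```python
import socket

def connect_first(candidates, timeout=2.0):
    """Try each getaddrinfo()-style candidate in order; return the first connected socket."""
    last_error = None
    for family, socktype, proto, _canon, sockaddr in candidates:
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock
        except OSError as exc:
            # For example "connection refused" because nothing listens on this address.
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("no addresses to try")
```

It could be called as `connect_first(socket.getaddrinfo("node01", 514, type=socket.SOCK_STREAM))`, where the host name and port are placeholders. Note that every failed attempt costs time, up to `timeout` seconds when packets are silently dropped.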

 Christian

-- 
Dipl.-Inf. Christian Kauhaus   <><
Lehrstuhl fuer Rechnerarchitektur und -kommunikation 
Institut fuer Informatik * Ernst-Abbe-Platz 1-2 * D-07743 Jena
Tel: +49 3641 9 46376  *  Fax: +49 3641 9 46372   *  Raum 3217


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Christian Kauhaus
Ralph Castain :
>   Actually, we have some sensor network folks that are interested in
>   using OpenRTE for their applications. Their platforms can be small
>   microprocessors, many with custom mini-operating systems. Almost
>   none support IPv6 nor have any knowledge of that protocol.

I see. Do you think that SUSv2[1] would be a good starting point? This
means that we can rely on getaddrinfo() but not on sockaddr_in6 and
friends. Is this ok?

Christian

[1] http://www.opengroup.org/onlinepubs/007908799/


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Ralph Castain




I think that would be okay - certainly makes a good starting point! If
it becomes an issue later, we can revisit at that time.

Thanks
Ralph


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Bogdan Costescu

On Mon, 3 Apr 2006, Christian Kauhaus wrote:

> This would result in an enormous amount of duplicated code, since
> the IPv4->IPv6 transition would only affect a small fraction of the
> total tcp BTL codebase. This is clearly a violation of the DRY
> principle (don't repeat yourself).


IMHO code can simply be shared and only the really different part 
should be made independent. This is more a question of whether the 
build system would allow such a scheme and of the runtime behaviour 
(for static linking only one copy of the common part should be linked, 
for dynamic loading maybe some module dependency to load the common 
code only once could achieve the same result).


> If rsh/ssh cannot handle or authenticate IPv6 connections, the admin
> must keep the IPv6 addresses out of the resolver, so that
> getaddrinfo() never returns an IPv6 address. That's it.


I beg to disagree. In a setup like the one mentioned, after orted is
started via an IPv4-only rsh/ssh, OpenMPI applications could use IPv6
without problems, just like they could use e.g. GM if Myrinet cards
were present. I see this very much like your past experience with
the non-IPv6 rsh - it worked for you because the rsh client 
automatically tried the IPv4 address after the IPv6 failure, but the 
ssh client might not (be able to) do this. Please note that this is 
also a matter of homogeneity of the cluster that you can't know in 
advance, before starting the daemons - each host (including the one 
where the rsh/ssh clients are run) can have its own level of IPv6 
awareness.


On a side note, I think that the discussion can also be extended to 
the batch/queueing systems that might be used to start the OpenMPI job 
and would pass a list of machines to OpenMPI. If the machines are 
given as IPs (either v4 or v6), OpenMPI should probably assume that 
the address as given can be passed further to the underlying mechanism 
for starting the job (for example, for SGE this would be its own rsh 
client, not the system rsh client); but how about machines given as 
names ?
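
One way to tell the two cases apart is to try parsing each entry as a numeric address and to treat everything else as a name that still needs a resolver lookup. A minimal Python sketch, with `classify_host_entry` being a name invented for this example:

```python
import ipaddress

def classify_host_entry(entry):
    """Return 'ipv4', 'ipv6' or 'name' for one host entry from a machine list."""
    text = entry.strip()
    try:
        # Drop an IPv6 zone id such as '%eth0' before parsing.
        address = ipaddress.ip_address(text.split("%", 1)[0])
    except ValueError:
        return "name"
    return "ipv4" if address.version == 4 else "ipv6"
```

Entries classified as 'ipv4' or 'ipv6' could be handed to the starter unchanged, while 'name' entries are the open question raised above.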


--
Bogdan Costescu



Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Ralf Wildenhues
* Bogdan Costescu wrote on Mon, Apr 03, 2006 at 05:34:56PM CEST:
> 
> IMHO code can simply be shared and only the really different part 
> should be made independent. This is more a question of whether the 
> build system would allow such a scheme and of the runtime behaviour 
> (for static linking only one copy of the common part should be linked, 
> for dynamic loading maybe some module dependency to load the common 
> code only once could achieve the same result).

Yes, that should be possible; it would be most portable if it can be
coded without circular symbol references.

Cheers,
Ralf


Re: [OMPI devel] IPv6 support in OpenMPI?

2006-04-03 Thread Christian Kauhaus
Bogdan Costescu :
>I beg to disagree. In a setup like the one mentioned, after orted is
>started via an IPv4-only rsh/ssh, OpenMPI applications could use IPv6
>without problems, just like they could use e.g. GM if Myrinet cards
>were present. I see this very much like your past experience with

Ok, I slowly get the point... ;-)

I see two possibilities concerning machinefile contents and how this
affects both rsh/ssh and OpenMPI.

1. Put IPv[46] addresses into the machinefile. Since they are
   protocol-specific, rsh/ssh uses them just the way they are.
   OpenMPI performs interface discovery, possibly getting more
   addresses. The use of these addresses is controlled via runtime
   configuration.

2. Put hostnames into the machinefile. Both rsh/ssh and OpenMPI perform
   their own resolver lookup. When the resolver library gives both
   addresses (IPv4 and IPv6), which one to use is handled via OpenMPI
   runtime configuration. What rsh/ssh does is up to them. Note that you
   can specify an IP protocol selection at least with ssh (-4/-6
   cmdline switches).

Your setup (IPv6 addresses given by the resolver, but no IPv6-aware ssh)
could be handled in both ways: either by putting numeric IPv4-addresses
into the machinefile, or by specifying 'ssh -4'.
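
A minimal Python sketch of the second variant, building the starter command line with an optional protocol switch. `launcher_argv` is a hypothetical helper and the remote command is only a placeholder; the `-4` and `-6` options are the real OpenSSH client switches that force IPv4 or IPv6.

```python
def launcher_argv(host, remote_command, ip_version=None, agent="ssh"):
    """Build the argument list for starting a remote command over ssh."""
    flags = {None: [], 4: ["-4"], 6: ["-6"]}
    if ip_version not in flags:
        raise ValueError("ip_version must be None, 4 or 6")
    # The switch goes before the host so that ssh parses it as its own option.
    return [agent] + flags[ip_version] + [host] + list(remote_command)
```

For example, `launcher_argv("node01", ["hostname"], ip_version=4)` yields `["ssh", "-4", "node01", "hostname"]`, which could be passed to `subprocess.run`. A plain rsh client may not accept these switches, so this only covers the ssh case.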

Is this a solution?

 Christian
