On 13.11.2014 at 17:14, Ralph Castain wrote:

> Hmmm…I’m beginning to grok the issue. It is a tad unusual for people to 
> assign different hostnames to their interfaces - I’ve seen it in the Hadoop 
> world, but not in HPC. Still, no law against it.

Maybe it depends on one's background whether to do it this way. At some point 
in the past I read this HOWTO:

https://arc.liv.ac.uk/SGE/howto/multi_intrfcs.html

and appreciated the idea of routing different services over different 
interfaces - this way a large file copy won't hurt the MPI communication. 
Since SGE handles contacting the qmaster or execds on the correct interface of 
each machine (which might be eth0, eth1, or any other one), I have been doing 
it this way for a decade now, and judging by the mails on the SGE lists others 
are doing it too. Hence I don't find it that unusual.
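Just to illustrate the setup (the names and addresses below are only an 
example): every interface of a node gets its own hostname in /etc/hosts, e.g.

192.168.151.101   node01        # eth0: SGE and MPI traffic
192.168.154.101   node01-eth1   # eth1: NFS and file staging

and SGE's host_aliases file can map the extra names back to the one SGE should 
use, so qmaster and execd always talk over the intended interface.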


> This will take a little thought to figure out a solution. One problem that 
> immediately occurs is if someone includes a hostfile that has lines which 
> refer to the same physical server, but using different interface names.

Yes, I see this point too. Therefore I had the idea to list all the interfaces 
one wants to use on a single line. If someone puts them on different lines, 
they would be doing it the wrong way - their fault. One line = one machine, 
unless the list of interfaces is exactly the same on multiple lines; then the 
slot counts could be added up just as they are now.
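To sketch the idea (purely hypothetical hostfile syntax, nothing mpiexec 
accepts today):

node01,node01-eth1 slots=4
node02,node02-eth1 slots=4
node02,node02-eth1 slots=4   <- identical interface list: just adds up to 8 slots on node02

A line with a different interface list for node02 would be the user's fault 
and could be rejected.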

(Under SGE there is the [now correctly working] setup in which the same machine 
can appear several times because its slots originate from several queues. But 
this would still fit the above interpretation: the interface name is the same, 
so although the entries come from different queues they can just be added up, 
as the GridEngine MCA component does now.)


> We’ll think those are completely distinct servers, and so the process 
> placement will be totally messed up.
> 
> We’ll also encounter issues with the daemon when it reports back, as the 
> hostname it gets will almost certainly differ from the hostname we were 
> expecting. Not as critical, but need to check to see where that will impact 
> the code base

Hence I prefer to use eth0 for Open MPI (for now). But I remember that at one 
time it could be set up to route the MPI traffic exclusively over eth1, 
although that was for MPICH(1):

https://arc.liv.ac.uk/SGE/howto/mpich-integration.html => "Wrong interface 
selected for the back channel of the MPICH-tasks with the ch_p4-device"
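In the meantime one can also pin both the OOB and the TCP BTL to eth0 by hand, 
e.g. (just a sketch, assuming eth0 is the cluster-internal interface on every 
node):

mpiexec --mca oob_tcp_if_include eth0 --mca btl_tcp_if_include eth0 -np 4 ./a.out

The same two parameters could of course go into openmpi-mca-params.conf so 
that users don't have to type them on every invocation.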


> We can look at the hostfile changes at that time - no real objection to them, 
> but would need to figure out how to pass that info to the appropriate 
> subsystems. I assume you want this to apply to both the oob and tcp/btl?

Yes.


> Obviously, this won’t make it for 1.8 as it is going to be fairly intrusive, 
> but we can probably do something for 1.9
> 
>> On Nov 13, 2014, at 4:23 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>> 
>> On 13.11.2014 at 00:34, Ralph Castain wrote:
>> 
>>>> On Nov 12, 2014, at 2:45 PM, Reuti <re...@staff.uni-marburg.de> wrote:
>>>> 
>>>> On 12.11.2014 at 17:27, Reuti wrote:
>>>> 
>>>>> On 11.11.2014 at 02:25, Ralph Castain wrote:
>>>>> 
>>>>>> Another thing you can do is (a) ensure you built with --enable-debug, and 
>>>>>> then (b) run it with -mca oob_base_verbose 100 (without the 
>>>>>> tcp_if_include option) so we can watch the connection handshake and see 
>>>>>> what it is doing. The --hetero-nodes will have no effect here and can be 
>>>>>> ignored.
>>>>> 
>>>>> Done. It really tries to connect to the outside interface of the 
>>>>> headnode. But whether there is a firewall or not: the nodes have no clue 
>>>>> how to reach 137.248.0.0 - they have no gateway to this network at all.
>>>> 
>>>> I have to revert this. They think there is a gateway although there isn't 
>>>> one. When I remove the gateway entry from the routing table by hand, it 
>>>> starts up instantly too.
>>>> 
>>>> While I can do this on my own cluster, I still have the 30-second delay on 
>>>> a cluster where I'm not root, though that may be because of the firewall 
>>>> there. The gateway on that cluster does indeed lead to the outside world.
>>>> 
>>>> Personally I find this behavior of using all interfaces a little bit too 
>>>> aggressive. If you don't check this carefully beforehand and start a 
>>>> long-running application, you might not even notice the delay during startup.
>>> 
>>> Agreed - do you have any suggestions on how we should choose the order in 
>>> which to try them? I haven’t been able to come up with anything yet. Jeff 
>>> has some fancy algo in his usnic BTL that we are going to discuss after SC 
>>> that I’m hoping will help, but I’d be open to doing something better in the 
>>> interim for 1.8.4
>> 
>> A plain `mpiexec` should just use the interface it finds specified in the 
>> hostfile, be it hand-crafted or prepared by a queuing system.
>> 
>> 
>> Option: could a single entry for a machine in the hostfile contain a list of 
>> interfaces? I mean something like:
>> 
>> node01,node01-extra-eth1,node01-extra-eth2 slots=4
>> 
>> or
>> 
>> node01* slots=4
>> 
>> Meaning: use exactly these interfaces, or even try to find all available 
>> interfaces on/between the machines.
>> 
>> If all interfaces share the same name, then it's up to the admin to 
>> correct this.
>> 
>> -- Reuti
>> 
>> 
>>>> -- Reuti
>>>> 
>>>> 
>>>>> It tries this regardless of whether the internal or external name of the 
>>>>> headnode is given in the machinefile - I hit ^C then. I attached the 
>>>>> output of Open MPI 1.8.1 for this setup too.
>>>>> 
>>>>> -- Reuti
>>>>> 
>>>>> <openmpi1.8.3.txt><openmpi1.8.1.txt>
