Hi,

> On 27.01.2015 at 02:15, Sangmin Park <[email protected]> wrote:
> 
> We have three HPC systems called A, B, and C, all of which are accessible
> through the login node. SGE is installed on the login node.
> HPC systems A and B each consist of a master node and computing nodes,
> connected by gigabit Ethernet. HPC system C, however, has the ideal
> configuration: it is wired with an InfiniBand network, not Ethernet.
> 
> Each HPC system has two kinds of network: one for management, using gigabit
> Ethernet, and another for computing, using gigabit Ethernet for systems A
> and B and InfiniBand for system C.
> SGE uses the management network.
> 
> Problems arise on HPC system C. SGE uses the management network.
> So when a user submits a job using SGE, the job may use the gigabit network,
> not the InfiniBand network.

This is not necessarily related to SGE.

- SGE can be instructed to use any TCP/IP-based connection, be it eth1 or
any other interface:

https://arc.liv.ac.uk/SGE/howto/multi_intrfcs.html
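
As a rough sketch of what that howto boils down to: SGE reads a host_aliases
file in $SGE_ROOT/$SGE_CELL/common/, where the first name on each line is the
unique name SGE uses for the host and the remaining names are treated as
aliases of it. The hostnames below are hypothetical, and the comment lines are
only there for illustration (drop them if your version does not accept them):

```
# $SGE_ROOT/$SGE_CELL/common/host_aliases
# Hypothetical names: node01 = management interface, node01-ib = InfiniBand.
# First entry is the name SGE uses; the following entries are aliases.
node01 node01-ib
node02 node02-ib
```

This only controls which name SGE itself uses for a host; it does not decide
which network the MPI traffic of a job takes.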

But for simplicity I tended to route this low-volume traffic over eth0, the
same interface MPI was supposed to run on (in former times the MPI libraries
simply used the given hostname). File transfer by NFS was then done over
eth1, which was easy to set up via export and mount.

Nowadays, however, this depends heavily on the MPI library in use. While I
always configured the nodes to have a unique name per interface, the Open MPI
library, for example, tries to cope with the situation where all interfaces
share the same name: it performs some kind of interface/network scan to find
all possible routes between the granted nodes and afterwards uses a fixed
distribution of the traffic among them. So it might use both the IB and the
Gigabit network and split the traffic.

What parallel library do you intend to use? It is best to ask on that
library's mailing list how to adjust the startup so that the application uses
IB (and only IB) after its initial startup.
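
Purely as an illustration, and assuming the library turns out to be Open MPI:
its transports are selected through MCA parameters, which can be set in a
per-user file. Excluding the TCP transport should keep MPI traffic off the
Gigabit network, so a job fails instead of silently falling back to Ethernet.
Which IB transport is then picked depends on the Open MPI version and how it
was built, so please treat this as a sketch and confirm it with the Open MPI
list:

```
# $HOME/.openmpi/mca-params.conf
# Assumption: Open MPI is the MPI library in use on system C.
# Exclude the TCP BTL so that MPI traffic cannot go over Ethernet.
btl = ^tcp
```

The same parameter can be given per job on the command line as
"mpirun --mca btl ^tcp ...", which is the safer choice if the home directory
is shared with the Ethernet-only systems A and B.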

I would appreciate it if you could post your findings here.

-- Reuti


> To use the InfiniBand network, SGE has to work with the InfiniBand network
> in all cases.
> 
> Does SGE work well in heterogeneous network systems?
> 
> -Sangmin
> _______________________________________________
> users mailing list
> [email protected]
> https://gridengine.org/mailman/listinfo/users


