Don't forget that network topologies can also be complex -- it's not always a 
simple, single-path hierarchy.  There can be multiple paths between any pair of 
hosts on the network.  Sometimes the hosts are aware of the multiple paths, 
sometimes they are not (e.g., sometimes the fabric routing changes during the 
course of a single MPI job, and the hosts/MPI applications are unaware).

Meaning: the information about which network paths are taken for a given 
host-A-to-host-B traversal may be both distributed and transient.


On Aug 14, 2019, at 11:05 AM, Rigel Falcao do Couto Alves 
<rigel.al...@tu-dresden.de> wrote:

Hi,

I am doing a PhD in performance analysis of highly parallel CFD codes and would 
like to suggest a feature for Netloc: based on the topic "Build Scotch 
sub-architectures" (at 
https://www.open-mpi.org/projects/hwloc/doc/v2.0.3/a00329.php), provide a 
function version of netloc_get_resources that could retrieve, at runtime, the 
network details of the available cluster resources (i.e. the nodes allocated to 
the job). I am mostly interested in how many switches (the gray circles in 
the figure below) need to be traversed in order for any pair of allocated nodes 
to communicate with each other:

<netloc_draw (3).png>

For example, suppose my job is running within 4 nodes in the cluster, 
illustrated by the numbers above. All I would love to get from Netloc - at 
runtime - is some sort of classification of the nodes, like:

1: aa
2: ab
3: ba
4: ca

The difference between nodes 1 and 2 is in the last character, which means their 
MPI communications only need to traverse 1 switch; however, between any of them 
and nodes 3 or 4, the difference starts at the second-to-last character, which 
means their communications need to traverse two switches. More characters may be 
prepended to the string as needed, e.g. if the central gray circle in the 
above figure is connected to another switch, which in turn leads to another part 
of the cluster's structure (with its own switches, nodes etc.). For me, it is 
at present irrelevant whether e.g. nodes 1 and 2 are physically - or 
logically - consecutive to each other: a, b, c etc. would be just arbitrary 
identifiers.
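
As a rough illustration of the labelling scheme described above, the following 
Python sketch derives the switch count for each pair of nodes from their 
labels. It is not part of Netloc's API: the function name switches_between is 
hypothetical, and the sketch assumes that all labels have the same length and 
that the counting rule is exactly the one stated in this message (the count 
equals the number of characters from the first differing position to the end 
of the label).

```python
from itertools import combinations


def switches_between(label_a: str, label_b: str) -> int:
    """Return the switch count implied by two node labels.

    Hypothetical helper, not a Netloc function. Assumes both labels
    have the same length; the count is the number of characters from
    the first differing position to the end of the label.
    """
    if len(label_a) != len(label_b):
        raise ValueError("labels must have the same length")
    for index, (char_a, char_b) in enumerate(zip(label_a, label_b)):
        if char_a != char_b:
            return len(label_a) - index
    # Identical labels: same node, no switch to traverse.
    return 0


# Node labels taken from the example in this message.
labels = {1: "aa", 2: "ab", 3: "ba", 4: "ca"}

for node_a, node_b in combinations(sorted(labels), 2):
    count = switches_between(labels[node_a], labels[node_b])
    print(f"nodes {node_a} and {node_b}: {count} switch(es)")
```

Such a count is only meaningful on a single-path hierarchy whose routing does 
not change during the job; a fabric with several paths between two hosts would 
need more than one label per node.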

I would then use this data to plot the process placement, using open-source 
tools developed here at the University of Dresden (Germany); i.e. Scotch is not 
an option for me. The results of my study will be open-source as well and I can 
gladly share them with you once the thesis is finished.

I hope I have clearly explained what I have in mind; please let me know if 
there are any questions. Finally, it is important that this feature is part of 
Netloc's API (as it is supposed to be integrated with the tools we develop 
here), works at runtime and doesn't require root privileges (as those tools are 
used by our cluster's customers in their everyday job submissions).

Kind regards,


--
Dipl.-Ing. Rigel Alves
researcher

Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
Zellescher Weg 12 A 218, 01069 Dresden | Germany

+49 (351) 463.42418
https://tu-dresden.de/zih/die-einrichtung/struktur/rigel-alves


_______________________________________________
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users


--
Jeff Squyres
jsquy...@cisco.com


