Don't forget that network topologies can also be complex -- it's not always a simple, single-path hierarchy. There can be multiple paths between any pair of hosts on the network. Sometimes the hosts are aware of the multiple paths, sometimes they are not (e.g., sometimes the fabric routing changes during the course of a single MPI job, and the hosts/MPI applications are unaware).
Meaning: the information about which network paths are taken for a given host-A-to-host-B traversal may be both distributed and transient.

On Aug 14, 2019, at 11:05 AM, Rigel Falcao do Couto Alves <rigel.al...@tu-dresden.de> wrote:

Hi,

I am doing a PhD in performance analysis of highly parallel CFD codes and would like to suggest a feature for Netloc: from topic Build Scotch sub-architectures (at https://www.open-mpi.org/projects/hwloc/doc/v2.0.3/a00329.php), create a function version of netloc_get_resources, which could retrieve at runtime the network details of the available cluster resources (i.e. the nodes allocated to the job). I am mostly interested in how many switches (the gray circles in the figure below) need to be traversed in order for any pair of allocated nodes to communicate with each other:

<netloc_draw (3).png>

For example, suppose my job is running on 4 nodes in the cluster, illustrated by the numbers above. All I would love to get from Netloc - at runtime - is some sort of classification of the nodes, like:

1: aa
2: ab
3: ba
4: ca

The difference between nodes 1 and 2 is in the last digit, which means their MPI communications only need to traverse 1 switch; however, between either of them and nodes 3 or 4, the difference starts at the second-last digit, which means their communications need to traverse two switches. More digits may be added on the left of the string as necessary, e.g. if the central gray circle in the above figure is connected to another switch, which in turn leads to another part of the cluster's structure (with its own switches, nodes etc.). For me, it is at present irrelevant whether e.g. nodes 1 and 2 are physically - or logically - consecutive to each other: a, b, c etc. would be just arbitrary identifiers.

I would then use this data to plot the process placement, using open-source tools developed here at the University of Dresden (Germany); i.e. Scotch is not an option for me.
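A minimal Python sketch of the labeling scheme described above. This is not Netloc code: the function names `switches_traversed` and `pairwise_switches` are hypothetical, and the labels are the example ones from this message. It assumes all labels have the same length and counts switches the way the message does (two labels that first differ at the k-th-last character are k switches apart).

```python
from itertools import combinations


def switches_traversed(label_a: str, label_b: str) -> int:
    """Switch count between two nodes under the proposed labeling.

    Hypothetical helper: labels that first differ at the k-th-last
    character are taken to be k switches apart; identical labels give 0.
    """
    if len(label_a) != len(label_b):
        raise ValueError("labels must have the same length")
    for i, (a, b) in enumerate(zip(label_a, label_b)):
        if a != b:
            # First differing position, counted from the right.
            return len(label_a) - i
    return 0


def pairwise_switches(labels: dict) -> dict:
    """Map each unordered pair of node ids to its switch count."""
    return {
        (m, n): switches_traversed(labels[m], labels[n])
        for m, n in combinations(sorted(labels), 2)
    }


# Example labels from the message (arbitrary identifiers).
labels = {1: "aa", 2: "ab", 3: "ba", 4: "ca"}

for pair, hops in pairwise_switches(labels).items():
    print(pair, hops)
```

With the example labels, nodes 1 and 2 come out 1 switch apart and every other pair 2 switches apart, matching the description. As the reply notes, a static label like this cannot capture multiple or changing routes between the same pair of hosts.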
The results of my study will be open-source as well, and I can gladly share them with you once the thesis is finished.

I hope I have clearly explained what I have in mind; please let me know if there are any questions.

Finally, it is important that this feature is part of Netloc's API (as it is supposed to be integrated with the tools we develop here), works at runtime, and doesn't require root privileges (as those tools are used by our cluster's customers in their everyday job submissions).

Kind regards,

--
Dipl.-Ing. Rigel Alves
researcher
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
Zellescher Weg 12
A 218, 01069 Dresden | Germany
+49 (351) 463.42418
https://tu-dresden.de/zih/die-einrichtung/struktur/rigel-alves
_______________________________________________
hwloc-users mailing list
hwloc-users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/hwloc-users

--
Jeff Squyres
jsquy...@cisco.com