Re: [OMPI devel] HWLOC duplication relief

2021-02-04 Thread Brice Goglin via devel
The text looks correct to me. I don't have any better suggestion for now. I am thinking about adding an adopt() flag to say "adopt it, or give me a pointer to the already adopted one", but it's not clear to me how to implement this safely. I opened a hwloc issue to discuss the details of making
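
The "adopt it, or give me a pointer to the already adopted one" semantics mentioned above can be sketched roughly as follows. This is only an illustration of the idea, not hwloc code: `topo_t`, `do_adopt_once()` and `adopt_or_get()` are hypothetical names, and the real adoption step is replaced by a stub. It assumes a single process-wide slot guarded by a mutex.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a topology handle; NOT the real hwloc_topology_t. */
typedef struct { int id; } topo_t;

static pthread_mutex_t adopt_lock = PTHREAD_MUTEX_INITIALIZER;
static topo_t *adopted = NULL;  /* process-wide: the one adopted topology */
static int adopt_calls = 0;     /* counts how often the adoption step ran */

/* Hypothetical stub for the one-shot adoption step (assumed to succeed). */
static topo_t *do_adopt_once(void)
{
    static topo_t the_topology = { 42 };
    adopt_calls++;
    return &the_topology;
}

/* Adopt on the first call; hand back the existing pointer on later calls. */
topo_t *adopt_or_get(void)
{
    pthread_mutex_lock(&adopt_lock);
    if (adopted == NULL)
        adopted = do_adopt_once();
    topo_t *result = adopted;
    pthread_mutex_unlock(&adopt_lock);
    return result;
}
```

Every caller in the process gets the same pointer, and the adoption step runs once. The sketch deliberately leaves out questions such as which caller may destroy the shared topology, which is presumably part of what makes a safe implementation unclear.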

Re: [OMPI devel] HWLOC duplication relief

2021-02-03 Thread Ralph Castain via devel
I have updated the site to reflect this discussion to date. I'm still trying to figure out what to do about low-level libs. For now, I've removed the envars and modified suggestions. https://openpmix.github.io/support/faq/avoid-hwloc-dup Further comment/input is welcome. > On Feb 3, 2021, at

Re: [OMPI devel] HWLOC duplication relief

2021-02-03 Thread Ralph Castain via devel
What if we do this: - if you are using PMIx v4.1 or above, then there is no problem. Call PMIx_Load_topology and we will always return a valid pointer to the topology, subject to the caveat that all members of the process (as well as the server) must use the same hwloc version. - if you are

Re: [OMPI devel] HWLOC duplication relief

2021-02-03 Thread Ralph Castain via devel
I guess this begs the question: how does a library detect that the shmem region has already been mapped? If we attempt to map it and fail, does that mean it has already been mapped or that it doesn't exist? It isn't reasonable to expect that all the libraries in a process will coordinate such

Re: [OMPI devel] HWLOC duplication relief

2021-02-03 Thread Brice Goglin via devel
Hello Ralph. One thing that isn't clear in this document: the hwloc shmem region may only be mapped *once* per process (because the mmap address is always the same). Hence, if a library calls adopt() in the process, others will fail. This applies to the 2nd and 3rd cases in "Accessing the HWLOC
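
The once-per-process constraint described above follows from requiring a mapping at one specific address. The sketch below shows that general effect with plain anonymous `mmap`; it is not hwloc's implementation, and `map_exactly_at()` is a hypothetical helper. It assumes a POSIX system where, without `MAP_FIXED`, the requested address is only a hint.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict C modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `len` bytes of anonymous memory at exactly `addr`.
 * Returns 0 on success, -1 if the mapping could not be placed there. */
int map_exactly_at(void *addr, size_t len)
{
    /* Without MAP_FIXED, addr is only a hint; the kernel will not reuse
     * a range that is already mapped, it picks another address instead. */
    void *got = mmap(addr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (got == MAP_FAILED)
        return -1;
    if (got != addr) {
        /* Landed somewhere else: undo and report failure. */
        munmap(got, len);
        return -1;
    }
    return 0;
}
```

Once some code in the process holds a mapping at the fixed address, every later request for that same address returns -1. Note that the return value alone does not tell the caller whether the address was already taken or the mapping failed for some other reason.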

[OMPI devel] HWLOC duplication relief

2021-02-02 Thread Ralph Castain via devel
Hi folks. Per today's telecon, here is a link to a description of the HWLOC duplication issue for many-core environments and methods by which you can mitigate the impact. https://openpmix.github.io/support/faq/avoid-hwloc-dup George: for lower-level libs like treematch or HAN, you might want