IIRC, hwloc can read its input from an XML file.
If that is not already the case, should we provide a simple mechanism to tell hwloc
to load the topology from a config file instead of detecting it from the OS?
For example, when working on a new OS and/or new hardware, one could manually generate
the hwloc XML file on each node and then run something like
mpirun --mca hwloc_file /etc/hwloc.xml ...

Does that make sense?

On Friday, September 4, 2015, Ralph Castain <r...@open-mpi.org> wrote:

> It sounds, then, like removing --without-hwloc will do no harm. At worst,
> hwloc might report inaccurate info, but that won’t stop us from running
> with appropriate cmd line options (e.g., to set the #slots and bind-to
> none).
>
> Unless there are any further concerns, I’ll prep the PR
>
>
> On Sep 4, 2015, at 1:08 AM, Kawashima, Takahiro <t-kawash...@jp.fujitsu.com> wrote:
>
> Brice,
>
> I'm a developer of Fujitsu MPI for K computer and Fujitsu
> PRIMEHPC FX10/FX100 (SPARC-based CPU).
>
> Though I'm not familiar with the hwloc code and didn't know
> the issue reported by Gilles, I also would be able to help
> you to fix the issue.
>
> Takahiro Kawashima,
> MPI development team,
> Fujitsu
>
> Thanks Brice,
>
> bottom line, even if hwloc is not fully ported, it should build and ompi
> should get something usable.
> in this case, i have no objection removing the --without-hwloc configure
> option.
>
> you can contact me off-list regarding the FX10 specific issue
>
> Cheers,
>
> Gilles
>
> On 9/4/2015 2:31 PM, Brice Goglin wrote:
>
> Le 04/09/2015 00:36, Gilles Gouaillardet a écrit :
>
> Ralph,
>
> just to be clear, your proposal is to abort if openmpi is configured
> with --without-hwloc, right ?
> ( the --with-hwloc option is not removed because we want to keep the
> option of using an external hwloc library )
>
> if I understand correctly, Paul's point is that if openmpi is ported
> to a new architecture for which hwloc has not been ported yet
> (embedded hwloc or external hwloc), then the very first step is to
> port hwloc before ompi can be built.
>
> did I get it right Paul ?
>
> Brice, what would happen in such a case ?
> embedded hwloc cannot be built ?
> hwloc returns little or no information ?
>
>
> If it's a new operating system and it supports at least things like
> sysconf, you will get a Machine object with one PU per logical processor.
>
> If it's a new platform running Linux, they are supposed to tell Linux
> at least package/core/thread information. That's what we have for ARM
> for instance.
>
> Missing topology detection can be worked around easily (with XML and
> synthetic description, what we did for BlueGene/Q before adding manual
> support for that specific processor). Binding support can't.
> And once you get binding, you get x86-topology even if the operating
> system isn't supported (using cpuid).
>
> for example, on a Fujitsu FX10 node (single socket, 16 cores), hwloc
> reports 16 sockets with one core each and no cache. though this is
> not correct, ompi can treat it as equivalent to the real config,
> so this is not really an issue for ompi.
>
>
> Can you help fix this?
>
> The issue is indeed with supercomputers with uncommon architectures
> like this one.
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/09/17961.php
>
>
>
