On 04/09/2015 00:36, Gilles Gouaillardet wrote:
> Ralph,
>
> just to be clear, your proposal is to abort if openmpi is configured
> with --without-hwloc, right ?
> ( the --with-hwloc option is not removed because we want to keep the
> option of using an external hwloc library )
>
> if I understand correctly, Paul's point is that if openmpi is ported
> to a new architecture for which hwloc has not been ported yet
> (embedded hwloc or external hwloc), then the very first step is to
> port hwloc before ompi can be built.
>
> did I get it right Paul ?
>
> Brice, what would happen in such a case ?
> embedded hwloc cannot be built ?
> hwloc returns little or no information ?

If it's a new operating system and it supports at least things like
sysconf, you will get a Machine object with one PU per logical processor.
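
As a rough illustration of that fallback (this is not hwloc's actual code), the sketch below builds a flat "Machine with N PUs" view from sysconf alone. The names flat_machine, flat_machine_detect and flat_machine_print are hypothetical, and _SC_NPROCESSORS_ONLN is assumed to be available, which holds on Linux, the BSDs and macOS but is not guaranteed on every system.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical stand-in for a flat topology: one Machine holding N PUs.
 * This is an illustration only, not an hwloc data structure. */
struct flat_machine {
    long num_pus;
};

/* Fill in the PU count from sysconf.
 * Returns 0 on success, -1 if the system cannot report a processor count.
 * Assumption: _SC_NPROCESSORS_ONLN is supported by the platform. */
static int flat_machine_detect(struct flat_machine *m)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1)
        return -1;
    m->num_pus = n;
    return 0;
}

/* Print the flat hierarchy: a single Machine with one PU per logical
 * processor and no package, core or cache levels in between. */
static void flat_machine_print(const struct flat_machine *m)
{
    printf("Machine\n");
    for (long i = 0; i < m->num_pus; i++)
        printf("  PU #%ld\n", i);
}
```

Calling flat_machine_detect() and then flat_machine_print() lists one PU line per online logical processor, which is all the structure available when nothing beyond a processor count is known.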

If it's a new platform running Linux, they are supposed to tell Linux at
least package/core/thread information. That's what we have for ARM for
instance.

Missing topology detection can be worked around easily with an XML or
synthetic description, which is what we did for BlueGene/Q before adding
manual support for that specific processor. Missing binding support
cannot be worked around. And once you have binding, you get the x86
topology through cpuid even if the operating system isn't supported.
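
To make the synthetic workaround concrete, here is a minimal sketch that formats a synthetic description string of the form "package:P core:C pu:T". The helper name make_synthetic is hypothetical, and the "package" type name assumes a recent hwloc; older releases used "socket" instead.

```c
#include <stdio.h>

/* Hypothetical helper: format a synthetic topology description such as
 * "package:1 core:16 pu:1" into buf.
 * Returns 0 on success, -1 on a zero count or if buf is too small. */
static int make_synthetic(char *buf, size_t len, unsigned packages,
                          unsigned cores_per_package, unsigned pus_per_core)
{
    if (packages == 0 || cores_per_package == 0 || pus_per_core == 0)
        return -1;

    int n = snprintf(buf, len, "package:%u core:%u pu:%u",
                     packages, cores_per_package, pus_per_core);

    /* snprintf returns the length it wanted to write; treat truncation
     * as an error rather than handing back a partial description. */
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

For a single-socket, 16-core node like the one discussed in this thread, make_synthetic(buf, sizeof buf, 1, 16, 1) yields "package:1 core:16 pu:1". A string like this can be given to lstopo with -i, or to hwloc_topology_set_synthetic() before loading the topology. It only describes the layout; it does not add binding support.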

> for example, on a Fujitsu FX10 node (single socket, 16 cores), hwloc
> reports 16 sockets with one core each and no cache. though this is not
> correct, that can be seen as equivalent to the real config by ompi, so
> this is not really an issue for ompi.

Can you help fix this?

The issue is indeed with supercomputers with uncommon architectures like
this one.

Brice


>
> Cheers,
>
> Gilles
>
> On Friday, September 4, 2015, Ralph Castain <r...@open-mpi.org> wrote:
>
>     No - hwloc is embedded in OMPI anyway.
>
>>     On Sep 3, 2015, at 11:09 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>>
>>     On Thu, Sep 3, 2015 at 8:03 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>         Does anyone know of a reason why we shouldn’t do this?
>>
>>
>>
>>     Would doing this mean that a port to a new system would require
>>     that one first perform a full hwloc port?
>>
>>     -Paul
>>
>>     -- 
>>     Paul H. Hargrove                          phhargr...@lbl.gov
>>     Computer Languages & Systems Software (CLaSS) Group
>>     Computer Science Department               Tel: +1-510-495-2352
>>     Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>     _______________________________________________
>>     devel mailing list
>>     de...@open-mpi.org
>>     Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>     Link to this post:
>>     http://www.open-mpi.org/community/lists/devel/2015/09/17942.php
>
>
>
