Brice,

I'm a developer of Fujitsu MPI for K computer and Fujitsu
PRIMEHPC FX10/FX100 (SPARC-based CPU).

Though I'm not familiar with the hwloc code and didn't know
about the issue reported by Gilles, I would also be able to
help you fix it.

Takahiro Kawashima,
MPI development team,
Fujitsu

> Thanks Brice,
> 
> bottom line, even if hwloc is not fully ported, it should build and ompi 
> should get something usable.
> in this case, i have no objection removing the --without-hwloc configure 
> option.
> 
> you can contact me off-list regarding the FX10 specific issue
> 
> Cheers,
> 
> Gilles
> 
> On 9/4/2015 2:31 PM, Brice Goglin wrote:
> > On 04/09/2015 00:36, Gilles Gouaillardet wrote:
> >> Ralph,
> >>
> >> just to be clear, your proposal is to abort if openmpi is configured 
> >> with --without-hwloc, right ?
> >> ( the --with-hwloc option is not removed because we want to keep the 
> >> option of using an external hwloc library )
> >>
> >> if I understand correctly, Paul's point is that if openmpi is ported 
> >> to a new architecture for which hwloc has not been ported yet 
> >> (embedded hwloc or external hwloc), then the very first step is to 
> >> port hwloc before ompi can be built.
> >>
> >> did I get it right Paul ?
> >>
> >> Brice, what would happen in such a case ?
> >> embedded hwloc cannot be built ?
> >> hwloc returns little or no information ?
> >
> > If it's a new operating system and it supports at least things like 
> > sysconf, you will get a Machine object with one PU per logical processor.
> >
> > If it's a new platform running Linux, they are supposed to tell Linux 
> > at least package/core/thread information. That's what we have for ARM 
> > for instance.
> >
> > Missing topology detection can be worked around easily (with XML and 
> > synthetic description, what we did for BlueGene/Q before adding manual 
> > support for that specific processor). Binding support can't.
> > And once you get binding, you get x86-topology even if the operating 
> > system isn't supported (using cpuid).
> >
> >> for example, on a Fujitsu FX10 node (single socket, 16 cores), hwloc 
> >> reports 16 sockets with one core each and no cache. though this is 
> >> not correct, that can be seen as equivalent to the real config by 
> >> ompi, so this is not really an issue for ompi.
> >
> > Can you help fix this?
> >
> > The issue is indeed with supercomputers with uncommon architectures 
> > like this one.
