Re: [OMPI devel] openmpi-2.0.0 - problems with ppc64, PGI and atomics

2016-09-02 Thread Jeff Squyres (jsquyres)
Issue filed at https://github.com/open-mpi/ompi/issues/2044. I asked Nathan and Sylvain to have a look.

> On Sep 1, 2016, at 9:20 PM, Paul Hargrove wrote:
> I failed to get PGI 16.x working at all (licence issue, I think).
> So, I can neither confirm nor refute Geoffroy's reported problems.

Re: [OMPI devel] 2.0.1rc3 posted

2016-09-02 Thread Jeff Squyres (jsquyres)
> On Sep 1, 2016, at 8:42 PM, Gilles Gouaillardet wrote:
> Paul,
> I guess this was a typo, and you should either read
> - Fix a SPARC alignment issue
> or
> - Fix an alignment issue on alignment-sensitive processors such as SPARC

I did not copy and paste those bullets from NEWS

Re: [OMPI devel] 2.0.1rc3 posted

2016-09-02 Thread Paul Hargrove
I can confirm that 2.0.1rc2+patch *did* run correctly on Linux/SPARC. I am running 2.0.1rc3 now, for completeness. -Paul

On Fri, Sep 2, 2016 at 3:24 AM, Jeff Squyres (jsquyres) wrote:
> > On Sep 1, 2016, at 8:42 PM, Gilles Gouaillardet wrote:
> > Paul,
> > I guess this was a typo

Re: [OMPI devel] 2.0.1rc3 posted

2016-09-02 Thread Paul Hargrove
All of my testing on 2.0.1rc3 is complete except for SPARC. The alignment issue on SPARC *has* been tested via 2.0.1rc2 + patch (so there is very low probability that 2.0.1rc3 would fail). At this point I am aware of only two failing platforms that we didn't already know about:
+ OpenBSD-6.0 dis

[OMPI devel] Question about Open MPI bindings

2016-09-02 Thread George Bosilca
While investigating the ongoing issue with the OMPI messaging layer, I ran into some trouble with process binding. I read the documentation, but I still find this puzzling. Disclaimer: all experiments were done with current master (9c496f7) compiled in optimized mode. The hardware: a single node 20 c
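For reference, a build and binding check of this kind could be reproduced roughly as follows; the commit hash is the one cited above, but the configure options, install prefix, and job size are placeholders rather than George's exact recipe:

    $ git clone https://github.com/open-mpi/ompi.git && cd ompi
    $ git checkout 9c496f7
    $ ./autogen.pl
    $ ./configure --prefix=$HOME/ompi-master    # no --enable-debug, i.e. an optimized build
    $ make -j 8 && make install
    $ mpirun -np 2 --bind-to core --report-bindings hostname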

Re: [OMPI devel] Question about Open MPI bindings

2016-09-02 Thread r...@open-mpi.org
I’ll dig more later, but just checking offhand, I can’t replicate this on my box, so it may be something in hwloc for that box (or maybe you have some MCA params set somewhere?):

$ mpirun -n 2 --bind-to core --report-bindings hostname
[rhc001:83938] MCW rank 0 bound to socket 0[core 0[hwt 0-1]]:
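One way to rule out stray MCA parameters, as suggested here, is to check the usual places they can be set; the install prefix below is a placeholder:

    $ env | grep OMPI_MCA_
    $ cat $HOME/.openmpi/mca-params.conf
    $ cat <install-prefix>/etc/openmpi-mca-params.conf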

Re: [OMPI devel] Question about Open MPI bindings

2016-09-02 Thread Gilles Gouaillardet
George, I cannot help much with this, I am afraid. My best bet would be to rebuild Open MPI with --enable-debug and an external, recent hwloc (iirc hwloc v2 cannot be used in Open MPI yet). You might also want to try

mpirun --tag-output --bind-to xxx --report-bindings grep Cpus_allowed_list /proc/s
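Spelled out, the rebuild and the check Gilles is suggesting would look roughly like this; the hwloc path, binding policy, and process count are placeholders:

    $ ./configure --enable-debug --with-hwloc=/path/to/external/hwloc && make -j install
    $ mpirun -np 2 --tag-output --bind-to core --report-bindings \
          grep Cpus_allowed_list /proc/self/status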

Re: [OMPI devel] Question about Open MPI bindings

2016-09-02 Thread George Bosilca
On Sat, Sep 3, 2016 at 12:18 AM, r...@open-mpi.org wrote:
> I’ll dig more later, but just checking offhand, I can’t replicate this on my box, so it may be something in hwloc for that box (or maybe you have some MCA params set somewhere?):

Yes, I have 2 MCA parameters set (orte_default_host

Re: [OMPI devel] Question about Open MPI bindings

2016-09-02 Thread George Bosilca
Thanks Gilles, that's a very useful trick. The bindings reported by ORTE are in sync with the ones reported by the OS.

$ mpirun -np 2 --tag-output --bind-to core --report-bindings grep Cpus_allowed_list /proc/self/status
[1,0]:[arc00:90813] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core
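An equivalent cross-check, assuming an hwloc install is on the PATH, is to have each rank print its own binding mask directly; hwloc-bind --get reports the CPU set the launched process is currently bound to:

    $ mpirun -np 2 --tag-output --bind-to core hwloc-bind --get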

Re: [OMPI devel] Question about Open MPI bindings

2016-09-02 Thread Gilles Gouaillardet
George, did you mean to write *not* in sync instead? Note the ORTE output is different from the one you posted earlier (though the btls were different). As far as I understand, Cpus_allowed_list should really be 0,20 and 10,30, and in order to match the ORTE output, they should be 0,4 and 10,14. Ch
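A way to reconcile the two numberings is to look at how hwloc maps its logical processor indexes to the OS indexes on that node; in lstopo output the L# values are hwloc's logical indexes and the P# values are the OS indexes that show up in Cpus_allowed_list. For example:

    $ lstopo --only pu    # one line per hardware thread, e.g. "PU L#0 (P#0)"
    $ lstopo -p           # full topology annotated with physical (OS) indexes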