I can report that openmpi-v3.0.x-201803060306-c79e33b.tar.gz doesn’t
show the problem.

I also reran all of the OSU benchmarks, and performance was generally in line
with my 3.0.0 and 3.0.1rc3 builds.

Any chance of the fix making the 3.0.1 release (or is there a minimal
recommended patch I can apply to 3.0.0)?
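
For anyone following the thread, here is a minimal sketch of the kind of
change I had in mind in my original message. I have not looked at what the
actual fix in the nightly does, so this is not the Open MPI source: every
name below (sketch_post, sketch_reserve_slot, the SKETCH_* codes) is
hypothetical and only illustrates making sure "ret" holds a defined value
before it is tested.

```c
/* Hypothetical stand-ins for the Open MPI status codes; real values differ. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_OUT_OF_RESOURCE = -2 };

/* Hypothetical helper standing in for whatever work the real function does
 * before the check; it reports failure through its return value. */
static int sketch_reserve_slot(int slots_free)
{
    return slots_free > 0 ? SKETCH_SUCCESS : SKETCH_ERR_OUT_OF_RESOURCE;
}

/* Sketch of a post-style routine: "ret" starts out as success and is
 * assigned before it is tested, so the check never reads an
 * indeterminate value. */
static int sketch_post(int slots_free)
{
    int ret = SKETCH_SUCCESS;   /* the initialization I was asking about */

    ret = sketch_reserve_slot(slots_free);
    if (SKETCH_SUCCESS != ret) {
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    return SKETCH_SUCCESS;
}
```

In this sketch, sketch_post(1) returns SKETCH_SUCCESS and sketch_post(0)
returns SKETCH_ERR_OUT_OF_RESOURCE. Without the initialization and the
assignment, the outcome of the check would depend on whatever happened to
be on the stack.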

-Alan

On Fri, Mar 9, 2018 at 12:21 PM Jeff Squyres (jsquyres) <[email protected]>
wrote:

> Specifically, Nathan is referring to:
>
>     https://www.open-mpi.org/nightly/v3.0.x/
>
>
> > On Mar 9, 2018, at 12:51 PM, Nathan Hjelm <[email protected]> wrote:
> >
> > Fixed in master and in the 3.0.x branch. Try the nightly tarball.
> >
> > On Mar 9, 2018, at 10:01 AM, Alan Wild <[email protected]> wrote:
> >
> >> I’ve been running the OSU micro benchmarks
> >> (http://mvapich.cse.ohio-state.edu/benchmarks/) on my various MPI
> >> installations.  One test that has been consistently failing is osu_put_bibw
> >> when compiled with either openmpi 3.0.0 or openmpi 3.0.1rc3, when these
> >> builds have also linked in the Mellanox mxm, hcoll, and SHaRP libraries AND
> >> when running this two-rank test across two nodes communicating over EDR
> >> InfiniBand.
> >>
> >> Fortunately this failure was true for both optimized and debug builds
> >> of openmpi.
> >>
> >> Stepping into the code with Allinea DDT I think I found the issue...
> >>
> >> MPI_Win_post is ultimately calling ompi_osc_rdma_post_atomic() and on
> >> line 245 there’s an if statement that reads:
> >>
> >>         if (OPAL_UNLIKELY(OMPI_SUCCESS != ret)) {
> >>                 return OMPI_ERR_OUT_OF_RESOURCE;
> >>         }
> >>
> >> (Sorry, I can’t easily cut and paste the code... my work PC can’t get to
> >> my personal email so I have to post this from an iPad).
> >>
> >> Anyway, if you look at the preceding ~16 lines of code... “ret” is
> >> never initialized or assigned to in any way (as far as I can tell).  I’m
> >> not completely familiar with all the macros used, but it doesn’t appear
> >> that any of them are assigning to “ret”.  I’m surprised this isn’t causing
> >> more chaos.
> >>
> >> If I’m “right”... is the right thing just to initialize ret to
> >> OMPI_SUCCESS, or perhaps should this condition just come out?
> >>
> >> Thoughts?
> >>
> >> -Alan
> >> [email protected]
> >> --
> >> [email protected] http://humbleville.blogspot.com
> >> _______________________________________________
> >> devel mailing list
> >> [email protected]
> >> https://lists.open-mpi.org/mailman/listinfo/devel
>
>
> --
> Jeff Squyres
> [email protected]

-- 
[email protected] http://humbleville.blogspot.com