I’ve been running the OSU micro-benchmarks
(http://mvapich.cse.ohio-state.edu/benchmarks/) on my various MPI
installations.  One test that has been failing consistently is osu_put_bibw
when compiled with either Open MPI 3.0.0 or Open MPI 3.0.1rc3, when these
builds also link in the Mellanox mxm, hcoll, and SHaRP libraries AND when
this two-rank test runs across two nodes communicating over EDR
InfiniBand.

Fortunately, the failure reproduces with both optimized and debug builds of
Open MPI.

Stepping into the code with Allinea DDT I think I found the issue...

MPI_Win_post is ultimately calling ompi_osc_rdma_post_atomic() and on line
245 there’s an if statement that reads:

        if (OPAL_UNLIKELY(OMPI_SUCCESS != ret)) {
                return OMPI_ERR_OUT_OF_RESOURCE;
        }

(Sorry can’t easily cut and paste the code... my work PC can’t get to my
personal email so I have to post this from an iPad).

Anyway, if you look at the preceding ~16 lines of code... “ret” is never
initialized or assigned to in any way (as far as I can tell).  I’m not
completely familiar with all the macros used, but it doesn’t appear
that any of them assign to “ret”.  I’m surprised this isn’t causing more
chaos.

If I’m right, is the correct fix just to initialize ret to OMPI_SUCCESS,
or should this condition simply come out?
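
To make the first option concrete, here is a minimal sketch of the
pattern as I understand it.  This is NOT the actual Open MPI source: the
names (sketch_post, sketch_step, SKETCH_*) and the numeric values are
made up for illustration, and I’m assuming the intent is that “ret” only
changes when some earlier step actually runs.

```c
/* Hypothetical stand-ins for the real status codes; values are illustrative only. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_OUT_OF_RESOURCE = -2 };

/* Hypothetical helper standing in for whatever step is meant to set ret. */
static int sketch_step(int should_fail)
{
    return should_fail ? -1 : SKETCH_SUCCESS;
}

/* ret starts out as success, so the later check is well defined even
 * when no preceding step assigns to it. */
int sketch_post(int run_step, int should_fail)
{
    int ret = SKETCH_SUCCESS;   /* the initialization that appears to be missing */

    if (run_step) {
        ret = sketch_step(should_fail);
    }

    if (SKETCH_SUCCESS != ret) {
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    return SKETCH_SUCCESS;
}
```

Without the initializer, reading “ret” in the check is undefined behaviour
in C, so the result could differ between builds, compilers, and runs.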

Thoughts?

-Alan
a...@madllama.net
-- 
a...@madllama.net http://humbleville.blogspot.com
_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
