If you see ways to improve it, you are welcome to do so.

> On Aug 22, 2016, at 12:30 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:
>
> Folks,
>
> i was reviewing the sources of the coll/sync module, and
>
> 1) i noticed the same pattern is used in *every* source file:
>
> if (s->in_operation) {
>     return s->c_coll.coll_xxx(...);
> } else {
>     COLL_SYNC(s, s->c_coll.coll_xxx(...));
> }
>
> is there any rationale for not moving the if(s->in_operation) test into the
> COLL_SYNC macro ?
>
> 2) i could not find a rationale for using s->in_operation :
> - if a barrier must be performed, the barrier of the underlying module (e.g.
>   coll/tuned) is directly invoked, so coll/sync is not somehow reentrant
> - with MPI_THREAD_MULTIPLE, it is the end user's responsibility that two threads
>   never simultaneously invoke a collective operation on the *same* communicator
>   (and s->in_operation is a per-communicator boolean), so i do not see how
>   s->in_operation can be true in a valid MPI program.
>
> Though the first point can be seen as a "matter of style", i am pretty
> curious about the second one.
>
> Cheers,
>
> Gilles
>
> On 8/21/2016 3:44 AM, George Bosilca wrote:
>> Ralph,
>>
>> Bringing back the coll/sync is a cheap shot at hiding a real issue behind a
>> smoke curtain. As Nathan described in his email, Open MPI's lack of flow
>> control on eager messages is the real culprit here, and the loop around any
>> one-to-many collective (bcast and scatter*) was only helping to exacerbate
>> the issue. However, doing a loop around a small MPI_Send will also end in a
>> memory exhaustion issue, one that would not be easily circumvented by adding
>> synchronizations deep inside the library.
>>
>> George.
>>
>> On Sat, Aug 20, 2016 at 12:30 AM, r...@open-mpi.org wrote:
>> I can not provide the user report as it is a proprietary problem.
>> However, it consists of a large loop of calls to MPI_Bcast that crashes due to
>> unexpected messages. We have been looking at instituting flow control, but
>> that has way too widespread an impact. The coll/sync component would be a
>> simple solution.
>>
>> I honestly don’t believe the issue I was resolving was due to a bug - it was
>> a simple problem of one proc running slow and creating an overload of
>> unexpected messages that eventually consumed too much memory. Rather, I
>> think you solved a different problem - by the time you arrived at LANL, the
>> app I was working with had already modified their code to no longer create
>> the problem (essentially refactoring the algorithm to avoid the massive loop
>> over allreduce).
>>
>> I have no issue supporting it as it takes near-zero effort to maintain, and
>> this is a fairly common problem with legacy codes that don’t want to
>> refactor their algorithms.
>>
>> > On Aug 19, 2016, at 8:48 PM, Nathan Hjelm <hje...@me.com> wrote:
>> >
>> >> On Aug 19, 2016, at 4:24 PM, r...@open-mpi.org wrote:
>> >>
>> >> Hi folks
>> >>
>> >> I had a question arise regarding a problem being seen by an OMPI user -
>> >> has to do with the old bugaboo I originally dealt with back in my LANL
>> >> days. The problem is with an app that repeatedly hammers on a collective,
>> >> and gets overwhelmed by unexpected messages when one of the procs falls
>> >> behind.
>> >
>> > I did some investigation on roadrunner several years ago and determined
>> > that the user code issue coll/sync was attempting to fix was due to a bug
>> > in ob1/cksum (really can’t remember). coll/sync was simply masking a
>> > live-lock problem.
>> > I committed a workaround for the bug in r26575
>> > (https://github.com/open-mpi/ompi/commit/59e529cf1dfe986e40d14ec4d2a2e5ef0cea5e35)
>> > and tested it with the user code. After this change the user code ran
>> > fine without coll/sync. Since LANL no longer had any users of coll/sync we
>> > stopped supporting it.
>> >
>> >> I solved this back then by introducing the “sync” component in
>> >> ompi/mca/coll, which injected a barrier operation every N collectives.
>> >> You could even “tune” it by doing the injection for only specific
>> >> collectives.
>> >>
>> >> However, I can no longer find that component in the code base - I find it
>> >> in the 1.6 series, but someone removed it during the 1.7 series.
>> >>
>> >> Can someone tell me why this was done??? Is there any reason not to bring
>> >> it back? It solves a very real, not uncommon, problem.
>> >> Ralph
>> >
>> > This was discussed during one (or several) tel-cons years ago. We agreed
>> > to kill it and bring it back if there is 1) a use case, and 2) someone is
>> > willing to support it. See
>> > https://github.com/open-mpi/ompi/commit/5451ee46bd6fcdec002b333474dec919475d2d62
>> >
>> > Can you link the user email?
>> >
>> > -Nathan
>> > _______________________________________________
>> > devel mailing list
>> > devel@lists.open-mpi.org
>> > https://rfd.newmexicoconsortium.org/mailman/listinfo/devel