For Iboffload this should not be an issue, since our connection manager is blocking (I have to double-check).
For openib, this should not be such a huge change. The code is pretty much standalone; we only have to move it to the main thread and add a signaling mechanism. I will take a look.

Best,
-Pasha

On Nov 14, 2013, at 7:25 PM, Ralph Castain <r...@open-mpi.org> wrote:

>
> On Nov 14, 2013, at 4:22 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
>
>> Well, this is major change in a behavior.
>>
>> Since openib calls communication calls from the callback
>> it pretty much requires to enable thread safety on openib btl level.
>
> Ah, yes - could well be true. Or else separate the two like we do elsewhere -
> transfer the recv callback to the openib thread and let it do the rest.
>
>>
>> But we may move the queue flush operation from the callback to main thread, so
>> the progress engine will wait on a signal from callback.
>
> Yep - that's what we do elsewhere
>
>>
>> How does it work for other parts of OMPI (sm, communicator) ?
>> I guess they don't do anything in the callbacks ?
>
> Correct - they immediately transfer the info to their local progress engine
> (in whatever form).
>
>>
>> Best,
>> Pasha
>>
>> On Nov 14, 2013, at 6:35 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>>
>>> On Nov 14, 2013, at 3:33 PM, Shamis, Pavel <sham...@ornl.gov> wrote:
>>>
>>>>
>>>>> The only change is that the receive callback is now occurring in the ORTE
>>>>> event thread, and so perhaps someone needs to look at a way to pass that
>>>>> back into the OMPI event base (which I guess is the OPAL event base)?
>>>>> Just glancing at the code, it looks like that could be the issue - but I
>>>>> honestly have no idea what event base someone wants to switch to, or if
>>>>> they want to resolve it some other way. There are clearly some things
>>>>> happening in the ofacm oob code that involve thread locking etc., but I
>>>>> don't know what those areas are trying to do.
>>>>
>>>> I see. In this mode do you enable thread safety support in all library
>>>> (mpi)?
>>>
>>> Only if the user configures to do so - ORTE doesn't require it as we use
>>> the event library's thread safety and do everything inside events.
>>>
>>>>
>>>> _______________________________________________
>>>> devel mailing list
>>>> de...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
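
For reference, the hand-off discussed in this thread (the receive callback only records the message and signals, and the main thread's progress loop waits on that signal and does the queue flush) could look roughly like the sketch below. This is only an illustration using plain pthreads; every name in it (pending_queue_t, recv_callback, progress_wait_and_flush, run_demo) is hypothetical and none of them are actual openib BTL or ORTE symbols. The real code would presumably hook into the OPAL event base rather than a bare condition variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these are Open MPI symbols. */
#define PENDING_MAX 16

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;              /* signaled when a message is queued */
    int             items[PENDING_MAX]; /* ring buffer of pending messages */
    size_t          head, count;
} pending_queue_t;

static pending_queue_t pending = {
    .lock  = PTHREAD_MUTEX_INITIALIZER,
    .ready = PTHREAD_COND_INITIALIZER,
};

/* Runs in the event thread. It only records the message and signals;
 * it makes no communication calls itself. Returns false if the queue is full. */
static bool recv_callback(int msg)
{
    bool ok = false;
    pthread_mutex_lock(&pending.lock);
    if (pending.count < PENDING_MAX) {
        pending.items[(pending.head + pending.count) % PENDING_MAX] = msg;
        pending.count++;
        ok = true;
        pthread_cond_signal(&pending.ready);
    }
    pthread_mutex_unlock(&pending.lock);
    return ok;
}

/* Runs in the main thread. Blocks until the callback has signaled, drains
 * the queue, then does the "flush" work outside the lock. Here the flush is
 * just a sum so the result can be checked. Returns the number handled. */
static int progress_wait_and_flush(int *handled_sum)
{
    int    local[PENDING_MAX];
    size_t n = 0;

    pthread_mutex_lock(&pending.lock);
    while (pending.count == 0)          /* loop guards against spurious wakeups */
        pthread_cond_wait(&pending.ready, &pending.lock);
    while (pending.count > 0) {
        local[n++] = pending.items[pending.head];
        pending.head = (pending.head + 1) % PENDING_MAX;
        pending.count--;
    }
    pthread_mutex_unlock(&pending.lock);

    assert(n <= PENDING_MAX);
    for (size_t i = 0; i < n; i++)
        *handled_sum += local[i];       /* stand-in for the real queue flush */
    return (int)n;
}

/* Stand-in for the event thread delivering three receive callbacks. */
static void *event_thread(void *arg)
{
    (void)arg;
    for (int msg = 1; msg <= 3; msg++)
        recv_callback(msg);
    return NULL;
}

/* Starts the event thread, progresses until all three messages are flushed,
 * and returns the sum of the flushed messages (-1 if the thread fails to start). */
int run_demo(void)
{
    pthread_t tid;
    int sum = 0, handled = 0;

    if (pthread_create(&tid, NULL, event_thread, NULL) != 0)
        return -1;
    while (handled < 3)
        handled += progress_wait_and_flush(&sum);
    pthread_join(tid, NULL);
    return sum;
}
```

Calling run_demo() from a main() built with -pthread exercises both sides. With this split, only the small queue needs a lock, so the communication calls stay on the main thread and the BTL itself would not need to be made thread safe.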