A physical meeting about this in the near future is unlikely; I think we're all facing travel restrictions and constraints with the holidays coming up.

How about a teleconf to discuss the following about the notifier:

- what exactly is there today
- why what is there today is the way it is
- proposals for different ways to do it

More specifically, I think we all agree that the idea of an MPI application notifying a higher-level entity when it detects errors (e.g., on the host, or in the network, or ...) is a good one. I think it is worth discussing at higher bandwidth so that we can avoid email hell (I agree with Ralph; this could devolve pretty easily).

I propose any of the following times to discuss (I'll set up a phone bridge):

- Mon, Dec 8, 2pm, 3pm, or 4pm Eastern
- Tue, Dec 9, 10am, noon, 1pm, 2pm, 3pm, or 4pm Eastern
- Wed, Dec 10, any time
- Thu, Dec 11, 11am, 1pm, 2pm, 3pm, or 4pm Eastern
- Fri, Dec 12, 9am, 10am, 11am, 2pm, 3pm, or 4pm Eastern




On Dec 4, 2008, at 3:16 PM, Ralph Castain wrote:

I'm beginning to believe that we need a design meeting specifically over this question. Too many unknowns exist, with significant potential problems lurking behind them. Frankly, this issue could have a major impact on how we operate, performance, and a variety of other factors going forward - many of which may be difficult to predict.

I suspect there may not be "optimal" solutions to many of these questions, but there certainly will be strong opinions in multiple directions.

As part of that discussion, I propose that we consider alternative methods for meeting the same overall objective - namely, reuse of the BTLs by another software project. For example, a simple copy-and-branch is the dominant method today, with patches used by both parties to cherry-pick the changes they want from the other code users. Multiple tools have been developed to support this mode of operation, yet we haven't discussed any of them in this context. The proposed approach contains a number of impacts that may be avoided with an alternative approach.

Without such a meeting, I fear we are going to rapidly dissolve into email hell again.

Ralph



On Dec 4, 2008, at 1:07 PM, Eugene Loh wrote:

Richard Graham wrote:

I expect this will involve some sort of well defined interface between the btl’s and orte, and I don’t know if this will also require something like this between the btl’s and the pml – I think that interface is rigidly enforced, but am not sure.

I'm probably missing the scope of what you're saying here, but it raises another question in my mind. Is there today a well-defined interface between the BTLs and... anything else? PML or whatever? Maybe this comes back to a documentation question: do we (or will we) have anything written down that says what a BTL must do, what it may rely on, etc.?
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel



--
Jeff Squyres
Cisco Systems

