Re: [Pvfs2-developers] fix BMI multiplexing of multiple methods

Rob Ross Thu, 08 Jan 2009 08:49:52 -0800

Hey,

For the CALLBACK option, you would use that to have the individual methods filling in things at the "generic BMI" layer (for lack of the right terminology), but the overall user API would be the same?

I don't think that the CONTEXT option is appropriate. I don't want to expose the specifics of the underlying networks any more than we have already.

There should be relevant research in the MPI space related to the POLL_PLAN option.

Do we consider this to be a problem for both clients and servers, or is it really a server-specific issue? If this is something we think will solely (or mostly) be a server thing, we could consider throwing a thread at the issue. One option might be to kick off a thread to wait on the TCP side of things, since the kernel is doing most of the work for us anyway, and put completed TCP events into the completion list asynchronously (for servers only)?
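
A rough sketch of the thread idea, in case it helps the discussion. Everything here is hypothetical (completion_list, tcp_waiter_args, tcp_waiter and the one-byte "operation id" are illustrative names and conventions, not existing BMI code): a waiter blocks in poll() on a descriptor and appends finished operations to a mutex-protected list that the testcontext side can drain later.

```c
#include <assert.h>
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_DONE 64

/* Hypothetical completion list shared between the waiter thread and the
 * thread that drains completions; not the real BMI data structure. */
struct completion_list {
    pthread_mutex_t lock;
    int ids[MAX_DONE];          /* ids of completed operations */
    int count;
};

/* Hypothetical arguments for the waiter thread. */
struct tcp_waiter_args {
    int fd;                     /* descriptor to wait on */
    int max_events;             /* stop after this many events (keeps the sketch finite) */
    struct completion_list *done;
};

void completion_init(struct completion_list *cl)
{
    int rc = pthread_mutex_init(&cl->lock, NULL);
    assert(rc == 0);
    (void)rc;
    cl->count = 0;
}

/* Append one completed operation id; returns 0 on success, -1 if full. */
int completion_push(struct completion_list *cl, int id)
{
    int rc = -1;
    pthread_mutex_lock(&cl->lock);
    if (cl->count < MAX_DONE) {
        cl->ids[cl->count++] = id;
        rc = 0;
    }
    pthread_mutex_unlock(&cl->lock);
    return rc;
}

/* Copy up to max ids into out[] in arrival order and remove them from
 * the list; returns how many were copied. */
int completion_drain(struct completion_list *cl, int *out, int max)
{
    int n = 0;
    pthread_mutex_lock(&cl->lock);
    while (n < max && n < cl->count) {
        out[n] = cl->ids[n];
        n++;
    }
    for (int i = n; i < cl->count; i++)
        cl->ids[i - n] = cl->ids[i];
    cl->count -= n;
    pthread_mutex_unlock(&cl->lock);
    return n;
}

/* Thread entry point: block in poll() until the descriptor is readable,
 * read one byte as a stand-in for finishing a TCP operation, and record
 * the completion. */
void *tcp_waiter(void *arg)
{
    struct tcp_waiter_args *a = arg;
    for (int i = 0; i < a->max_events; i++) {
        struct pollfd pfd = { .fd = a->fd, .events = POLLIN };
        unsigned char id;
        if (poll(&pfd, 1, -1) <= 0)
            break;
        if (read(a->fd, &id, 1) != 1)
            break;
        completion_push(a->done, id);
    }
    return NULL;
}
```

On a server this would be started with something like pthread_create(&tid, NULL, tcp_waiter, &args), and the existing testcontext path would call completion_drain() instead of polling TCP itself.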


Rob

On Jan 7, 2009, at 4:06 PM, Sam Lang wrote:

Hi All,
Right now if multiple methods are enabled in BMI, we tend to get poor performance from the "fast" network, because BMI_testcontext iterates through all the active methods calling testcontext for each one. It tries to be smart about which methods get scheduled ;-) to prevent starvation, but it treats all the methods fairly, which tends to make tcp (the slow one) hog the time spent in testcontext. I have a few ideas for this, so I'll go ahead and propose them and let you all shoot them down or propose others.
Option CALLBACK: Instead of returning completion as a list in testcontext, we allow a BMI context to be constructed with a callback, and on completion of operations, the callback is called. This allows each method to drive its own operations, and notify the consumer of completion immediately. There would still need to be a testcontext call for methods that only service operations during that call. The changes might not be that significant: the BMI_open_context call could just take an extra parameter that was the callback function. If the parameter is null, we just use the completion list as before.
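
A minimal sketch of how the CALLBACK option might look, under loud assumptions: the sketch_* names, the completion_cb signature and the fixed-size list are all made up for illustration and do not reflect the real BMI_open_context prototype. The point is only the dispatch rule described above: with a callback the consumer is notified immediately, and with a null callback completions go onto the list as before.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DONE 32

/* Hypothetical callback type: invoked once per completed operation. */
typedef void (*completion_cb)(int op_id, int error_code, void *user_ptr);

/* Hypothetical context; the real BMI context is opaque and different. */
struct sketch_context {
    completion_cb cb;           /* NULL means "use the completion list" */
    void *cb_arg;
    int done_ids[MAX_DONE];
    int done_count;
};

/* Stand-in for an open-context call that takes the extra parameter. */
void sketch_open_context(struct sketch_context *ctx,
                         completion_cb cb, void *cb_arg)
{
    assert(ctx != NULL);
    ctx->cb = cb;
    ctx->cb_arg = cb_arg;
    ctx->done_count = 0;
}

/* What a method would call when one of its operations finishes. */
void sketch_complete(struct sketch_context *ctx, int op_id, int error_code)
{
    if (ctx->cb != NULL) {
        ctx->cb(op_id, error_code, ctx->cb_arg);   /* notify immediately */
        return;
    }
    if (ctx->done_count < MAX_DONE)                 /* list path, as before */
        ctx->done_ids[ctx->done_count++] = op_id;
}

/* Stand-in for testcontext on the list path: hand back queued ids. */
int sketch_testcontext(struct sketch_context *ctx, int *out, int max)
{
    int n = 0;
    while (n < max && n < ctx->done_count) {
        out[n] = ctx->done_ids[n];
        n++;
    }
    for (int i = n; i < ctx->done_count; i++)
        ctx->done_ids[i - n] = ctx->done_ids[i];
    ctx->done_count -= n;
    return n;
}

/* Example consumer callback: counts completions in *user_ptr. */
void count_completions(int op_id, int error_code, void *user_ptr)
{
    (void)op_id;
    (void)error_code;
    (*(int *)user_ptr)++;
}
```

A caller that passes count_completions sees each completion as it happens; a caller that passes NULL keeps calling the testcontext stand-in, so existing users would not need to change.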
Option CONTEXT: Require separate contexts for separate methods. This pushes the problem up to the application, probably not where it belongs, since active methods are opaque from the BMI api.
Option POLL_PLAN: Modify the construct_poll_plan function in bmi that already tries to be fair, so that it's aware of the performance discrepancy between methods. Maybe it can just skip tcp every other time for example. This is probably the easiest, since it doesn't require API changes and the like.
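
To make the "skip tcp every other time" idea concrete, here is a small sketch. It is not the existing construct_poll_plan code; struct poll_method, its interval and countdown fields, and build_poll_plan are hypothetical names chosen for illustration. Each method gets an interval, and a method with interval 2 is selected on every second round.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-method polling state; not the real BMI structures. */
struct poll_method {
    const char *name;
    unsigned interval;      /* poll once every `interval` rounds; 1 = every round */
    unsigned countdown;     /* rounds left until the next poll */
};

/* Set plan[i] to 1 if method i should be polled this round, else 0.
 * Returns the number of methods selected. */
int build_poll_plan(struct poll_method *methods, int n, int *plan)
{
    int selected = 0;
    for (int i = 0; i < n; i++) {
        assert(methods[i].interval >= 1);
        if (methods[i].countdown == 0) {
            plan[i] = 1;
            methods[i].countdown = methods[i].interval - 1;
            selected++;
        } else {
            plan[i] = 0;
            methods[i].countdown--;
        }
    }
    return selected;
}
```

With a fast method at interval 1 and tcp at interval 2, the fast method is polled every round and tcp every other round. A real version would presumably also need to avoid skipping a method when it is the only one with outstanding operations.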
-sam

_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
