On Mar 3, 2012, at 18:18, Alex Margolin wrote:

> I've figured out that what I really need is to write my own BTL component,
> rather than trying to manipulate the existing TCP one. I've started writing
> it using the 1.5.5rc3 tarball and some PDFs from 2006 I found on the
> website (anything else I can look at? TCP is much more complicated than
> what I'm writing). I think I'm getting the hang of it, but I still have
> some questions about the terminology for the component implementation:
>
> The basic data structures for routing fragments are components, modules,
> interfaces and endpoints, right?

Are you trying to route fragments through intermediary nodes? If yes, then I
might have a patch somewhere supporting routing for send/recv protocols.

> So, if I have 3 nodes, each with 2 interfaces (each having one constant
> IP), and I'm running 2 processes total, I'll have... 1 component, 2
> modules, 4 interfaces (2 per module) and 4 addresses?
>
> What about "links" (as in the "num_of_links" component struct member) -
> what does it count?

The number of sockets to be opened per device. In some cases (for example,
when there is a hypervisor) a single socket is not enough to use the device
completely. If I remember correctly, on the PS3 three sockets were needed to
get 900 Mb/s out of the 1 Gb/s Ethernet link.
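Purely as an illustration of the idea (all of these names are made up, not
the actual TCP BTL code): the endpoint opens num_of_links sockets to the
same peer and stripes outgoing fragments across them.

    /* Illustrative sketch only, not real OMPI code.  Shows the idea
     * behind "links": open several sockets per interface so that a
     * single stream does not cap the device's usable bandwidth. */
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define MY_MAX_LINKS 4                 /* hypothetical upper bound */

    struct my_endpoint {
        int      fds[MY_MAX_LINKS];        /* one socket per "link" */
        unsigned num_links;                /* from the component's
                                              num_of_links parameter */
    };

    static int my_open_links(struct my_endpoint *ep, unsigned num_links,
                             const struct sockaddr_in *peer)
    {
        ep->num_links = num_links;
        for (unsigned i = 0; i < num_links; i++) {
            ep->fds[i] = socket(AF_INET, SOCK_STREAM, 0);
            if (ep->fds[i] < 0 ||
                connect(ep->fds[i], (const struct sockaddr *)peer,
                        sizeof(*peer)) < 0) {
                return -1;                 /* real code would clean up */
            }
        }
        return 0;   /* outgoing fragments are then striped over fds[] */
    }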
> ompi_modex_send - Is it supposed to share the addresses of all the running
> processes before they start? Suppose I assume one NIC per machine. Can I
> just send an array of mca_btl_tcp_addr_t, and every process will find the
> one belonging to it by some index (its rank?). I saw the ompi_modex_recv()
> call in _proc.c and it seems that every proc instance reads the entire
> sent buffer anyway.

Right, the modex is used to exchange the "business card" of each process.
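Each process publishes only its own addresses, and ompi_modex_recv() hands
you the blob that one particular peer published, so there is no rank-indexed
lookup to do on your side. Roughly like this (check
ompi/runtime/ompi_module_exchange.h in your tree for the exact signatures;
the "mybtl" names and MY_NUM_IFACES are placeholders):

    /* At component init: publish this process's interface addresses. */
    mca_btl_tcp_addr_t my_addrs[MY_NUM_IFACES];
    /* ... fill in one entry per local interface ... */
    rc = ompi_modex_send(&mca_btl_mybtl_component.super.btl_version,
                         my_addrs, sizeof(my_addrs));

    /* Later, once per peer proc (cf. mca_btl_tcp_proc_create() in
     * btl_tcp_proc.c): receive exactly that peer's published blob. */
    mca_btl_tcp_addr_t *peer_addrs;
    size_t size;
    rc = ompi_modex_recv(&mca_btl_mybtl_component.super.btl_version,
                         proc, (void **)&peer_addrs, &size);
    /* size / sizeof(*peer_addrs) is the number of addresses the peer
       published; pick the one you can reach. */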
> Sorry for flooding you all with questions, I hope I'm not way off here. I
> hope I'll finish writing something by the end of next week (I'm working on
> this after hours, not full time), with the purpose of submitting it as a
> contribution to Open MPI.

Looking forward to it.

  george.

> Appreciate your help so far,
> Alex
>
> On 03/02/2012 09:26 PM, Jeffrey Squyres wrote:
>> Give your BTL a progress function. It'll get called quite frequently.
>>
>> Look at the "progress" section in btl.h. Progress threads don't work yet,
>> but the btl_progress function will get called by the PML quite
>> frequently. It's how BTLs like openib progress their outstanding message
>> passing.
>>
>> On Mar 2, 2012, at 2:22 PM, Alex Margolin wrote:
>>
>>> On 03/02/2012 04:33 PM, Jeffrey Squyres wrote:
>>>> Note that the OMPI 1.4.x series is about to be retired. If you're
>>>> doing new stuff, I'd advise you to work with the Open MPI SVN trunk.
>>>> In the trunk we've changed how we build libevent, so if you're adding
>>>> to it, you probably want to be working there for maximum forward
>>>> compatibility.
>>>>
>>>> That being said:
>>>>
>>>>> I know trying to replace poll() seems like I'm doing something very
>>>>> wrong, but I want to poll on events without a valid Linux file
>>>>> descriptor (and on existing events, specifically sockets, at the
>>>>> same time), and I see no other way. Obviously, my poll2 calls the
>>>>> Linux poll in most cases.
>>>>
>>>> What exactly are you trying to do? OMPI has some internal hooks for
>>>> non-fd-or-event-based progress. Indeed, libevent is typically called
>>>> with fairly low frequency (e.g., if you're running with OpenFabrics
>>>> or some other high-speed/non-fd-based networking interconnect).
>>>
>>> I'm trying to create a new BTL module. I've written an adapter from my
>>> library to TCP, so I've implemented socket/connect/accept/send/recv...
>>> Now I've taken the TCP BTL module and cloned it, replacing the relevant
>>> calls with mine. My only problem is with poll, which is not in the MCA
>>> (at least in 1.4.x).
>>> I've implemented poll() and select(), but they're not that good,
>>> because my events are not based on valid Linux file descriptors. I can
>>> poll all my events at the same time (but not in conjunction with real
>>> FDs, unfortunately).
>>> Can you give me some pointers as to where to look in the MPI (1.5?)
>>> source code to implement it properly?
>>>
>>> Thanks,
>>> Alex
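P.S. On the poll() question in the quoted exchange above: rather than
replacing poll(), give the component a btl_progress function, as Jeff
suggested, and drain your library's event queue from there. A bare-bones
skeleton (the my_* names are hypothetical stand-ins for your library; see
the progress section of btl.h for the real typedef):

    /* Skeleton of a component progress function.  It is registered in
     * the component struct and called very frequently by the PML, so
     * it must return quickly when there is nothing to do. */
    typedef struct { int type; void *ctx; } my_event_t;   /* made up */
    extern int my_lib_poll_event(my_event_t *ev);   /* 0 == no event */

    static int mca_btl_mybtl_component_progress(void)
    {
        my_event_t ev;
        int completed = 0;

        /* Drain everything that finished since the last call; never
         * block here. */
        while (my_lib_poll_event(&ev)) {
            /* Typically: recover the fragment from ev.ctx and invoke
             * its completion callback so the PML sees the result. */
            completed++;
        }
        return completed;   /* number of events progressed */
    }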