I've figured out that what I really need is to write my own BTL component, rather than trying to manipulate the existing TCP one. I've started writing it using the 1.5.5rc3 tarball and some PDFs from 2006 I found on the website (is there anything else I can look at? TCP is much more complicated than what I'm writing). I think I'm getting the hang of it, but I still have some questions about terminology for the component implementation:

The basic data structures for routing fragments are components, modules, interfaces and endpoints, right? So, if I have 3 nodes, each with 2 interfaces (each having one constant IP), and I'm running 2 processes total, I'll have... 1 component, 2 modules, 4 interfaces (2 per module) and 4 addresses? What about "links" (as in the "num_of_links" component struct member) - what does it count?

ompi_modex_send - is it supposed to share the addresses of all the running processes before they start? Suppose I assume one NIC per machine. Can I just send an array of mca_btl_tcp_addr_t, so that every process finds the entry belonging to it by some index (its rank?)? I saw the ompi_modex_recv() call in _proc.c and it seems that every proc instance reads the entire sent buffer anyway.
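
To make the question concrete, here is a minimal sketch of one possible reading of the exchange: each process publishes only its own interface addresses, and a reader fetches the blob of one specific peer rather than indexing into a shared array. This is a toy in-memory model, not Open MPI code. Every name in it (toy_addr_t, toy_modex_send, toy_modex_recv, the table sizes) is hypothetical, and whether the real ompi_modex_send/ompi_modex_recv behave this way is an assumption to be checked against the source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_PROCS 8    /* hypothetical limit on process count */
#define TOY_MAX_BLOB  64   /* hypothetical limit on blob size in bytes */

/* Hypothetical address record; a stand-in, not mca_btl_tcp_addr_t. */
typedef struct {
    unsigned int   ip;     /* IPv4 address as a plain integer */
    unsigned short port;   /* listening port */
} toy_addr_t;

/* Toy exchange table: one opaque blob per process rank. */
static unsigned char toy_blobs[TOY_MAX_PROCS][TOY_MAX_BLOB];
static size_t        toy_sizes[TOY_MAX_PROCS];

/* Publish the caller's own addresses, one entry per local interface.
 * Returns 0 on success, -1 if the rank or the size is out of range. */
int toy_modex_send(int my_rank, const toy_addr_t *addrs, size_t count)
{
    size_t bytes = count * sizeof(toy_addr_t);

    assert(addrs != NULL);
    if (my_rank < 0 || my_rank >= TOY_MAX_PROCS || bytes > TOY_MAX_BLOB) {
        return -1;
    }
    memcpy(toy_blobs[my_rank], addrs, bytes);
    toy_sizes[my_rank] = bytes;
    return 0;
}

/* Fetch everything one peer published. Returns the number of addresses
 * copied into out (0 if the peer published nothing), or -1 on error. */
int toy_modex_recv(int peer_rank, toy_addr_t *out, size_t max_count)
{
    size_t count;

    assert(out != NULL);
    if (peer_rank < 0 || peer_rank >= TOY_MAX_PROCS) {
        return -1;
    }
    count = toy_sizes[peer_rank] / sizeof(toy_addr_t);
    if (count > max_count) {
        return -1;
    }
    memcpy(out, toy_blobs[peer_rank], toy_sizes[peer_rank]);
    return (int)count;
}
```

In this model a process with two interfaces calls toy_modex_send() once with a two-entry array, and any other process calls toy_modex_recv() with that peer's rank to get both entries back. The whole buffer that gets read is the one peer's blob, so no rank-based index into a global array is needed.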

Sorry for flooding you all with questions, I hope I'm not way off here. I hope I'll finish writing something by the end of next week (I'm working on this after hours, not full time), with the purpose of submitting it as a contribution to open-mpi.

Appreciate your help so far,
Alex

On 03/02/2012 09:26 PM, Jeffrey Squyres wrote:
Give your BTL a progress function.  It'll get called quite frequently.

Look at the "progress" section in btl.h.  Progress threads don't work yet, but 
the btl_progress function will get called by the PML quite frequently.  It's how BTLs 
like openib progress their outstanding message passing.
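
A minimal sketch of the idea, for a transport whose events have no file descriptor: the BTL exposes a non-blocking callback that drains whatever has completed and reports how many events it handled, and the upper layer calls it repeatedly. The sketch assumes the callback takes no arguments and returns a count of completed events; check btl.h for the actual typedef. All names here (toy_queue_t, toy_post_event, toy_btl_progress, toy_progress_loop) are hypothetical and nothing below is Open MPI code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event queue owned by a user-level transport library. */
typedef struct {
    int pending;   /* events waiting to be handled */
    int handled;   /* total events handled so far */
} toy_queue_t;

static toy_queue_t toy_queue;   /* zero-initialised at program start */

/* Hypothetical hook the transport calls when a send or receive completes. */
void toy_post_event(void)
{
    toy_queue.pending++;
}

/* Progress callback: drains all pending events without blocking and
 * returns how many were handled during this call. */
int toy_btl_progress(void)
{
    int completed = 0;

    while (toy_queue.pending > 0) {
        toy_queue.pending--;
        toy_queue.handled++;
        completed++;
    }
    return completed;
}

/* Stand-in for the upper layer: calls the callback a fixed number of
 * times and returns the total number of events it reported. */
int toy_progress_loop(int (*progress)(void), int iterations)
{
    int total = 0;
    int i;

    assert(progress != NULL);
    for (i = 0; i < iterations; i++) {
        total += progress();
    }
    return total;
}
```

The point of this shape is that nothing in it depends on poll() or select(): events that arrive through the library's own mechanism are picked up the next time the callback runs, while socket-based events can keep going through libevent.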



On Mar 2, 2012, at 2:22 PM, Alex Margolin wrote:

On 03/02/2012 04:33 PM, Jeffrey Squyres wrote:
Note that the OMPI 1.4.x series is about to be retired.  If you're doing new 
stuff, I'd advise you to be working with the Open MPI SVN trunk.  In the trunk, 
we've changed how we build libevent, so if you're adding to it, you probably 
want to be working there for max forward-compatibility.

That being said:

I know trying to replace poll() seems like I'm doing something very wrong, but 
I want to poll on events without a valid Linux file descriptor (and existing 
events, specifically sockets, at the same time), and I see no other way. 
Obviously, my poll2 calls the Linux poll in most cases.

What exactly are you trying to do?  OMPI has some internal hooks for 
non-fd-or-event-based progress.  Indeed, libevent is typically called with 
fairly low frequency (e.g., if you're running with OpenFabrics or some other 
high-speed/not-fd-based networking interconnect).

I'm trying to create a new BTL module. I've written an adapter from my library 
to TCP, so I've implemented socket/connect/accept/send/recv... Now I've taken 
the TCP BTL module and cloned it, replacing the relevant calls with mine. My 
only problem is with poll, which is not in the MCA (at least in 1.4.x).
I've implemented poll() and select(), but it's not that good: my events 
are not based on valid Linux file descriptors, so I can poll all my events at 
the same time, but not in conjunction with real FDs, unfortunately.
Can you give me some pointers as to where to look in the MPI (1.5?) source code 
to implement it properly?

Thanks,
Alex
_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel