Re: [OMPI devel] rankfile relative host claiming option patch

2009-06-25 Thread Ralph Castain
Forget that comment, Lenny - I think this actually looks fine. The relative notation currently is only used in the allocators, not the mappers, so this is fine. Sorry for the confusion. Ralph On Jun 25, 2009, at 2:50 PM, Ralph Castain wrote: Question: for all other mappers, the relative ran

Re: [OMPI devel] rankfile relative host claiming option patch

2009-06-25 Thread Ralph Castain
Question: for all other mappers, the relative rank is given with respect to the allocation. It looks here like you are doing it relative to the list of nodes, which is compiled from the allocation passed through hostfile and -host options. Do you want to conform to the behavior of the other

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Paul H. Hargrove
Brian W. Barrett wrote: All - Jeff, Eugene, and I had a long discussion this morning on the sm BTL flow management issues and came to a couple of conclusions. * Jeff, Eugene, and I are all convinced that Eugene's addition of polling the receive queue to drain acks when sends start backing up

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Brian W. Barrett
On Thu, 25 Jun 2009, Eugene Loh wrote: I spoke with Brian and Jeff about this earlier today. Presumably, up through 1.2, mca_btl_component_progress would poll and if it received a message fragment would return. Then, presumably in 1.3.0, behavior was changed to keep polling until the FIFO wa

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Brian W. Barrett
All - Jeff, Eugene, and I had a long discussion this morning on the sm BTL flow management issues and came to a couple of conclusions. * Jeff, Eugene, and I are all convinced that Eugene's addition of polling the receive queue to drain acks when sends start backing up is required for deadloc
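For readers following this thread, here is a toy model of the idea described above: a sender that has run out of fragments polls its own receive queue to drain acks before giving up on a send. It is only a sketch under stated assumptions, not the sm BTL code. The `Peer` class, its fields, and the `drain_when_stalled` flag are hypothetical names invented for illustration, and the real component uses shared-memory FIFOs rather than Python deques.

```python
from collections import deque


class Peer:
    """Toy process with a fixed pool of send fragments (hypothetical model)."""

    def __init__(self, name, num_frags):
        self.name = name
        self.free_frags = num_frags   # fragments available for sending
        self.inbox = deque()          # incoming ("data", sender) or ("ack", None)
        self.received = 0             # data fragments consumed so far

    def progress(self):
        """Drain the inbox: consume data and ack it, reclaim a fragment per ack."""
        while self.inbox:
            kind, sender = self.inbox.popleft()
            if kind == "data":
                self.received += 1
                sender.inbox.append(("ack", None))  # return fragment to its owner
            else:
                self.free_frags += 1

    def send(self, peer, drain_when_stalled=True):
        """Try to send one fragment; return True on success, False if stalled."""
        if self.free_frags == 0 and drain_when_stalled:
            # Sends are backing up: poll our own queue so pending acks
            # can release fragments.
            self.progress()
        if self.free_frags == 0:
            return False
        self.free_frags -= 1
        peer.inbox.append(("data", self))
        return True


a, b = Peer("a", 2), Peer("b", 2)
a.send(b)
a.send(b)            # a has now used both fragments
b.progress()         # b consumes them and queues two acks for a
stalled = a.send(b, drain_when_stalled=False)   # acks never drained
recovered = a.send(b)                            # acks drained first
print(stalled, recovered)
```

In this model, a sender that never polls its own queue stays stalled even though the acks it needs are already waiting for it, which is the situation the polling change is meant to avoid.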

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Eugene Loh
Eugene Loh wrote: If you look in mca_btl_sm_component_progress, when a process receives a message fragment and returns it to the sender, it executes code like this: goto recheck_peer; break; Okay, the reason I show you that code is because a static code checker should easily ident

Re: [OMPI devel] [OMPI svn] svn:open-mpi r21504

2009-06-25 Thread Ralph Castain
On Jun 25, 2009, at 9:30 AM, Iain Bason wrote: On Jun 25, 2009, at 11:10 AM, Ralph Castain wrote: They do flow along the route at all times. However, without static ports the orted has to start by directly connecting to the HNP and sending the orted's contact info to the HNP. This is th

[OMPI devel] rankfile relative host claiming option patch

2009-06-25 Thread Lenny Verkhovsky
Hi, Proposed small patch to extend current rankfile syntax to be compliant with orte_hosts syntax making it possible to claim relative hosts from the hostfile/scheduler by using +n# hostname, where 0 <= # < np ex: cat ~/work/svn/hpc/dev/test/Rankfile/rankfile rank 0=+n0 slot=0 rank 1=+n0 slot=1 ra
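As an illustration of the proposed `+n#` notation, the sketch below resolves relative host tokens in rankfile lines against an ordered host list. It is a minimal sketch, not the Open MPI rankfile parser: the function names, the regular expressions, and the host names `node-a` and `node-b` are assumptions made for the example, and the bounds check against the length of the host list is a simplification.

```python
import re

# Matches lines of the form "rank <N>=<host> slot=<S>" (assumed shape).
RANK_LINE = re.compile(r"^rank\s+(\d+)\s*=\s*(\S+)\s+slot\s*=\s*(\S+)$")


def resolve_host(token, hosts):
    """Map '+n#' to hosts[#]; treat any other token as a literal host name."""
    match = re.fullmatch(r"\+n(\d+)", token)
    if match is None:
        return token
    index = int(match.group(1))
    if not 0 <= index < len(hosts):
        raise ValueError(f"relative host {token} is outside the host list")
    return hosts[index]


def parse_rankfile(text, hosts):
    """Return {rank: (host, slot)} for every rank line in the text."""
    placements = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RANK_LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognized rankfile line: {line!r}")
        rank, host_token, slot = match.groups()
        placements[int(rank)] = (resolve_host(host_token, hosts), slot)
    return placements


hosts = ["node-a", "node-b"]  # hypothetical hosts from a hostfile or scheduler
rankfile = "rank 0=+n0 slot=0\nrank 1=+n0 slot=1\nrank 2=+n1 slot=0"
print(parse_rankfile(rankfile, hosts))
```

With the two hypothetical hosts above, ranks 0 and 1 land on the first host and rank 2 on the second, without the rankfile naming either host explicitly.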

Re: [OMPI devel] [OMPI svn] svn:open-mpi r21504

2009-06-25 Thread Iain Bason
On Jun 25, 2009, at 11:10 AM, Ralph Castain wrote: They do flow along the route at all times. However, without static ports the orted has to start by directly connecting to the HNP and sending the orted's contact info to the HNP. This is the part I don't understand. Why can't they send th

Re: [OMPI devel] [OMPI svn] svn:open-mpi r21504

2009-06-25 Thread Ralph Castain
They do flow along the route at all times. However, without static ports the orted has to start by directly connecting to the HNP and sending the orted's contact info to the HNP. Then the HNP includes that info in the launch msg, allowing the orteds to wireup their routes. So the difference

Re: [OMPI devel] [OMPI svn] svn:open-mpi r21504

2009-06-25 Thread Iain Bason
On Jun 23, 2009, at 7:17 PM, Ralph Castain wrote: Not any more, when using regex - the only message that comes back is one/node telling the HNP that the procs have been launched. These messages flow along the route, not direct to the HNP - assuming you use the static port option. Is ther

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Jeff Squyres
FWIW, Ralph and I have generally moved away from hosting hg's on www.open-mpi.org -- we've been using bitbucket.org for hosting public and shared hg repos. It's free to get an account. We love bitbucket.org! :-) On Jun 25, 2009, at 10:23 AM, Eugene Loh wrote: Might be fixed now. Ralph

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Eugene Loh
Might be fixed now. Ralph Castain wrote: Unfortunately, we cannot access this - permissions are denied. In poking around, I found that your hg directory has permission 700. Afraid you'll have to grant us permission to access this. :-/ On Jun 25, 2009, at 1:06 AM, Eugene Loh wrote: Bryan

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Ralph Castain
Unfortunately, we cannot access this - permissions are denied. In poking around, I found that your hg directory has permission 700. Afraid you'll have to grant us permission to access this. :-/ Ralph On Jun 25, 2009, at 1:06 AM, Eugene Loh wrote: Bryan Lally wrote: Ralph Castain wrote:

Re: [OMPI devel] sm BTL flow management

2009-06-25 Thread Eugene Loh
Bryan Lally wrote: Ralph Castain wrote: Be happy to put it through the wringer... :-) My wringer is available, too. 'kay. Try hg clone ssh://www.open-mpi.org/~eloh/hg/pending_sends which is r21498 but with changes to poll one's own FIFO more regularly (e.g., even when just performing s