Re: [OMPI devel] OMPI over ofed udapl over iwarp

2007-05-14 Thread Steve Wise
On Sun, 2007-05-13 at 21:26 -0400, Donald Kerr wrote:
> 
> Caitlin Bestler wrote:
> 
> >Donald Kerr wrote:
> >
> >  
> >
> order of business after connection establishment
> (mca_btl_udapl_sendrecv()).  The RECV buffer post for this exchange,
> however, should really be done _before_ the
> dat_ep_connect() on the active side, and _before_ the
> dat_cr_accept() on the server side.
> Currently it's done after the ESTABLISHED event is dequeued, thus
> allowing the race condition. 
> 
> I believe the rules are the ULP must ensure that a RECV is posted
> before the client can post a SEND for that buffer.
> And further, the ULP must enforce flow control somehow so that a
> SEND never arrives without a RECV buffer being available.
> 
> 
> 
> 
> >>maybe this is a rule iwarp imposes on its ULPs but not uDAPL.
> >>
> >>
> >>
> >
> >It is most assuredly a rule for uDAPL. And it is not a matter
> >of iWARP "imposing" on uDAPL. uDAPL was explicitly designed
> >to support IB, iWARP and VI. To do that DAPL documents its
> >model of what RDMA is.
> >  
> >
> (sorry I was off the grid for a couple of days)
> Not to beat a dead horse but you would have to show me where in the Spec 
> it says I must post a recv before a send.  And thinking about it some I 
> don't believe there is a race condition because this is not called out 
> as such. Now if posting the handshake recv before the connect call 
> speeds things up and helps the iwarp scenario I am all for it.
> 
> >This issue is in fact one that is truly fundamental to the
> >efficiency of RDMA -- the transport layer DOES NOT provide
> >buffering. That's the application's job. It is precisely
> >because the application layer does a better job that RDMA
> >can achieve better performance at high bandwidth.
> >
> >For reasons that have been discussed in more depth in the
> >RDMA applicability statement and in RDDP/IPS discussions
> >on iSER, the absence of transport layer buffer throttling
> >places the onus for end-to-end pacing on the application.
> >It is a situation somewhat akin to a car with a broken
> >speedometer that had previously only driven during rush
> >hour bumper-to-bumper traffic. The fact that the speedometer
> >was broken was irrelevant. But if that same car hits the
> >open road the driver will need to come up with some method
> >of regulating their speed.
> >
> >The DAPL semantics are very clear that send/recv operations must
> >be matched one to one, that the receive buffer must be large
> >enough for the received message and that there must be a receive
> >buffer for each incoming send/recv message. That means that
> >the sender needs to have some basis for believing that the
> >RECV has been posted. Usually this is an explicit credit
> >that is decremented per message and incremented per response.
> >  
> >
> Matching one to one, sure, but that still does not say a recv must be 
> posted before a send. Flow control is handled by the BTL.
> 
> >What DAPL does not state is if the transport does explicit flow
> >control so that the sending application's work request is simply
> >not processed (and the sending application continues to provide
> >the buffer, as with InfiniBand) or whether the sender simply
> >transmits and leaves error detection to the receiver (iWARP).
> >There are theoretical advantages to both, but more importantly
> >neither of them is going to change. So the Consumer of RDMA
> >applications needs to use ULP/application layer flow control
> >to pace the transmitter. At the application layer that means
> >that the RECV must be posted *before* the Send/accept that
> >grants ULP credits to the far side.
> >
> >All of that should be clear in the IOV ownership rules and
> >discussion of the semantics of send/recv. If you thought you
> >saw something that implied any guarantees to the contrary
> >then could you point them out in a posting to the DAT reflector?
> >(or just send them to me or Arkady Kanevsky).
> >  
> >
> I believe it was either you or Steve who claimed a recv must be posted 
> before a send thus leading to a race condition. I fail to see this. But 
> again, if Steve's patch makes things better I am all for it.
> 

For iWARP, the connection may be TERMINATED if a SEND arrives on a QP
and no corresponding RECV buffer is posted.


Steve.
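
[Editorial sketch] The ordering rule argued over in this thread can be
illustrated with a small model. This is only a toy under stated
assumptions, not the uDAPL API or the BTL's actual code: Receiver,
CreditedSender and ConnectionTerminated are hypothetical names, and the
"transport" is a direct method call. The receiver behaves like the iWARP
case described above (a SEND with no posted RECV ends the connection),
and the sender holds messages back until it has been granted a credit.

```python
from collections import deque


class ConnectionTerminated(Exception):
    """Raised when a SEND arrives and no usable RECV buffer is posted."""


class Receiver:
    """Toy receive queue with iWARP-like behaviour: no transport buffering."""

    def __init__(self):
        self.posted = deque()   # posted RECV buffers, stored as sizes in bytes
        self.delivered = []     # messages that landed in a posted buffer

    def post_recv(self, size):
        self.posted.append(size)

    def deliver(self, message):
        # One incoming SEND consumes exactly one posted RECV buffer.
        if not self.posted:
            raise ConnectionTerminated("SEND arrived with no RECV posted")
        size = self.posted.popleft()
        if len(message) > size:
            raise ConnectionTerminated("RECV buffer too small for message")
        self.delivered.append(message)


class CreditedSender:
    """Sender that spends one credit per SEND and queues when it has none."""

    def __init__(self, receiver, credits=0):
        self.receiver = receiver
        self.credits = credits
        self.pending = deque()  # messages held locally, not yet transmitted

    def send(self, message):
        self.pending.append(message)
        self._drain()

    def grant(self, n=1):
        # Called when the peer reports that n more RECV buffers are posted.
        self.credits += n
        self._drain()

    def _drain(self):
        while self.pending and self.credits > 0:
            self.credits -= 1
            self.receiver.deliver(self.pending.popleft())
```

In this model, calling send() on a CreditedSender with zero credits only
queues the message; it is delivered once the receiver has called
post_recv() and the credit has been passed on with grant(). Calling
Receiver.deliver() directly with nothing posted raises
ConnectionTerminated, which is the race the patch is meant to close.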





Re: [OMPI devel] Autotools Upgrade Time

2007-05-14 Thread Brian Barrett

On May 8, 2007, at 2:03 PM, Brian Barrett wrote:

As was discussed on the telecon a couple of weeks ago, to try to  
lower the maintenance cost of the build system, starting this  
Saturday Autoconf 2.60 and Automake 1.10 will be required to  
successfully run autogen.sh on the trunk.  As I mentioned in a  
previous e-mail, the required versions of the autotools will be:


        Autoconf    Automake    Libtool
v1.1    2.57-2.59   1.9.6       1.5.22
v1.2    2.57-new    1.9.6-new   1.5.22-new
trunk   2.60-new    1.10.0-new  1.5.22-new


This means that there's no set of autotools that will be able to  
build all three versions of Open MPI, but since very few people  
currently spend time on v1.1, this should not present a major problem.
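
[Editorial sketch] A quick way to check a set of installed tool versions
against the trunk minimums listed above. This is a hedged illustration
only, not part of autogen.sh: TRUNK_MINIMUMS, parse_version,
meets_minimum and check_tools are hypothetical names, and the caller is
assumed to supply the version strings. The point is that versions are
compared numerically component by component, so that 1.10 sorts after
1.9.6.

```python
import re

# Trunk minimums as stated in this message (assumed to be lower bounds only).
TRUNK_MINIMUMS = {"autoconf": "2.60", "automake": "1.10.0", "libtool": "1.5.22"}


def parse_version(text):
    """Turn a string such as '1.9.6' into a tuple of ints, (1, 9, 6)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def meets_minimum(found, minimum):
    """Return True if version string `found` is at least `minimum`."""
    f, m = parse_version(found), parse_version(minimum)
    width = max(len(f), len(m))
    # Pad with zeros so that '1.10' and '1.10.0' compare as equal.
    f += (0,) * (width - len(f))
    m += (0,) * (width - len(m))
    return f >= m


def check_tools(found_versions, minimums=TRUNK_MINIMUMS):
    """Map each required tool to True/False; a missing tool counts as too old."""
    return {
        tool: meets_minimum(found_versions.get(tool, "0"), minimum)
        for tool, minimum in minimums.items()
    }
```

For example, check_tools({"autoconf": "2.59", "automake": "1.10",
"libtool": "1.5.22"}) flags only autoconf as too old for the trunk.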


Hi all -

Due to being behind with EuroPVM/MPI papers, I didn't get a chance to  
commit the changes to the trunk this weekend.  Instead, they will be  
committed this evening.


Brian