Re: [openib-general] basic IB doubt

Michael Krause Mon, 28 Aug 2006 20:46:23 -0700

At 10:14 AM 8/23/2006, Ralph Campbell wrote:

On Wed, 2006-08-23 at 09:47 -0700, Caitlin Bestler wrote:
> [EMAIL PROTECTED] wrote:
> > Quoting r. john t <[EMAIL PROTECTED]>:
> >> Subject: basic IB doubt
> >>
> >> Hi
> >>
> >> I have a very basic doubt. Suppose Host A is doing RDMA write (say 8
> >> MB) to Host B. When data is copied into Host B's local
> > buffer, is it guaranteed that data will be copied starting
> > from the first location (first buffer address) to the last
> > location (last buffer address)? or it could be in any order?
> >
> > Once B gets a completion (e.g. of a subsequent send), data in
> > its buffer matches that of A, byte for byte.
>
> An excellent and concise answer. That is exactly what the application
> should rely upon, and nothing else. With iWARP this is very explicit,
> because portions of the message not only MAY be placed out of
> order, they SHOULD be when packets have been re-ordered by the
> network. But for *any* RDMA adapter there is no guarantee on
> what order the adapter flushes things to host memory or particularly
> when old contents that may be cached are invalidated or updated.
> The role of the completion is to limit the frequency with which
> the RDMA adapter MUST guarantee coherency with application visible
> buffers. The completion not only indicates that the entire message
> was received, but that it has been entirely delivered to host memory.

Actually, A knows the data is in B's memory when A gets the completion
notice.

This is incorrect for both iWARP and IB. A completion at A only means that the receiving HCA / RNIC has the data and has generated an acknowledgement. It does not indicate that B has flushed the data to host memory. Hence, the fault zone remains the HCA / RNIC, and while A may free the associated buffer for other use, it should not rely upon the data having been delivered to host memory on B. This is one of the fault scenarios I raised during the initial RDS transparent recovery assertions. If A issues an RDMA Read to B targeting the memory location of that RDMA Write, then A can know the data has been placed in B's memory.
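
To make the distinction concrete, here is a toy model in Python of the behaviour described above. It is only a sketch: ToyTargetAdapter, rdma_write and rdma_read are hypothetical names invented for illustration, not verbs calls, and the flush-on-read step is a modelling assumption standing in for the ordering of a Read behind earlier Writes.

```python
class ToyTargetAdapter:
    """Toy model of B's HCA / RNIC as seen from A. Not a real verbs API."""

    def __init__(self, size):
        self.host_memory = bytearray(size)  # B's buffer in host memory
        self._staged = []                   # writes acknowledged but not yet flushed

    def rdma_write(self, offset, data):
        # The adapter takes the data and acknowledges it immediately.
        # A's completion is generated from this acknowledgement alone.
        self._staged.append((offset, bytes(data)))
        return "ack"

    def _flush(self):
        # Move every staged write into host memory, in arrival order.
        for offset, data in self._staged:
            self.host_memory[offset:offset + len(data)] = data
        self._staged.clear()

    def rdma_read(self, offset, length):
        # Modelling assumption: a read of the same location is served only
        # after the earlier writes have been placed in host memory.
        self._flush()
        return bytes(self.host_memory[offset:offset + length])


adapter_b = ToyTargetAdapter(8)
ack = adapter_b.rdma_write(0, b"payload!")
in_memory_after_ack = bytes(adapter_b.host_memory) == b"payload!"
read_back = adapter_b.rdma_read(0, 8)
in_memory_after_read = bytes(adapter_b.host_memory) == b"payload!"
```

In this model A holds an acknowledgement while B's host memory is still untouched; only the read-back tells A that the data has been placed.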

B can't rely on anything unless A uses the RDMA write with
immediate which puts a completion event in B's CQ.
Most applications on B ignore this requirement and test for the last
memory location being modified which usually works but doesn't
guarantee that all the data is in memory.

B cannot rely on anything until it sees a completion, either through an RDMA Write with immediate or through a subsequent Send. It is not wise to rely upon IHV-specific behaviors when designing an application. Even an IHV can change things over time or because of interoperability requirements, and then things may not work as desired, which is definitely a customer complaint that many would like to avoid.
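
A small Python simulation may help show why testing the last memory location is not enough. This is a toy model under stated assumptions: ToyReceiver and its methods are hypothetical names, not a verbs API, and the placement order (final chunk first) is chosen deliberately as the worst case for last-byte polling.

```python
class ToyReceiver:
    """Toy model of B's side of an RDMA Write with immediate. Not a real API."""

    def __init__(self, size):
        self.buffer = bytearray(size)  # B's target buffer in host memory
        self.cq = []                   # stand-in for B's completion queue
        self._pending = []             # chunks received but not yet placed
        self._imm = None

    def incoming_write_with_imm(self, payload, imm, chunk_size=4):
        chunks = [(off, payload[off:off + chunk_size])
                  for off in range(0, len(payload), chunk_size)]
        # Assumed worst case: the adapter places the final chunk first.
        self._pending = list(reversed(chunks))
        self._imm = imm

    def has_pending(self):
        return bool(self._pending)

    def place_next_chunk(self):
        off, piece = self._pending.pop(0)
        self.buffer[off:off + len(piece)] = piece
        # The completion is posted only once every chunk is in host memory.
        if not self._pending:
            self.cq.append(self._imm)

    def poll_cq(self):
        return self.cq.pop(0) if self.cq else None


payload = bytes(range(1, 17))            # 16 non-zero bytes
rx = ToyReceiver(len(payload))
rx.incoming_write_with_imm(payload, imm=7)
rx.place_next_chunk()                    # only the last chunk has landed

last_byte_looks_done = rx.buffer[-1] == payload[-1]
buffer_is_complete = bytes(rx.buffer) == payload
early_completion = rx.poll_cq()

while rx.has_pending():
    rx.place_next_chunk()
final_completion = rx.poll_cq()
```

After the first placement the last byte already matches while the rest of the buffer does not, and no completion is available yet. Only when poll_cq returns the immediate value is the whole buffer safe to read.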

BTW, the reason immediate data is 4 bytes in length is that this is what VIA defined. Many within the IBTA wanted to get rid of immediate data, but because of the requirement to support legacy VIA applications, the immediate value was left in place. The need to support a larger value was not apparent. One needs to keep in mind where the immediate resides within the wire protocol and its usage model. The past usage was to signal a PID or some other unique identifier that could be used to determine which thread of execution should be informed of a particular completion event. Four bytes is sufficient to communicate such information without significantly complicating the wire protocol or making it too inefficient.
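
As a rough illustration of that usage model, the Python sketch below packs an identifier such as a PID into a 4-byte value and recovers it on the other side. The function names are hypothetical, and the choice of network byte order for the 4 bytes is an assumption of this sketch rather than something stated above.

```python
import struct


def pack_immediate(identifier):
    """Encode an identifier (e.g. a PID) as a 4-byte immediate value."""
    if not 0 <= identifier <= 0xFFFFFFFF:
        raise ValueError("identifier does not fit in 4 bytes")
    # "!I" = unsigned 32-bit integer, network (big-endian) byte order.
    return struct.pack("!I", identifier)


def unpack_immediate(wire_bytes):
    """Decode the 4-byte immediate value back into the identifier."""
    (identifier,) = struct.unpack("!I", wire_bytes)
    return identifier


imm = pack_immediate(4242)
recovered = unpack_immediate(imm)
```

The receiver could then use the recovered identifier to decide which thread to notify about the completion.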

Mike


