On Friday 18 January 2008 23:12, Roland Dreier wrote:
> > The corruption happened when the process that allocated the MRs went
> > away in the middle of the operation. We would free the MR and invalidate
> > - and expect the in flight RDMA to error out. RDS does not know who is
> > doing RDMA to or from a MR at any given time.
>
> OK, I see. Of course this error will move your QP to the error state
> and cause other in-flight operations on behalf of other processes to
> fail and need to be reissued after you reconnect. Seems like a bit of
> a mess but I don't see a way around it if you want to multiplex direct
> access operations to multiple different processes over the same QP.

Yes, and that's the whole point of RDS. Sockets are unconnected and you
use sendto, else we'd drown in sockets. I will readily agree that this
approach, while it's fast and simple, does get us into a bit of a
mess sometimes :-)

> > Is that a safe thing to do? I found the spec a little unclear on
> > the ordering rules. It *seems* that RDMA writes are always fencing
> > against subsequent operations, and RDMA reads will fence if we ask
> > for it. But I'm not perfectly sure whether the ordering applies
> > to the sending system only, or if IB also guarantees that the
> > RDMA will have completed when it puts the incoming message on
> > the completion queue at the consumer.
>
> I believe this is safe. I can't point to chapter and verse in the
> spec, but operations are supposed to complete in order, so I don't
> think that the receive completion can appear before earlier responder
> operations have completed.

Okay, thanks. Much appreciated,
Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax
_______________________________________________
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
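
As a rough illustration of the "sockets are unconnected and you use sendto" model mentioned in the reply, here is a minimal sketch in Python. It is an assumption-laden stand-in, not RDS itself: it uses plain UDP on loopback instead of an RDS socket (which needs the rds kernel module and a suitable transport), and the helper names `make_endpoint` and `demo` are hypothetical. It only shows the addressing pattern, where one unconnected socket reaches several peers by naming the destination per message; it says nothing about RDMA, MRs, or QP error handling.

```python
import socket


def make_endpoint():
    """Bind an unconnected datagram socket to an ephemeral loopback port.

    Hypothetical helper; UDP is used here only as a stand-in for RDS.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(2.0)  # avoid blocking forever if a datagram is lost
    return s


def demo():
    """Send one message to each of three peers from a single socket."""
    sender = make_endpoint()
    peers = [make_endpoint() for _ in range(3)]
    try:
        # One socket, many destinations: the address is given per message,
        # so no per-peer connected socket is needed.
        for i, peer in enumerate(peers):
            sender.sendto(f"hello peer {i}".encode(), peer.getsockname())

        received = []
        for peer in peers:
            data, addr = peer.recvfrom(1024)
            # Record the payload and whether it came from the shared sender.
            received.append((data.decode(), addr == sender.getsockname()))
        return received
    finally:
        for s in [sender, *peers]:
            s.close()


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the trade-off discussed in the thread: a single endpoint serves every peer, which keeps the socket count low, but it also means that anything shared underneath that endpoint is shared by all conversations at once.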