With this scheme, a single send queue serves at most the J jobs on one node. Is it possible to make a single send queue serve all jobs on all nodes?
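To put rough numbers on the question, here is a minimal sketch of the QP counts given in the Motivation section of the quoted message. The function names and the example values (128 nodes, 8 jobs per node) are my own, chosen only for illustration. The counts are the O() expressions taken literally, so constant factors and self-connections are ignored.

```python
def rc_qp_count(nodes: int, jobs_per_node: int) -> int:
    """QPs for all-to-all with RC: every job has one QP per other job."""
    total_jobs = nodes * jobs_per_node
    return total_jobs * total_jobs  # (N * J) ^ 2


def src_qp_count(nodes: int, jobs_per_node: int) -> int:
    """QPs for all-to-all with SRC: every job has one send queue per node."""
    total_jobs = nodes * jobs_per_node
    return total_jobs * nodes  # N ^ 2 * J


# Hypothetical cluster size, for illustration only.
n, j = 128, 8
rc = rc_qp_count(n, j)
src = src_qp_count(n, j)
print(f"RC:  {rc} QPs")
print(f"SRC: {src} QPs (a factor of {rc // src} fewer)")
```

The saving is exactly the factor J, which is why the per-job count still grows with the number of nodes.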
--CQ

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Michael S. Tsirkin
> Sent: Monday, July 30, 2007 7:51 AM
> To: Gleb Natapov
> Cc: Pavel Shamis; ewg@lists.openfabrics.org; Michael S. Tsirkin; [EMAIL PROTECTED]; Ishai Rabinovitz
> Subject: [ofa-general] Scalable reliable connection
>
> Here's some background on what SRC is. This is basically
> slide 6 in Dror's talk, for those that missed the talk.
>
> * * *
>
> SRC is an extension supported by recent Mellanox hardware
> which is geared toward reducing the number of QPs required
> for all-to-all communication on systems with a high number of
> jobs per node.
>
> ===================================================================
> Motivation:
> ===================================================================
> Given N nodes with J jobs per node, the number of QPs required
> for all-to-all communication is:
>
> With RC:
>     O((N * J) ^ 2)
>
> Since each job out of O(N * J) jobs must create a single QP
> to communicate with each one of O(N * J) other jobs.
>
> With SRC:
>     O(N ^ 2 * J)
>
> This is achieved by using a single send queue (per job,
> out of O(N * J) jobs)
> to send data to all J jobs running on a specific node
> (out of O(N) nodes).
> Hardware uses a new "SRQ number" field in the packet header to
> multiplex receive WRs and WCs to the private memory of each job.
>
> This is a similar idea to IB RD.
> Q: Why not use RD then?
> A: Because no hardware supports it.
>
> Details:
>
> ===================================================================
> Verbs extension:
> ===================================================================
>
> - There is a new transport/QP type "SRC".
> - There is a new object type "SRC domain"
> - Each SRQ gets new (optional) attributes:
>     SRC domain
>     SRC SRQ number
>     SRC CQ
>   SRQ must have either all 3 of these or none of these attributes
>
> - QPs of type SRC have all the same attributes as regular RC QPs
>   connected to SRQ, except that:
>     A. Each SRC QP has a new required attribute "SRC domain"
>     B. SRC QPs do *not* have "SRQ" attribute
>        (do not have a specific SRQ associated with them)
>
> ===================================================================
> Protocol extension:
> ===================================================================
> SRC QP behaviour: Requestor
> - Post send WR for this QP type is extended with SRQ number field
>   This number is sent as part of packet header
> - SRC Packets follow rules for RC packets on the wire, exactly
>   What is different is their handling at the responder side
>
> SRC QP behaviour: Responder
>   Each incoming packet passes transport checks with respect to
>   the SRC QP, following RC rules, exactly.
>
>   After this, SRQ number in packet header is used to look up a
>   specific SRQ. SRC domain of the resulting SRQ must be equal
>   to SRC domain of the QP, otherwise a NAK is sent, and QP
>   moves to error state.
>
>   If the SRC domains match, receive WR and receive WC
>   processing are as follows:
>
>   - RC Send
>       - Rather than using SRQ to which the QP is attached,
>         SRQ is looked up by SRQ number in the packet.
>         Receive WR is taken from this SRQ.
>       - Completions are generated on the CQ specified in the SRQ
>
>   - RDMA/Atomic
>       - Rather than using PD to which the QP is attached,
>         SRQ is looked up by SRQ number in the packet.
>         PD of this SRQ is used for protection checks.
> ===================================================================
>
> --
> MST
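For anyone trying to picture the responder side described under "Protocol extension" in the quoted message, the following is a toy model of the SRQ lookup and SRC domain check. It is not the verbs API: every class, field, and function name here is hypothetical. It also makes two assumptions the message does not state: an unknown SRQ number is treated the same as a domain mismatch, and a receive WR is always posted when a Send arrives.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Srq:
    """Hypothetical SRQ carrying the three new SRC attributes plus its PD."""
    srq_number: int
    src_domain: int
    pd: str
    cq: list = field(default_factory=list)          # completions land here
    recv_wrs: deque = field(default_factory=deque)  # posted receive WRs


@dataclass
class SrcQp:
    """Hypothetical SRC QP: has an SRC domain but no SRQ of its own."""
    src_domain: int
    state: str = "RTS"


def lookup_srq(qp, srq_table, srq_number):
    """Find the SRQ named in the packet header and check its SRC domain.

    Returns the SRQ, or None after moving the QP to the error state.
    Assumption: an unknown SRQ number is handled like a domain mismatch.
    """
    srq = srq_table.get(srq_number)
    if srq is None or srq.src_domain != qp.src_domain:
        qp.state = "ERROR"
        return None
    return srq


def handle_send(qp, srq_table, srq_number, payload):
    """RC Send: take the receive WR from the looked-up SRQ, complete on its CQ."""
    srq = lookup_srq(qp, srq_table, srq_number)
    if srq is None:
        return "NAK"
    wr_id = srq.recv_wrs.popleft()  # assumes a receive WR has been posted
    srq.cq.append((wr_id, payload))
    return "ACK"


def pd_for_rdma(qp, srq_table, srq_number):
    """RDMA/Atomic: protection checks use the PD of the looked-up SRQ."""
    srq = lookup_srq(qp, srq_table, srq_number)
    return None if srq is None else srq.pd


# Two jobs on one node, each with its own SRQ; only job A shares the QP's domain.
table = {
    7: Srq(7, src_domain=1, pd="pd-job-a"),
    9: Srq(9, src_domain=2, pd="pd-job-b"),
}
table[7].recv_wrs.append("wr-0")
qp = SrcQp(src_domain=1)

first = handle_send(qp, table, 7, b"hello")   # domains match
second = handle_send(qp, table, 9, b"oops")   # domain mismatch
print(first, second, qp.state)
```

The point of the sketch is that the QP itself only carries the SRC domain; which job receives the data is decided per packet by the SRQ number.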