[ewg] Re: Scalable reliable connection
On Mon, Jul 30, 2007 at 03:50:54PM +0300, Michael S. Tsirkin wrote:
> With SRC:
> O(N ^ 2 * J)
>
> This is achieved by using a single send queue (per job, out of O(N * J) jobs)
> to send data to all J jobs running on a specific node (out of O(N) nodes).
> Hardware uses a new SRQ number field in the packet header to multiplex receive
> WRs and WCs to the private memory of each job.

But since the send queue cannot be used for receiving packets, additional receive QPs have to be created, one per job, so with SRC it is actually O(N ^ 2 * J + N * J), unless I am missing something.

> This is a similar idea to IB RD.

Except that with RD there is no need to jump through hoops and create separate QPs for sending and receiving packets in order to achieve scalability.

> Q: Why not use RD then?
> A: Because no hardware supports it.

Wrong answer :) There was no HW for SRC either, but Mellanox decided to implement SRC instead of RD. The reasons Dror provided for this:

a) RD is hard to do
   Not a very sound reason IMO. Not doing RD just pushes the complexity from HW to SW. And there are HW implementations of RD, though not for IB.
b) RD, as defined by the IB spec, will not achieve good performance
   This reason is serious, but can the spec be changed to allow for a high-performance implementation? Spec compliance is not something that has stopped Mellanox from doing things before :)

Thanks for the protocol explanation.

--
Gleb.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] Re: Scalable reliable connection
Quoting Gleb Natapov [EMAIL PROTECTED]:
Subject: Re: Scalable reliable connection

> On Mon, Jul 30, 2007 at 03:50:54PM +0300, Michael S. Tsirkin wrote:
> > With SRC:
> > O(N ^ 2 * J)
> >
> > This is achieved by using a single send queue (per job, out of O(N * J) jobs)
> > to send data to all J jobs running on a specific node (out of O(N) nodes).
> > Hardware uses a new SRQ number field in the packet header to multiplex receive
> > WRs and WCs to the private memory of each job.
>
> But since the send queue cannot be used for receiving packets, additional
> receive QPs have to be created, one per job, so with SRC it is actually
> O(N ^ 2 * J + N * J), unless I am missing something.

Yes, but since N >= 1, N ^ 2 >= N, and so O(N ^ 2 * J + N * J) == O(N ^ 2 * J).

--
MST
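The bound argued over in this exchange can be checked with a small counting sketch. It assumes the model described in the thread: N nodes with J jobs each, one SRC send queue per job per destination node, and one receive QP per job. The function name and the sample sizes are illustrative only and are not part of any verbs API.

```python
def src_queue_counts(n_nodes: int, jobs_per_node: int) -> dict:
    """Count queues needed with SRC under the thread's model (illustrative)."""
    total_jobs = n_nodes * jobs_per_node
    # Each job keeps one send queue per destination node: (N * J) * N.
    send_queues = total_jobs * n_nodes
    # A send queue cannot receive, so each job also needs one receive QP.
    receive_qps = total_jobs
    return {
        "send": send_queues,
        "receive": receive_qps,
        "total": send_queues + receive_qps,
    }


if __name__ == "__main__":
    for n, j in [(1, 8), (16, 8), (1024, 8)]:
        counts = src_queue_counts(n, j)
        # For N >= 1 the receive term never exceeds the send term,
        # so the total is at most 2 * N^2 * J, i.e. still O(N^2 * J).
        assert counts["receive"] <= counts["send"]
        assert counts["total"] <= 2 * n * n * j
        print(n, j, counts)
```

The extra N * J receive QPs are real, but they are dominated by the N ^ 2 * J send queues as soon as there is more than one node, which is why both expressions describe the same asymptotic growth.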
[ewg] Re: Scalable reliable connection
Quoting Tang, Changqing [EMAIL PROTECTED]:
Subject: RE: Scalable reliable connection

> A send queue can only serve at most J jobs within a node. Is it possible to
> make a single send queue serve all jobs on all nodes?

How do you propose to do this?

--
MST
[ewg] RE: Scalable reliable connection
In this way, only one send queue is needed for each job (process), and we don't need to track the location of each other job (which job is on which node). From a job's point of view, there is either self or others, and all others are equal...

--CQ

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Tuesday, July 31, 2007 11:16 AM
To: Tang, Changqing
Cc: Michael S. Tsirkin; Gleb Natapov; Pavel Shamis; ewg@lists.openfabrics.org; [EMAIL PROTECTED]; Ishai Rabinovitz
Subject: Re: Scalable reliable connection

Quoting Tang, Changqing [EMAIL PROTECTED]:
Subject: RE: Scalable reliable connection

> A send queue can only serve at most J jobs within a node. Is it possible to
> make a single send queue serve all jobs on all nodes?

How do you propose to do this?

--
MST