[ewg] Re: RFCv2: SRC API
> Only one of the jobs among j2, j3, j4 on remote node n needs to create
> a receiving qp2 for j1, right?

Correct. A single QP can be used to send data to any SRQ that shares the same domain.

-- MST
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg
[ewg] RE: RFCv2: SRC API
> > OK, I was wrong before, here is my question.
> > If remote node n has j2, j3, and j4, and j2 is the job that creates qp2
> > and makes the connection with qp1 in j1: if j2 is done before j3 and j4,
> > then we cannot let j2 destroy qp2, because j3 and j4 are still
> > communicating with j1. Since j2 owns qp2, j2 needs to be the last job
> > to clean up. Am I right?
>
> Correct. Is this clear from the text, or is some kind of additional
> clarification necessary?

It is not clear at the first read, so please add one sentence to clarify it.

If j2 is the last job to clean up, how can it know that all other jobs on the same node have called ibv_close_src_domain(), and that it is time for itself to clean up? Is this something that is up to the application to do?

--CQ

> -- MST
[ewg] Re: RFCv2: SRC API
Quoting Tang, Changqing [EMAIL PROTECTED]:
Subject: RE: RFCv2: SRC API

> > > OK, I was wrong before, here is my question.
> > > If remote node n has j2, j3, and j4, and j2 is the job that creates qp2
> > > and makes the connection with qp1 in j1: if j2 is done before j3 and j4,
> > > then we cannot let j2 destroy qp2, because j3 and j4 are still
> > > communicating with j1. Since j2 owns qp2, j2 needs to be the last job
> > > to clean up. Am I right?
> >
> > Correct. Is this clear from the text, or is some kind of additional
> > clarification necessary?
>
> It is not clear at the first read, so please add one sentence to clarify it.

Would something like this help?

 Cleanup:
 When job j1 does not need to communicate to any jobs on node n,
 it disconnects qp1 from qp2, and asks j2 to destroy qp2.
+
+Note: both qp1 and qp2 must exist for the communication to take place.
+Thus, j2 should not destroy qp2 (and in particular, should not exit)
+until j1 has completed communication with node n and
+has asked j2 to disconnect.

> If j2 is the last job to clean up, how can it know that all other jobs on
> the same node have called ibv_close_src_domain(), and that it is time for
> itself to clean up? Is this something that is up to the application to do?

No, this is handled automatically. Have you seen this text?

 * ibv_close_src_domain - close an SRC domain
 * If this is the last reference, destroys the domain.

So, each job has a reference to the domain. Once the last reference is gone, the domain is destroyed.

-- MST
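The "last reference destroys the domain" rule quoted in the message above can be illustrated with a small reference-counting model. This is only a sketch: `struct toy_src_domain`, `toy_open_src_domain` and `toy_close_src_domain` are hypothetical names invented for illustration, not the real verbs structures or calls, and the actual ibv_close_src_domain signature is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SRC domain; NOT the real libibverbs structure. */
struct toy_src_domain {
    int  refcount;   /* number of jobs that currently hold the domain open */
    bool destroyed;  /* set once the last reference has been dropped */
};

/* Hypothetical analogue of a job opening or attaching to the domain:
 * each job that does so takes one reference. */
static void toy_open_src_domain(struct toy_src_domain *d)
{
    assert(d != NULL && !d->destroyed);
    d->refcount++;
}

/* Hypothetical analogue of ibv_close_src_domain: drops one reference.
 * Returns true only if this call dropped the last reference, in which
 * case the domain is marked destroyed. */
static bool toy_close_src_domain(struct toy_src_domain *d)
{
    assert(d != NULL && d->refcount > 0);
    if (--d->refcount == 0) {
        d->destroyed = true;
        return true;
    }
    return false;
}
```

In this model no job needs to know whether the others have already closed the domain: each job simply closes its own reference when it is done, and whichever close happens to be last triggers the destruction. Note that this covers the domain only; when qp2 may be destroyed is the separate question addressed by the proposed note.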
[ewg] RE: RFCv2: SRC API
> Cleanup:
> When job j1 does not need to communicate to any jobs on node n,
> it disconnects qp1 from qp2, and asks j2 to destroy qp2.
> +
> +Note: both qp1 and qp2 must exist for the communication to take place.
> +Thus, j2 should not destroy qp2 (and in particular, should not exit)
> +until j1 has completed communication with node n and has asked j2 to
> +disconnect.

Thanks. Another question.

If a node n has 8 jobs, say j2-j9, usually the first job j2 is the one that creates the SRC domain (the other jobs just attach and share), and it makes sense to let j2 create all the receiving QPs for all the remote jobs and make all the connections. (We could do it in a round-robin way, but that is more work.)

Is there any performance worry in letting j2 (the first job on a node) do all the work? What is the latency of SRC+SRQ?

--CQ
[ewg] Re: RFCv2: SRC API
Quoting Tang, Changqing [EMAIL PROTECTED]:
Subject: RE: RFCv2: SRC API

> > Cleanup:
> > When job j1 does not need to communicate to any jobs on node n,
> > it disconnects qp1 from qp2, and asks j2 to destroy qp2.
> > +
> > +Note: both qp1 and qp2 must exist for the communication to take place.
> > +Thus, j2 should not destroy qp2 (and in particular, should not exit)
> > +until j1 has completed communication with node n and has asked j2 to
> > +disconnect.
>
> Thanks. Another question.
>
> If a node n has 8 jobs, say j2-j9, usually the first job j2 is the one
> that creates the SRC domain (the other jobs just attach and share), and it
> makes sense to let j2 create all the receiving QPs for all the remote jobs
> and make all the connections. (We could do it in a round-robin way, but
> that is more work.)

Sure, creating all connections upfront will work too; this is just a usage example.

> Is there any performance worry in letting j2 (the first job on a node)
> do all the work?

How do you mean?

> What is the latency of SRC+SRQ?

I'd expect it to be more or less the same as regular SRQ.

-- MST
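For comparison with the "j2 creates everything" approach, the round-robin alternative mentioned in the thread could assign receiving-QP creation along the following lines. This is a rough sketch under assumptions: `toy_rr_owner` is a hypothetical helper, and numbering remote jobs with a zero-based index is a choice made here for illustration, not something the RFC specifies.

```c
#include <assert.h>

/* Hypothetical helper: pick which local job creates the receiving QP for
 * a given remote job, spreading the work round-robin over the local jobs.
 *
 * remote_index    - zero-based index of the remote job (assumed numbering)
 * first_local_job - number of the first local job (2 for j2)
 * num_local_jobs  - how many jobs run on this node (8 for j2-j9)
 *
 * Returns the number of the local job that should own the receiving QP. */
static int toy_rr_owner(int remote_index, int first_local_job,
                        int num_local_jobs)
{
    assert(remote_index >= 0 && num_local_jobs > 0);
    return first_local_job + remote_index % num_local_jobs;
}
```

With jobs j2-j9 on the node, remote index 0 maps to j2, index 7 to j9, and index 8 wraps back to j2. Whichever job ends up owning a receiving QP is then the one that must keep it alive until the remote side has asked for it to be destroyed, so spreading ownership also spreads that cleanup obligation.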