>> There are a couple of benefits.  The number of PR queries is reduced
>> from O(n^2) to O(n).  The queries can also be done once up front,
>> even started at different times if needed, rather than all at once
>> at job startup.  The jobs are also able to make progress even if the
>> SA dies or is unreachable.
>
>Do you mean each node changes from O(local_cpus*nodes) -> O(nodes) ?
>Globally, from cold cache start you should still be O(n^2)?
Each node goes from O(processes * nodes) -> O(1).  The local SA does a
single GetTable query to obtain all PRs, whereas applications do one PR
query for each connection.

>OK - thats fine then. When you get around to doing the user space side
>I'll argue for netlink :) Having written both netlink user space code
>and mad code, I can say netlink is way better!

We can thumb wrestle.  (I would never argue that the IB MAD interface is
great.)  I'm suggesting that we want an interface that allows an
application running on a remote node to control local SA policy, and that
the message format should be similar to SA MADs.  My hope is that we can
create an interface that will be usable for QoS purposes as well.

I will start an open thread on this once the QoS is released, and I've
had time to think about more of the details.

- Sean
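To make the query counts above concrete, here is a minimal sketch in Python. It is a back-of-the-envelope model only, not code from any IB stack: the function names, the all-to-all connection pattern, and the example figure of 8 processes per node are all assumptions made for illustration. It compares one PR query per connection against a single GetTable query per node.

```python
# Rough model of SA path record (PR) query counts. All names and the
# all-to-all connection pattern are illustrative assumptions.

def queries_per_node_direct(processes_per_node: int, nodes: int) -> int:
    """Each local process issues one PR query per remote node it connects to."""
    return processes_per_node * (nodes - 1)


def queries_per_node_cached() -> int:
    """The local SA issues a single GetTable query that returns all PRs."""
    return 1


def fabric_totals(processes_per_node: int, nodes: int) -> tuple[int, int]:
    """Return (direct, cached) query totals summed over every node."""
    direct = nodes * queries_per_node_direct(processes_per_node, nodes)
    cached = nodes * queries_per_node_cached()
    return direct, cached


if __name__ == "__main__":
    # 8 processes per node is an arbitrary example value.
    for nodes in (16, 256, 1024):
        direct, cached = fabric_totals(8, nodes)
        print(f"nodes={nodes:5d}  direct={direct:9d}  cached={cached:5d}")
```

Under these assumptions the per-node count drops from processes * (nodes - 1) to 1, so the total seen by the SA grows linearly with the number of nodes rather than quadratically.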
