On Wed, Jul 18, 2007 at 03:53:36PM -0700, Sean Hefty wrote:

> There are a couple of benefits. The number of PR queries is reduced
> from O(n^2) to O(n). The queries can also be done once up front,
> even started at different times if needed, rather than all at once
> at job startup. The jobs are also able to make progress even if the
> SA dies or is unreachable.

Do you mean each node changes from O(local_cpus*nodes) -> O(nodes)?
Globally, from a cold cache start, you should still be O(n^2)?

> >I'm trying to say, I think a simple kernel cache itself is fine, but
> >there should be only 1 cache (get rid of ipoib) and it should have a
> >really good interface to userspace so that the really hard problems
> >can be solved through user space code.
>
> I don't disagree, but (for now anyway) I believe that the natural
> interface for communicating with an SA related agent is a MAD
> interface based on the SA management class for the reasons I
> mentioned earlier. But this is really talking about extensions to
> the local SA patch, rather than addressing anything fundamentally
> wrong with the current patch set.

OK - that's fine then. When you get around to doing the user space
side I'll argue for netlink :) Having written both netlink user space
code and MAD code, I can say netlink is way better!

The only other thing I'd see is that for the cache to be on by default
(ie included by default in distro kernels) it really needs a short
default lifetime for cached entries, as a workaround for the lack of a
coherence protocol.

Jason
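
To make the short-lifetime idea concrete, here is a rough sketch of a per-node cache whose entries expire after a fixed time, so a stale path record is re-queried instead of being trusted forever. It is only an illustration in Python, not the kernel patch: the class name ExpiringPathCache, the query_sa callback, the ttl value and the record layout are all made up for the example.

```python
import time


class ExpiringPathCache:
    """Hypothetical per-node path record cache with a short entry lifetime."""

    def __init__(self, query_sa, ttl=60.0, clock=time.monotonic):
        self._query_sa = query_sa   # assumed callback that asks the SA for a record
        self._ttl = ttl             # lifetime of a cached entry, in seconds
        self._clock = clock         # injectable so expiry can be tested without sleeping
        self._entries = {}          # dest -> (expiry time, record)
        self.sa_queries = 0         # number of queries that actually reached the SA

    def lookup(self, dest):
        now = self._clock()
        hit = self._entries.get(dest)
        if hit is not None and now < hit[0]:
            return hit[1]           # entry is still fresh: no SA traffic
        # Missing or expired entry: go back to the SA and refresh the cache.
        record = self._query_sa(dest)
        self.sa_queries += 1
        self._entries[dest] = (now + self._ttl, record)
        return record


if __name__ == "__main__":
    fake_now = [0.0]
    cache = ExpiringPathCache(lambda dest: {"dest": dest}, ttl=5.0,
                              clock=lambda: fake_now[0])
    for _ in range(8):              # e.g. 8 local processes asking for one remote node
        cache.lookup("node-7")
    print(cache.sa_queries)         # one SA query serves all 8 lookups
    fake_now[0] = 6.0               # move past the entry lifetime
    cache.lookup("node-7")
    print(cache.sa_queries)         # the expired entry forced a second query
```

With something like this each node sends at most one query per destination per lifetime, regardless of how many local processes ask, and the expiry bounds how long a stale entry can survive without any coherence traffic.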
