> From: Michael S. Tsirkin
> Sent: Thursday, November 02, 2006 5:55 PM
> To: Arlin Davis
> Cc: Or Gerlitz; openib-general; Arlin Davis
> Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq
>
> Quoting r. Arlin Davis <[EMAIL PROTECTED]>:
> > Subject: Re: [openib-general] scaling issues, was: uDAPL cma: add support for address and route retries, call disconnect when recving dreq
> >
> > Sean Hefty wrote:
> >
> > >One option is having the SA (or ib_umad?) return a busy status in response to a
> > >MAD, but we'd still have to be able to send this response as quickly as requests
> > >are being received. We could then limit the number of requests that would be
> > >queued in the kernel for a user.
> >
> > Another great option would be to have path record caching. Unfortunately
> > OFED 1.1 did not include ib_local_sa in the release.
>
> This won't help you much.
> With 256 nodes all to all already gives you 65000 requests
> which is the same order of magnitude as the reported 130000.
We have SA caching working quite well with very large clusters. Here are some techniques which make it much more efficient:

1. A given node only cares about path records relevant to it, so only ask for path records where it is the source.

2. Use SA notices for GID in/out of service to trigger cache updates, and only then for the specific GID which has changed. As background, refresh all cache entries slowly and infrequently, just in case a notice was lost. However, IBTA does allow retries and Acks of notices, so lost notices will be infrequent.

3. Limit the number of outstanding SA queries from a given node. This avoids one node blasting the SM.

There is a little more to it, but those should be the main points relevant to this discussion.

Todd Rimmer
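
As a rough illustration of the three techniques above, here is a minimal Python sketch. It is not taken from any existing implementation: the class name PathRecordCache, the send_query callback, and the handler names are all hypothetical, and no real SA or ib_umad API is used. It only shows how the pieces could fit together on one node.

```python
from collections import OrderedDict, deque


class PathRecordCache:
    """Toy per-node path record cache (illustrative only, not a real IB API)."""

    def __init__(self, local_gid, send_query, max_outstanding=4):
        self.local_gid = local_gid
        self.send_query = send_query      # hypothetical: send_query(src_gid, dst_gid)
        self.max_outstanding = max_outstanding
        self.known = set()                # GIDs currently believed to be in service
        self.records = OrderedDict()      # dst_gid -> record, least recently refreshed first
        self.pending = deque()            # destinations waiting for a query slot
        self.outstanding = set()          # destinations with a query in flight

    def _request(self, dst_gid):
        # Technique 1: only ask for paths whose source is this node.
        if dst_gid == self.local_gid:
            return
        if dst_gid in self.outstanding or dst_gid in self.pending:
            return                        # a query for this GID is already queued
        self.pending.append(dst_gid)
        self._pump()

    def _pump(self):
        # Technique 3: cap the number of SA queries in flight from this node.
        while self.pending and len(self.outstanding) < self.max_outstanding:
            dst_gid = self.pending.popleft()
            self.outstanding.add(dst_gid)
            self.send_query(self.local_gid, dst_gid)

    def on_response(self, dst_gid, record):
        # The SA answered: free the slot, then start the next queued query.
        self.outstanding.discard(dst_gid)
        if dst_gid in self.known:         # drop answers for GIDs that have since left
            self.records[dst_gid] = record
            self.records.move_to_end(dst_gid)
        self._pump()

    def on_gid_in_service(self, gid):
        # Technique 2: a notice triggers an update for the changed GID only.
        self.known.add(gid)
        self._request(gid)

    def on_gid_out_of_service(self, gid):
        self.known.discard(gid)
        self.records.pop(gid, None)
        if gid in self.pending:
            self.pending.remove(gid)

    def refresh_oldest(self):
        # Slow background refresh: re-query one entry per call, in case a
        # notice was lost. The caller decides how rarely to call this.
        if self.records:
            self._request(next(iter(self.records)))
```

With N nodes, each node then holds at most N - 1 records and never has more than max_outstanding queries at the SA at once, so the total load on the SM is bounded by the per-node limit times the node count rather than by how fast applications ask for paths.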