RE: Scalability Efforts
On Thu, 7 Sep 2000, Marty Fouts wrote:

> FWIW, large system scalability, especially NUMA, is not tractable
> with a 'one size (algorithm) fits all' approach, and can be a
> significant test of the degree of modularity in your system.
> Different relative costs of access to the different levels of
> the memory hierarchy and different models of cache concurrency,
> especially, tend to make what works for system A be maximally
> pessimal for system B.

The first thing to tackle is making sure that a lot of the global things in the system become per-node. A change like that will increase scalability on various kinds of hardware while not adversely affecting performance on other systems.

Other scalability work will have to receive more care so as not to upset existing systems, but I'm sure we'll be able to get a long way by simply moving things into node-local structures...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!" -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/ http://www.surriel.com/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
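Rik's per-node suggestion can be sketched in a few lines of illustrative C. This is not actual kernel code: the structure names, node count, and cache-line size below are assumptions made for the example. The point is that the fast path touches only the local node's cache line, and cross-node traffic happens only when a global view is needed.

```c
/* Hypothetical sketch of per-node counters replacing one global counter.
 * MAX_NODES, the node ids, and the 64-byte line size are assumptions. */

#define MAX_NODES 4
#define CACHE_LINE 64

struct node_data {
    long nr_free_pages;
    char pad[CACHE_LINE - sizeof(long)];  /* one node per cache line */
};

static struct node_data nodes[MAX_NODES];

/* Fast path: update only the local node's counter, no cross-node traffic. */
void node_free_pages_add(int node, long pages)
{
    nodes[node].nr_free_pages += pages;
}

/* Slow path: sum across all nodes only when a global total is wanted. */
long total_free_pages(void)
{
    long sum = 0;
    for (int i = 0; i < MAX_NODES; i++)
        sum += nodes[i].nr_free_pages;
    return sum;
}
```

On hardware without a NUMA split, the per-node array degenerates to a small fixed cost, which is why this kind of change need not hurt other systems.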
RE: Scalability Efforts
I don't know. It may well be that by the time Linux is seriously in contention for cc-NUMA, the number of architectures will be seriously reduced, in much the same way that the number of architectures for general-purpose computers got shaken out in the '80s and '90s. In that case, my dire warning won't really matter, and best practice and possibly even simple algorithms will work.

One of the things I investigated at HP Labs in the mid-'90s was on-the-fly configuration by algorithm substitution. There was some good work on the underlying technology for the nitty-gritty part of substituting algorithms done at the Oregon Graduate Center. This is substitution, rather than run-time choice, to avoid the overhead of making the algorithm-choice branch frequently at runtime, and is an attempt to generalize techniques like back-patching the memory copy routines at boot time.

As we all know, the major problem with one-size-fits-all algorithms is scalability. Algorithms that are efficient for small N in the order statistic don't scale well, but algorithms for large N tend to have too much overhead to justify using them when N is small. List management (of which operating systems are major users) gives a trivial example. A list that holds at most a half dozen items is trivial to maintain in sorted order and to search linearly, while one with thousands of entries and frequent insertions requires data structures that would have outrageous overhead for small N, and may not be kept in sorted order at all.

cc-NUMA complicates the problem because not only do you have the dimension of growth to take into account, which could probably be coped with by generalizations of back-patching, but you also have variation in system design. Relative costs of coherence change at different numbers of processors, and some systems have complicated memory hierarchies while others have a small number of levels.
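The list-management example above can be made concrete. The following is a minimal sketch, not taken from any real system: a bounded sorted array with O(n) insertion and linear search, which is the cheap structure for a half dozen items. For thousands of entries with frequent insertions, the shifting in `small_insert` would dominate, and a tree or hash would be the right choice instead.

```c
/* Illustrative small-N list: sorted array, linear everything.
 * SMALL_MAX and the names are made up for this example. */
#define SMALL_MAX 8

struct small_list {
    int n;
    int v[SMALL_MAX];
};

/* O(n) sorted insert: fine when n is at most a half dozen.
 * Returns 0 on success, -1 when the list is full. */
int small_insert(struct small_list *l, int key)
{
    if (l->n == SMALL_MAX)
        return -1;
    int i = l->n;
    while (i > 0 && l->v[i - 1] > key) {  /* shift larger items up */
        l->v[i] = l->v[i - 1];
        i--;
    }
    l->v[i] = key;
    l->n++;
    return 0;
}

/* Linear search: at this size it beats any clever structure's overhead.
 * Returns the index of key, or -1 if absent. */
int small_find(const struct small_list *l, int key)
{
    for (int i = 0; i < l->n; i++)
        if (l->v[i] == key)
            return i;
    return -1;
}
```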
I've worked with machines where the "right" approach was to treat them as clusters of SMPs, effectively requiring an approach in which each "virtual" SMP ran its own independent kernel, and a lot of work had to be done to provide a single-system-image model and to support very efficient in-cluster IPC between the kernels. (See, for instance, my idea "Sender Mediated Communication," and John Wilkes' patent on a related idea, documented in the Hamlyn papers.) On other machines, there wasn't such a break in the memory hierarchy, and running the whole thing as one virtual SMP with algorithms tuned to the cost of sharing was the right approach.

These systems may differ significantly in the source code base of the implementation. Processor scheduling, paging, even I/O processing models can vary radically between machines. (Try optimizing a Tera using the same processor scheduling algorithms as are appropriate for a T3, for example.)

But, as I say, this may all be a red herring, because the marketplace doesn't tolerate a lot of diversity, and many of the interesting architectures that the research community has worked on may never appear in any significant numbers in the marketplace. If you are able to limit your solution to a fairly narrow class of cc-NUMA machines, then the problem really becomes simply the run-time replacement of algorithms based on system size.

Marty

-Original Message-
From: Jesse Noller [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 2:46 PM
To: Marty Fouts; [EMAIL PROTECTED]
Subject: RE: Scalability Efforts

But would it be possible to apply a sort of "Linux Server Tuning Best Practices" method to items not unlike NUMA, but more specific to, say, webserving and file serving? (This is a project I am working on: finding kernel and Apache tuning guidelines for maximum file/web serving speed with the 2.4 kernel.) If anyone has any pointers, please let me know.
-Jesse

-Original Message-
From: Marty Fouts [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 5:30 PM
To: [EMAIL PROTECTED]
Subject: RE: Scalability Efforts

FWIW, large system scalability, especially NUMA, is not tractable with a 'one size (algorithm) fits all' approach, and can be a significant test of the degree of modularity in your system. Different relative costs of access to the different levels of the memory hierarchy and different models of cache concurrency, especially, tend to make what works for system A be maximally pessimal for system B.

marty

...
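Marty's closing point, run-time replacement of algorithms based on system size, can be sketched with an ordinary function pointer resolved once at boot, so there is no per-call branch on the fast path. This is only an illustration of the idea, not kernel code; all names below are invented, and real back-patching would rewrite the call sites rather than indirect through a pointer.

```c
/* Hypothetical boot-time algorithm substitution: pick the search
 * routine once, based on system size, instead of branching per call. */

static int linear_find(const int *a, int n, int key)
{
    /* Cheap for small n; no requirement that a[] be sorted. */
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}

static int binary_find(const int *a, int n, int key)
{
    /* Pays off for large n; requires a[] to be sorted ascending. */
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (a[mid] == key) return mid;
        if (a[mid] < key)  lo = mid + 1;
        else               hi = mid - 1;
    }
    return -1;
}

/* The "substituted" entry point: resolved once, at init time. */
static int (*find_op)(const int *, int, int);

void choose_algorithms(int expected_n)
{
    /* The threshold of 8 is an arbitrary stand-in for a tuned value. */
    find_op = (expected_n <= 8) ? linear_find : binary_find;
}
```

Callers then use `find_op()` everywhere; only `choose_algorithms()` ever looks at the system size.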
RE: Scalability Efforts
But would it be possible to apply a sort of "Linux Server Tuning Best Practices" method to items not unlike NUMA, but more specific to, say, webserving and file serving? (This is a project I am working on: finding kernel and Apache tuning guidelines for maximum file/web serving speed with the 2.4 kernel.) If anyone has any pointers, please let me know.

-Jesse

-Original Message-
From: Marty Fouts [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 5:30 PM
To: [EMAIL PROTECTED]
Subject: RE: Scalability Efforts

FWIW, large system scalability, especially NUMA, is not tractable with a 'one size (algorithm) fits all' approach, and can be a significant test of the degree of modularity in your system. Different relative costs of access to the different levels of the memory hierarchy and different models of cache concurrency, especially, tend to make what works for system A be maximally pessimal for system B.

marty

...
RE: Scalability Efforts
FWIW, large system scalability, especially NUMA, is not tractable with a 'one size (algorithm) fits all' approach, and can be a significant test of the degree of modularity in your system. Different relative costs of access to the different levels of the memory hierarchy and different models of cache concurrency, especially, tend to make what works for system A be maximally pessimal for system B.

marty

-Original Message-
From: Rik van Riel [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 11:20 AM
To: Henry Worth
Cc: [EMAIL PROTECTED]
Subject: Re: Scalability Efforts

On Thu, 7 Sep 2000, Henry Worth wrote:

> With all the talk of improving Linux's scalability to
> large-scale SMP and ccNUMA platforms -- including efforts
> at several HW companies and now OSDL forming to throw
> hardware at the effort -- is there any move afoot to
> coordinate these efforts?

Nothing coordinated, AFAIK...

> Or is it all, whatever there may be of it, taking
> place offline?

Most of the times I've talked about this topic it was in person with other developers at various conferences.

For the VM subsystem and the scheduler we have some ideas to improve scalability for NUMA machines. It's not been implemented yet, but for most of it the design seems to be pretty OK and ready to be implemented for 2.5. OTOH, some of the devilish details still aren't resolved. If there are people interested in discussing this topic, I'll set up a mailing list for it...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!" -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/ http://www.surriel.com/
Re: Scalability Efforts
On Thu, 7 Sep 2000, Henry Worth wrote:

> With all the talk of improving Linux's scalability to
> large-scale SMP and ccNUMA platforms -- including efforts
> at several HW companies and now OSDL forming to throw
> hardware at the effort -- is there any move afoot to
> coordinate these efforts?

Some links which may be useful:

http://oss.sgi.com/projects/linux-scalability/
http://www.citi.umich.edu/projects/linux-scalability/
Re: Scalability Efforts
Rik van Riel wrote:

> On Thu, 7 Sep 2000, Henry Worth wrote:
> >
> > Or is it all, whatever there may be of it, taking
> > place offline?
>
> Most of the times I've talked about this topic it
> was in person with other developers at various
> conferences.

Ugh, no wonder I never see this. Guess it's time to get some trips paid for so I can stand near you guys and listen to what's going on ;)

> For the VM subsystem and the scheduler we have some
> ideas to improve scalability for NUMA machines. It's
> not been implemented yet, but for most of it the
> design seems to be pretty ok and ready to be implemented
> for 2.5.

Is any of this documented in any way? I'm the one that booted Linux on a 31-CPU (one CPU was bad at the time), 256GB GS-series AlphaServer, and NUMA has always been a favorite plaything of mine. :)

> OTOH, some of the devilish details still aren't resolved.
> If there are people interested in discussing this topic,
> I'll set up a mailing list for it ...

Sure, unless people really want it to stay here. Heck, I'd even sign up twice just to make sure I don't miss anything. ;)

- Pete
Re: Scalability Efforts
On Thu, 7 Sep 2000, Henry Worth wrote:

> With all the talk of improving Linux's scalability to
> large-scale SMP and ccNUMA platforms -- including efforts
> at several HW companies and now OSDL forming to throw
> hardware at the effort -- is there any move afoot to
> coordinate these efforts?

Nothing coordinated, AFAIK...

> Or is it all, whatever there may be of it, taking
> place offline?

Most of the times I've talked about this topic it was in person with other developers at various conferences.

For the VM subsystem and the scheduler we have some ideas to improve scalability for NUMA machines. It's not been implemented yet, but for most of it the design seems to be pretty OK and ready to be implemented for 2.5. OTOH, some of the devilish details still aren't resolved. If there are people interested in discussing this topic, I'll set up a mailing list for it...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!" -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/ http://www.surriel.com/