RE: Scalability Efforts

2000-09-08 Thread Rik van Riel

On Thu, 7 Sep 2000, Marty Fouts wrote:

> FWIW, large system scalability, especially NUMA is not tractable
> with a 'one size (algorithm) fits all' approach, and can be a
> significant test of the degree of modularity in your system.

> Different relative costs of access to the different levels of
> the memory hierarchy and different models of cache concurrency,
> especially, tend to make what works for system A be maximally
> pessimal for system B.

The first thing to tackle is making sure a lot of global
things in the system become per-node.

Something like that will increase scalability on various
kinds of hardware while not affecting performance adversely
on other systems.

Other scalability things will have to receive more care
so as not to upset existing systems, but I'm sure we'll be
able to get a long way by simply moving things into local
structures...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/



RE: Scalability Efforts

2000-09-07 Thread Marty Fouts

I don't know. It may well be that by the time Linux is seriously in
contention for cc-NUMA, the number of architectures will be seriously
reduced, in much the same way that the number of architectures for general
purpose computers got shaken out in the '80s and '90s.  In that case, my dire
warning won't really matter, and best practice and possibly even simple
algorithms will work.

One of the things I investigated at HP Labs in the mid 90s was on-the-fly
configuration by algorithm substitution.  There was some good work on
underlying technology for the nitty-gritty of substituting algorithms done at
the Oregon Graduate Center.  This is substitution, rather than run time
choice, to avoid the overhead of making the algorithm-choice branch
frequently at runtime, and is an attempt to generalize techniques like
back-patching the memory copy routines at boot time.

As we all know, the major problem with one-size-fits-all algorithms is
scalability.  Algorithms that are efficient for small N in the order
statistic don't scale well, but algorithms for large N tend to have too much
overhead to justify using them when N is small.  List management (of which
operating systems are major users) gives a trivial example.  A list that
might have a half dozen items at most is trivial to maintain in sorted order
and to search linearly, while one with thousands of entries and frequent
insertions requires data structures that would have outrageous overhead for
small N, and may never be kept in sorted order at all.

cc-NUMA complicates the problem because not only do you have the dimension
of growth to take into account, which could probably be coped with by
back-patching generalizations, but you also have variation in system design.
Relative costs of coherence change at different numbers of processors, and
some systems have complicated memory hierarchies while others tend to have a
small number of levels.  I've worked with machines where the "right"
approach was to treat them as clusters of SMPs, effectively requiring an
approach in which each "virtual" SMP ran its own independent kernel and a
lot of work had to be done to provide a single system image model and
support very efficient in-cluster IPC between the kernels.  (See, for
instance, my idea "Sender Mediated Communication," and John Wilkes' patent
on a related idea, documented in the Hamlyn papers.) On other machines,
there wasn't such a break in the memory hierarchy and running the whole
thing as one virtual SMP with algorithms tuned to the cost of sharing was
the right approach.  These systems may differ significantly in the source
code base of the implementation.  Processor scheduling, paging, even I/O
processing models can vary radically between machines.  (Try optimizing a
Tera using the same processor scheduling algorithms as are appropriate for a
T3, for example.)

But, as I say, this may all be a red herring, because the marketplace
doesn't tolerate a lot of diversity, and many of the interesting
architectures that the research community have worked on may never appear in
any significant numbers in the marketplace.  If you are able to limit your
solution to the problem to a fairly narrow class of cc-NUMA machines, then
the problem really becomes simply the run-time replacement of algorithms
based on system size.

Marty

-Original Message-
From: Jesse Noller [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 2:46 PM
To: Marty Fouts; [EMAIL PROTECTED]
Subject: RE: Scalability Efforts


But would it be possible to apply a sort of "Linux Server Tuning Best
Practices" method to items not unlike NUMA, but more specific to say,
webserver and file serving?

(this is a project i am working on, finding kernel and apache tuning
guidelines for maximum File/Web serving speed w/ the 2.4 kernel)

- Note, if anyone has any pointers, please let me know.

-Jesse

-Original Message-
From: Marty Fouts [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 5:30 PM
To: [EMAIL PROTECTED]
Subject: RE: Scalability Efforts


FWIW, large system scalability, especially NUMA is not tractable with a 'one
size (algorithm) fits all' approach, and can be a significant test of the
degree of modularity in your system.  Different relative costs of access to
the different levels of the memory hierarchy and different models of cache
concurrency, especially, tend to make what works for system A be maximally
pessimal for system B.

marty

...



RE: Scalability Efforts

2000-09-07 Thread Jesse Noller



But would it be possible to apply a sort of "Linux Server Tuning Best
Practices" method to items not unlike NUMA, but more specific to say,
webserver and file serving?

(this is a project i am working on, finding kernel and apache tuning
guidelines for maximum File/Web serving speed w/ the 2.4 kernel)

- Note, if anyone has any pointers, please let me know.

-Jesse 

-Original Message-
From: Marty Fouts [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 5:30 PM
To: [EMAIL PROTECTED]
Subject: RE: Scalability Efforts


FWIW, large system scalability, especially NUMA is not tractable with a 'one
size (algorithm) fits all' approach, and can be a significant test of the
degree of modularity in your system.  Different relative costs of access to
the different levels of the memory hierarchy and different models of cache
concurrency, especially, tend to make what works for system A be maximally
pessimal for system B.

marty

...



RE: Scalability Efforts

2000-09-07 Thread Marty Fouts

FWIW, large system scalability, especially NUMA is not tractable with a 'one
size (algorithm) fits all' approach, and can be a significant test of the
degree of modularity in your system.  Different relative costs of access to
the different levels of the memory hierarchy and different models of cache
concurrency, especially, tend to make what works for system A be maximally
pessimal for system B.

marty

-Original Message-
From: Rik van Riel [mailto:[EMAIL PROTECTED]]
Sent: Thursday, September 07, 2000 11:20 AM
To: Henry Worth
Cc: [EMAIL PROTECTED]
Subject: Re: Scalability Efforts

On Thu, 7 Sep 2000, Henry Worth wrote:

> With all the talk of improving Linux's scalability to
> large-scale SMP and ccNUMA platforms -- including efforts
> at several HW companies and now OSDL forming to throw
> hardware at the effort -- is there any move afoot to
> coordinate these efforts?

Nothing coordinated, AFAIK...

> Or is it all, whatever there may be of it, taking
> place offline?

Most of the times I've talked about this topic it
was in person with other developers at various
conferences.

For the VM subsystem and the scheduler we have some
ideas to improve scalability for NUMA machines. It's
not been implemented yet, but for most of it the
design seems to be pretty ok and ready to be implemented
for 2.5.

OTOH, some of the devilish details still aren't resolved.
If there are people interested in discussing this topic,
I'll setup a mailing list for it ...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/




Re: Scalability Efforts

2000-09-07 Thread Marcelo Tosatti



On Thu, 7 Sep 2000, Henry Worth wrote:

> 
> With all the talk of improving Linux's scalability to
> large-scale SMP and ccNUMA platforms -- including efforts
> at several HW companies and now OSDL forming to throw
> hardware at the effort -- is there any move afoot to 
> coordinate these efforts? 

Some links which may be useful:

http://oss.sgi.com/projects/linux-scalability/

http://www.citi.umich.edu/projects/linux-scalability/




Re: Scalability Efforts

2000-09-07 Thread Peter Rival

Rik van Riel wrote:

> On Thu, 7 Sep 2000, Henry Worth wrote:
>
> 
> > Or is it all, whatever there may be of it, taking
> > place offline?
>
> Most of the times I've talked about this topic it
> was in person with other developers at various
> conferences.
>

Ugh, no wonder I never see this.  Guess it's time to get some trips paid for
so I can stand near you guys and listen to what's going on ;)

>
> For the VM subsystem and the scheduler we have some
> ideas to improve scalability for NUMA machines. It's
> not been implemented yet, but for most of it the
> design seems to be pretty ok and ready to be implemented
> for 2.5.
>

Is any of this documented in any way?  I'm the one that booted Linux on a 31
(one CPU was bad at the time) CPU 256GB GS Series Alphaserver, and NUMA has
always been a favorite play-thing of mine. :)

>
> OTOH, some of the devilish details still aren't resolved.
> If there are people interested in discussing this topic,
> I'll setup a mailing list for it ...
>

Sure, unless people really want it to stay here.  Heck, I'd even sign up
twice just to make sure I don't miss anything. ;)

 - Pete




Re: Scalability Efforts

2000-09-07 Thread Rik van Riel

On Thu, 7 Sep 2000, Henry Worth wrote:

> With all the talk of improving Linux's scalability to
> large-scale SMP and ccNUMA platforms -- including efforts
> at several HW companies and now OSDL forming to throw
> hardware at the effort -- is there any move afoot to 
> coordinate these efforts? 

Nothing coordinated, AFAIK...

> Or is it all, whatever there may be of it, taking 
> place offline?

Most of the times I've talked about this topic it
was in person with other developers at various
conferences.

For the VM subsystem and the scheduler we have some
ideas to improve scalability for NUMA machines. It's
not been implemented yet, but for most of it the
design seems to be pretty ok and ready to be implemented
for 2.5.

OTOH, some of the devilish details still aren't resolved.
If there are people interested in discussing this topic,
I'll setup a mailing list for it ...

regards,

Rik
--
"What you're running that piece of shit Gnome?!?!"
   -- Miguel de Icaza, UKUUG 2000

http://www.conectiva.com/   http://www.surriel.com/

