Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
Eric W. Biederman writes:

> If you are doing a real time task you don't want to be very close
> to your performance envelope. If you are hitting the performance
> envelope any small hiccup will cause you to miss your deadline,
> and close to your performance envelope hiccups are virtually certain.
>
> Pushing the machine just 5% slower should get everything going
> with multiple pages, and you wouldn't be pushing the performance
> envelope, so your machine can compensate for the occasional hiccup.
>
>> The data stream is fat and relentless.
>
> So you add another node if your current nodes can't handle the load
> without using giant physical areas of memory, rather than attempting
> to redesign the operating system. Much more cost effective.

Nodes can be wicked expensive. :-) Pushing the performance envelope is
important when you want to sell lots of systems.

Radar is a similar computational task, with the added need to reduce
space and weight requirements. It's not OK to be 5% more expensive,
bulky, and heavy. Also the Airplane Principle: more nodes means more
big failures.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
Please read the FAQ at http://www.tux.org/lkml/
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
Jes Sorensen writes:

> Albert D Cahalan <[EMAIL PROTECTED]> writes:
>
> [about using huge physical allocations for number crunching]
>
>> 2. Programming a DMA controller with multiple addresses isn't
>>    as fast as programming it with one.
>
> LOL
>
> Consider that allocating the larger block of memory is going
> to take a lot longer than it will take for the DMA engine to
> read the scatter/gather table entries and fetch a new address
> word now and then.

Say it takes a whole minute to allocate the memory. It wouldn't of
course, because you'd allocate memory at boot, but anyway... Then the
app runs, using that memory, for a multi-hour surgery. The allocation
happens once; the inter-node DMA transfers occur dozens or hundreds of
times per second.
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
> "Albert" == Albert D Cahalan <[EMAIL PROTECTED]> writes: >> bigmem is 'last resort' stuff. I'd much rather it is as now a >> seperate allocator so you actually have to sit and think and decide >> to give up on kmalloc/vmalloc/better algorithms and only use it >> when the hardware sucks Albert> It isn't just for sucky hardware. It is for performance too. Albert> 1. Linux isn't known for cache coloring ability. Even if it Albert> was, users want to take advantage of large pages or BAT Albert> registers to reduce TLB miss costs. (that is, mapping such Albert> areas into a process is needed... never mind security for now) Albert> 2. Programming a DMA controller with multiple addresses isn't Albert> as fast as programming it with one. LOL Consider that allocating the larger block of memory is going to take a lot longer than it will take for the DMA engine to read the scatter/gather table entries and fetch a new address word now and then. Jes - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
"Albert D. Cahalan" <[EMAIL PROTECTED]> writes: > > bigmem is 'last resort' stuff. I'd much rather it is as now a > > seperate allocator so you actually have to sit and think and > > decide to give up on kmalloc/vmalloc/better algorithms and > > only use it when the hardware sucks > > It isn't just for sucky hardware. It is for performance too. > 1. Linux isn't known for cache coloring ability. Most hardware doesn't need it. It might help a little but not much. >Even if it was, >users want to take advantage of large pages or BAT registers >to reduce TLB miss costs. (that is, mapping such areas into >a process is needed... never mind security for now) I think the minor cost incurred by uniform size is well made up for by reliable memory management, and avoidance of swapping, and needing less total ram. Besides the fact I don't see large physical areas of memory being more than a marginal performance gain. > 2. Programming a DMA controller with multiple addresses isn't >as fast as programming it with one. Garbage collecting is theoretically more efficient than explicit memory management too. But seriously I doubt that several pages have significantly more overhead than a giant burst, per transfer. > Consider what happens when you have the ability to make one > compute node DMA directly into the physical memory of another. > With a large block of physical memory, you only need to have > the destination node give the writer a single physical memory > address to send the data to. With loose pages, the destination > has to transmit a great big list. That might be 30 thousand! Hmm, queuing up enough data for a second at a time seems a little excessive. And with a 128M chunk... your system can't do good memory management at all. > The point of all this is to crunch data as fast as possible, > with Linux mostly getting out of the way. Perhaps you want > to generate real-time high-resolution video of a human heart > as it beats inside somebody. 
You process raw data (audio, X-ray, > magnetic resonance, or whatever) on one group of processors, > then hand off the data to another group of processors for the > rendering task. Actually there might be many stages. Playing > games with individual pages will cut into your performance. If you are doing a real time task you don't want to very close to your performance envelope. If you are hitting the performance envelope any small hiccup will cause you to miss your deadline, and close to your performance envelope hiccups are virtually certain. Pushing the machine just 5% slower should get everything going with multiple pages, and you wouldn't be pushing the performance envelope so your machine can compensate for the occasional hiccup. > The data stream is fat and relentless. So you add another node if your current nodes can't handle the load without using giant physical areas of memory. Attempt to redesign the operating system. Much more cost effective. Eric - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
> bigmem is 'last resort' stuff. I'd much rather it is as now a
> separate allocator so you actually have to sit and think and
> decide to give up on kmalloc/vmalloc/better algorithms and
> only use it when the hardware sucks

It isn't just for sucky hardware. It is for performance too.

1. Linux isn't known for cache coloring ability. Even if it was,
   users want to take advantage of large pages or BAT registers
   to reduce TLB miss costs. (that is, mapping such areas into
   a process is needed... never mind security for now)

2. Programming a DMA controller with multiple addresses isn't
   as fast as programming it with one.

Consider what happens when you have the ability to make one compute
node DMA directly into the physical memory of another. With a large
block of physical memory, you only need to have the destination node
give the writer a single physical memory address to send the data to.
With loose pages, the destination has to transmit a great big list.
That might be 30 thousand!

The point of all this is to crunch data as fast as possible, with
Linux mostly getting out of the way. Perhaps you want to generate
real-time high-resolution video of a human heart as it beats inside
somebody. You process raw data (audio, X-ray, magnetic resonance, or
whatever) on one group of processors, then hand off the data to
another group of processors for the rendering task. Actually there
might be many stages. Playing games with individual pages will cut
into your performance. The data stream is fat and relentless.
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 10:39:43PM +0100, Erik Mouw wrote:
> On Fri, Dec 22, 2000 at 02:54:50PM -0700, Jeff V. Merkey wrote:
> > Having a 1 Gigabyte per second fat pipe that runs over a parallel bus
> > fabric with a standard PCI card that costs @ $500 and can run LVS
> > and TUX at high speeds would be for the common good, particularly since
> > NT and W2K both have implementations of Dolphin SCI that allow them
> > to exploit this hardware.
>
> I'm just wondering how you are going to do 1 Gbyte per second when you
> still have to get the data through a PCI bus to that card. In theory,
> standard PCI can do 133 Mbyte/s, but only when you're very lucky to be
> able to burst large chunks of data. OK, 64 bit PCI at 66 MHz should
> quadruple the throughput, but that's still not enough for 1 Gbyte/s.

The fabric supports this data rate. PCI cards are limited to about
130 MB/s, but multiple nodes all running at the same time could
generate this much traffic.

Jeff

> Erik
>
> --
> J.A.K. (Erik) Mouw, Information and Communication Theory Group, Department
> of Electrical Engineering, Faculty of Information Technology and Systems,
> Delft University of Technology, PO BOX 5031, 2600 GA Delft, The Netherlands
> Phone: +31-15-2783635  Fax: +31-15-2781843  Email: [EMAIL PROTECTED]
> WWW: http://www-ict.its.tudelft.nl/~erik/
Re: NUMA and SCI [was Re: bigphysarea support in 2.2.19 and 2.4.0 kernels]
On Fri, Dec 22, 2000 at 11:37:29AM -0800, Tim Wright wrote:

I have been working with SCI since 1994. The people who own Dolphin
and the SCI chipsets also own TRG. We dropped work on the P6 ccNUMA
cards several years back because Intel was convinced that
shared-nothing was the way to go (and it is). However, SCI's ability
to create explicit sharing makes it the fastest shared-nothing
interface around for message passing (go figure).

I think we do need some better APIs. Grab the source at my FTP server,
and I'd love any input you could provide.

Thanks, :-)

Jeff

> Hi Jeff,
>
> On Fri, Dec 22, 2000 at 11:11:05AM -0700, Jeff V. Merkey wrote:
> [...]
> > SCI allows machines to create windows of shared memory across a cluster
> > of nodes, and at 1 Gigabyte-per-second (Gigabyte, not gigabit). I am
> > putting a sockets interface into the drivers so Apache, LVS, and
> > Piranha can use these very high speed adapters for a clustered web
> > server. Our M2FS clustered file system is also being architected
> > to use these cards.
>
> You're probably aware of this, but SCI allows a lot more than the creation
> of windows of shared memory. The IBM NUMA-Q machines (what was Sequent) use
> the SCI interconnect to build a single-system-image machine with all memory
> visible from all "nodes". In fact, all the commercial NUMA machines of which
> I am aware have this property (all nodes see and can address all memory). The
> non-uniform part of NUMA comes from the potentially differing latency and
> speed of different parts of memory (local vs remote in this case).
> AFAIK, the work that Kanoj Sarcar has been doing is to enable such machines.
>
> It sounds like you have a different requirement of very high-speed shared
> memory between different nodes that can be mapped and unmapped as required.
> Do I understand this correctly? That would make your requirements somewhat
> orthogonal to the requirements those of us with NUMA architectures have.
>
> > I will post the source code for the SCI cards at vger.timpanogas.org
> > and if you have time, please download this code and take a look at
> > how we are using the bigphysarea APIs to create these windows across
> > machines. The current NUMA support in Linux is somewhat slim, and
> > I would like to use established APIs to do this if possible.
>
> See above. It may be that you need different APIs anyway.
>
> Regards,
>
> Tim
>
> --
> Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
> IBM Linux Technology Center, Beaverton, Oregon
> "Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI
NUMA and SCI [was Re: bigphysarea support in 2.2.19 and 2.4.0 kernels]
Hi Jeff,

On Fri, Dec 22, 2000 at 11:11:05AM -0700, Jeff V. Merkey wrote:
[...]
> SCI allows machines to create windows of shared memory across a cluster
> of nodes, and at 1 Gigabyte-per-second (Gigabyte, not gigabit). I am
> putting a sockets interface into the drivers so Apache, LVS, and
> Piranha can use these very high speed adapters for a clustered web
> server. Our M2FS clustered file system is also being architected
> to use these cards.

You're probably aware of this, but SCI allows a lot more than the
creation of windows of shared memory. The IBM NUMA-Q machines (what
was Sequent) use the SCI interconnect to build a single-system-image
machine with all memory visible from all "nodes". In fact, all the
commercial NUMA machines of which I am aware have this property (all
nodes see and can address all memory). The non-uniform part of NUMA
comes from the potentially differing latency and speed of different
parts of memory (local vs remote in this case). AFAIK, the work that
Kanoj Sarcar has been doing is to enable such machines.

It sounds like you have a different requirement of very high-speed
shared memory between different nodes that can be mapped and unmapped
as required. Do I understand this correctly? That would make your
requirements somewhat orthogonal to the requirements those of us with
NUMA architectures have.

> I will post the source code for the SCI cards at vger.timpanogas.org
> and if you have time, please download this code and take a look at
> how we are using the bigphysarea APIs to create these windows across
> machines. The current NUMA support in Linux is somewhat slim, and
> I would like to use established APIs to do this if possible.

See above. It may be that you need different APIs anyway.

Regards,

Tim

--
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 08:21:37PM +0100, Andi Kleen wrote:
> On Fri, Dec 22, 2000 at 11:35:30AM -0700, Jeff V. Merkey wrote:
> > The real question is how to guarantee that these pages will be contiguous
> > in memory. The slab allocator may also work, but I think there are size
> > constraints on how much I can get in one pass.
>
> You cannot guarantee it after the system has left the bootup stage.
> That's the whole reason why bigphysarea exists.
>
> -Andi

I am wondering why the drivers need such a big contiguous chunk of
memory. For message passing operations, they should not. Some of the
user space libraries appear to need this support. I am going through
this code today attempting to determine if there's a way to reduce
this requirement or map the memory differently.

I am not using these cards for a ccNUMA implementation, although there
are versions of these adapters that can provide this capability, but
for message passing with small windows of coherence between machines,
with push/pull DMA-style behavior for high speed data transfers. 99.9%
of the clustering stuff on Linux uses this model, so this requirement
perhaps can be restructured to be a better fit for Linux.

Just having the patch in the kernel for bigphysarea support would
solve this issue if it could be structured into a form Alan finds
acceptable. Absent this, we need a workaround that's more tailored to
the requirements of Linux apps.

Jeff
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 11:35:30AM -0700, Jeff V. Merkey wrote:
> The real question is how to guarantee that these pages will be contiguous
> in memory. The slab allocator may also work, but I think there are size
> constraints on how much I can get in one pass.

You cannot guarantee it after the system has left the bootup stage.
That's the whole reason why bigphysarea exists.

-Andi
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 11:11:05AM -0700, Jeff V. Merkey wrote:
> On Fri, Dec 22, 2000 at 09:39:28AM +0100, Pauline Middelink wrote:
> > On Thu, 21 Dec 2000 around 15:53:39 -0700, Jeff V. Merkey wrote:

Pauline/Alan,

I have been studying the SCI code and I think I may have a workaround
that won't need the patch, but it will require pinning large chunks of
memory with the existing __get_free_pages() functions. I will need to
make the changes and test them. This change will require significant
testing. I will ping you guys if I have questions.

If we can reach a compromise on the bigphysarea patch, it would be
great, but absent this, I will be looking at this alternate solution.
The real question is how to guarantee that these pages will be
contiguous in memory. The slab allocator may also work, but I think
there are size constraints on how much I can get in one pass.

:-)

Jeff
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 09:39:28AM +0100, Pauline Middelink wrote:
> On Thu, 21 Dec 2000 around 15:53:39 -0700, Jeff V. Merkey wrote:
> >
> > Alan,
> >
> > I am looking over the 2.4 bigphysarea patch, and I think I agree
> > there needs to be a better approach. It's a messy hack -- I agree.
>
> Please explain further.
> Just leaving it at that is not nice. What is messy?
> The implementation? The API?
>
> If you have a better solution for allocating big chunks of
> physically contiguous memory at different stages during the
> runtime of the kernel, I would be very interested.
>
> (Alan: bootmem allocation just won't do. I need that memory
> in modules which get potentially loaded/unloaded, hence a
> wrapper interface for allowing access to a bootmem allocated
> piece of memory)
>
> And the API? That API was set a long time ago, luckily not by me :)
> Though I don't see the real problem. It allows allocation and
> freeing of chunks of memory. Period. That's all it's supposed to do.
> Or do you want it rolled into kmalloc? So GFP_DMA with size > 128K
> would take memory from this? That would mean a much more intrusive
> patch in very sensitive and rapidly changing parts of the kernel
> (2.2->2.4 speaking)...
>
> With kind regards,
> Pauline Middelink

Pauline,

Can we put together a patch that meets Alan's requirements and get it
into the kernel proper? We have taken on a project from Dolphin to
merge the high speed Dolphin SCI interconnect drivers into the kernel
proper, and obviously, it's not possible to do so if the drivers are
dependent on this patch. I can send you the driver sources for the SCI
cards, at least the portions that depend on this patch, and would
appreciate any guidance you could provide on a better way to allocate
memory.

SCI allows machines to create windows of shared memory across a
cluster of nodes, and at 1 Gigabyte-per-second (Gigabyte, not
gigabit). I am putting a sockets interface into the drivers so Apache,
LVS, and Piranha can use these very high speed adapters for a
clustered web server. Our M2FS clustered file system is also being
architected to use these cards.

I will post the source code for the SCI cards at vger.timpanogas.org
and if you have time, please download this code and take a look at how
we are using the bigphysarea APIs to create these windows across
machines. The current NUMA support in Linux is somewhat slim, and I
would like to use established APIs to do this if possible.

:-)

Jeff

> --
> GPG Key fingerprint = 2D5B 87A7 DDA6 0378 5DEA BD3B 9A50 B416 E2D0 C3C2
> For more details look at my website http://www.polyware.nl/~middelink
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
> (Alan: bootmem allocation just won't do. I need that memory
> in modules which get potentially loaded/unloaded, hence a
> wrapper interface for allowing access to a bootmem allocated
> piece of memory)

Yes, I pointed him at you for 2.4test because you had the code sitting
on top of bootmem, which is the right way to do it.

> Or do you want it rolled into kmalloc? So GFP_DMA with size > 128K
> would take memory from this? That would mean a much more intrusive
> patch in very sensitive and rapidly changing parts of the kernel
> (2.2->2.4 speaking)...

bigmem is 'last resort' stuff. I'd much rather it stay, as now, a
separate allocator so you actually have to sit and think and decide
to give up on kmalloc/vmalloc/better algorithms, and only use it
when the hardware sucks.
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Thu, 21 Dec 2000 around 15:53:39 -0700, Jeff V. Merkey wrote:
>
> Alan,
>
> I am looking over the 2.4 bigphysarea patch, and I think I agree
> there needs to be a better approach. It's a messy hack -- I agree.

Please explain further.
Just leaving it at that is not nice. What is messy?
The implementation? The API?

If you have a better solution for allocating big chunks of physically
contiguous memory at different stages during the runtime of the
kernel, I would be very interested.

(Alan: bootmem allocation just won't do. I need that memory in modules
which get potentially loaded/unloaded, hence a wrapper interface for
allowing access to a bootmem allocated piece of memory)

And the API? That API was set a long time ago, luckily not by me :)
Though I don't see the real problem. It allows allocation and freeing
of chunks of memory. Period. That's all it's supposed to do. Or do you
want it rolled into kmalloc? So GFP_DMA with size > 128K would take
memory from this? That would mean a much more intrusive patch in very
sensitive and rapidly changing parts of the kernel (2.2->2.4
speaking)...

With kind regards,
Pauline Middelink
--
GPG Key fingerprint = 2D5B 87A7 DDA6 0378 5DEA BD3B 9A50 B416 E2D0 C3C2
For more details look at my website http://www.polyware.nl/~middelink
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Thu, 21 Dec 2000 around 15:53:39 -0700, Jeff V. Merkey wrote: Alan, I am looking over the 2.4 bigphysarea patch, and I think I agree there needs to be a better approach. It's a messy hack -- I agree. Please explain further. Just leaving it at that is not nice. What is messy? The implementation? The API? If you have a better solutions for allocating big chunks of physical continious memory at different stages during the runtime of the kernel, i would be very interesseted. (Alan: bootmem allocation just won't do. I need that memory in modules which get potentially loaded/unloaded, hence a wrapper interface for allowing access to a bootmem allocated piece of memory) And the API? That API was set a long time ago, luckely not by me :) Though I dont see the real problem. It allows allocation and freeing of chunks of memory. Period. Its all its suppose to do. Or do you want it rolled in kmalloc? So GFP_DMA with size128K would take memory from this? That would mean a much more intrusive patch in very sensitive and rapidly changing parts of the kernel (2.2-2.4 speaking)... Met vriendelijke groet, Pauline Middelink -- GPG Key fingerprint = 2D5B 87A7 DDA6 0378 5DEA BD3B 9A50 B416 E2D0 C3C2 For more details look at my website http://www.polyware.nl/~middelink - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
(Alan: bootmem allocation just won't do. I need that memory in modules which get potentially loaded/unloaded, hence a wrapper interface for allowing access to a bootmem allocated piece of memory) Yes, I pointed him at you for 2.4test because you had the code sitting on top of bootmem which is the right way to do it. Or do you want it rolled in kmalloc? So GFP_DMA with size128K would take memory from this? That would mean a much more intrusive patch in very sensitive and rapidly changing parts of the kernel (2.2-2.4 speaking)... bigmem is 'last resort' stuff. I'd much rather it is as now a seperate allocator so you actually have to sit and think and decide to give up on kmalloc/vmalloc/better algorithms and only use it when the hardware sucks - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 09:39:28AM +0100, Pauline Middelink wrote: On Thu, 21 Dec 2000 around 15:53:39 -0700, Jeff V. Merkey wrote: Alan, I am looking over the 2.4 bigphysarea patch, and I think I agree there needs to be a better approach. It's a messy hack -- I agree. Please explain further. Just leaving it at that is not nice. What is messy? The implementation? The API? If you have a better solutions for allocating big chunks of physical continious memory at different stages during the runtime of the kernel, i would be very interesseted. (Alan: bootmem allocation just won't do. I need that memory in modules which get potentially loaded/unloaded, hence a wrapper interface for allowing access to a bootmem allocated piece of memory) And the API? That API was set a long time ago, luckely not by me :) Though I dont see the real problem. It allows allocation and freeing of chunks of memory. Period. Its all its suppose to do. Or do you want it rolled in kmalloc? So GFP_DMA with size128K would take memory from this? That would mean a much more intrusive patch in very sensitive and rapidly changing parts of the kernel (2.2-2.4 speaking)... Met vriendelijke groet, Pauline Middelink Pauline, Can we put together a patch that meets Alan's requirements and get it into the kernel proper. We have taken on a project from Dolphin to merge the high speed Dolphin SCI interconnect drivers into the kernel proper, and obviously, it's not possible to do so if the drivers are dependent on this patch. I can send you the driver sources for the SCI cards, at least the portions that depend on this patch, and would appreciate any guidance you could provide on a better way to allocate memory. SCI allows machines to create windows of shared memory across a cluster of nodes, and at 1 Gigabyte-per-second (Gigabyte not gigabit). I am putting a sockets interface into the drivers so Apache, LVS, and Pirahna can use these very high speed adapters for a clustered web server. 
Our M2FS clustered file system also is being architected to use these cards. I will post the source code for the SCI cards at vger.timpanogas.org and if you have time, please download this code and take a look at how we are using the bigphysarea APIs to create these windows accros machines. The current NUMA support in Linux is somewhat slim, and I would like to use established APIs to do this if possible. :-) Jeff -- GPG Key fingerprint = 2D5B 87A7 DDA6 0378 5DEA BD3B 9A50 B416 E2D0 C3C2 For more details look at my website http://www.polyware.nl/~middelink - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] Please read the FAQ at http://www.tux.org/lkml/
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 11:11:05AM -0700, Jeff V. Merkey wrote:
> On Fri, Dec 22, 2000 at 09:39:28AM +0100, Pauline Middelink wrote:
> > On Thu, 21 Dec 2000 around 15:53:39 -0700, Jeff V. Merkey wrote:

Pauline/Alan,

I have been studying the SCI code and I think I may have a workaround
that won't need the patch, but it will require pinning large chunks of
memory with the existing __get_free_pages() functions. I will need to
make the changes and test them. This change will require significant
testing. I will ping you guys if I have questions. If we can reach a
compromise on the bigphysarea patch, it would be great, but absent this,
I will be looking at this alternate solution. The real question is how
to guarantee that these pages will be contiguous in memory. The slab
allocator may also work, but I think there are size constraints on how
much I can get in one pass.

:-)

Jeff

> --
> GPG Key fingerprint = 2D5B 87A7 DDA6 0378 5DEA BD3B 9A50 B416 E2D0 C3C2
> For more details look at my website http://www.polyware.nl/~middelink
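A note on the __get_free_pages() route: it takes a buddy-allocator order,
not a byte count, so a large request has to be rounded up to a power-of-two
number of pages, and the allocator caps the order it will attempt (around
2 MB worth of 4 KB pages in kernels of this era, if memory serves). A sketch
of the rounding, mirroring what the kernel's get_order() helper computes
(user-space model, 4 KB pages assumed):

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* Round a byte count up to a buddy-allocator order: the smallest n
 * such that (PAGE_SIZE << n) >= size.  This is what has to be passed
 * to __get_free_pages(), and it is why a 5 MB pinned buffer cannot be
 * had in one call when the order is capped. */
unsigned int order_for(unsigned long size)
{
    unsigned int order = 0;
    while ((PAGE_SIZE << order) < size)
        order++;
    return order;
}
```

So a 128 KB request becomes an order-5 (32-page) allocation, and anything
between power-of-two sizes wastes up to nearly half the block.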
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 11:35:30AM -0700, Jeff V. Merkey wrote:
> The real question is how to guarantee that these pages will be
> contiguous in memory. The slab allocator may also work, but I think
> there are size constraints on how much I can get in one pass.

You cannot guarantee it after the system has left the bootup stage.
That's the whole reason why bigphysarea exists.

-Andi
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 08:21:37PM +0100, Andi Kleen wrote:
> On Fri, Dec 22, 2000 at 11:35:30AM -0700, Jeff V. Merkey wrote:
> > The real question is how to guarantee that these pages will be
> > contiguous in memory. The slab allocator may also work, but I think
> > there are size constraints on how much I can get in one pass.
>
> You cannot guarantee it after the system has left the bootup stage.
> That's the whole reason why bigphysarea exists.
>
> -Andi
>
> I am wondering why the drivers need such a big contiguous chunk of
> memory. For message passing operations, they should not.

Some of the user space libraries appear to need this support. I am going
through this code today attempting to determine if there's a way to
reduce this requirement or map the memory differently. I am not using
these cards for a ccNUMA implementation (although there are versions of
these adapters that can provide this capability), but for message
passing with small windows of coherence between machines, with push/pull
DMA-style behavior for high speed data transfers. 99.9% of the
clustering stuff on Linux uses this model, so this requirement can
perhaps be restructured to be a better fit for Linux. Just having the
patch in the kernel for bigphysarea support would solve this issue if it
could be structured into a form Alan finds acceptable. Absent this, we
need a workaround that's more tailored to the requirements of Linux
apps.

Jeff
NUMA and SCI [was Re: bigphysarea support in 2.2.19 and 2.4.0 kernels]
Hi Jeff,

On Fri, Dec 22, 2000 at 11:11:05AM -0700, Jeff V. Merkey wrote:
[...]
> SCI allows machines to create windows of shared memory across a
> cluster of nodes at 1 gigabyte per second (gigabyte, not gigabit). I
> am putting a sockets interface into the drivers so Apache, LVS, and
> Piranha can use these very high speed adapters for a clustered web
> server. Our M2FS clustered file system is also being architected to
> use these cards.

You're probably aware of this, but SCI allows a lot more than the
creation of windows of shared memory. The IBM NUMA-Q machines (what was
Sequent) use the SCI interconnect to build a single-system-image machine
with all memory visible from all "nodes". In fact, all the commercial
NUMA machines of which I am aware have this property (all nodes see and
can address all memory). The non-uniform part of NUMA comes from the
potentially differing latency and speed of different parts of memory
(local vs. remote in this case). AFAIK, the work that Kanoj Sarcar has
been doing is to enable such machines.

It sounds like you have a different requirement of very high-speed
shared memory between different nodes that can be mapped and unmapped
as required. Do I understand this correctly? That would make your
requirements somewhat orthogonal to the requirements those of us with
NUMA architectures have.

> I will post the source code for the SCI cards at vger.timpanogas.org
> and if you have time, please download this code and take a look at how
> we are using the bigphysarea APIs to create these windows across
> machines. The current NUMA support in Linux is somewhat slim, and I
> would like to use established APIs to do this if possible.

See above. It may be that you need different APIs anyway.
Regards,

Tim

--
Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
IBM Linux Technology Center, Beaverton, Oregon
"Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI
Re: NUMA and SCI [was Re: bigphysarea support in 2.2.19 and 2.4.0 kernels]
On Fri, Dec 22, 2000 at 11:37:29AM -0800, Tim Wright wrote:

I have been working with SCI since 1994. The people who own Dolphin and
the SCI chipsets also own TRG. We dropped work on the P6 ccNUMA cards
several years back because Intel was convinced that shared-nothing was
the way to go (and it is). However, SCI's ability to create explicit
sharing makes it the fastest shared-nothing interface around for message
passing (go figure). I think we do need some better APIs. Grab the
source at my FTP server, and I'd love any input you could provide.

Thanks,

:-)

Jeff

> Hi Jeff,
>
> On Fri, Dec 22, 2000 at 11:11:05AM -0700, Jeff V. Merkey wrote:
> [...]
> > SCI allows machines to create windows of shared memory across a
> > cluster of nodes at 1 gigabyte per second (gigabyte, not gigabit). I
> > am putting a sockets interface into the drivers so Apache, LVS, and
> > Piranha can use these very high speed adapters for a clustered web
> > server. Our M2FS clustered file system is also being architected to
> > use these cards.
>
> You're probably aware of this, but SCI allows a lot more than the
> creation of windows of shared memory. The IBM NUMA-Q machines (what
> was Sequent) use the SCI interconnect to build a single-system-image
> machine with all memory visible from all "nodes". In fact, all the
> commercial NUMA machines of which I am aware have this property (all
> nodes see and can address all memory). The non-uniform part of NUMA
> comes from the potentially differing latency and speed of different
> parts of memory (local vs. remote in this case). AFAIK, the work that
> Kanoj Sarcar has been doing is to enable such machines.
>
> It sounds like you have a different requirement of very high-speed
> shared memory between different nodes that can be mapped and unmapped
> as required. Do I understand this correctly? That would make your
> requirements somewhat orthogonal to the requirements those of us with
> NUMA architectures have.
> > I will post the source code for the SCI cards at vger.timpanogas.org
> > and if you have time, please download this code and take a look at
> > how we are using the bigphysarea APIs to create these windows across
> > machines. The current NUMA support in Linux is somewhat slim, and I
> > would like to use established APIs to do this if possible.
>
> See above. It may be that you need different APIs anyway.
>
> Regards,
>
> Tim
>
> --
> Tim Wright - [EMAIL PROTECTED] or [EMAIL PROTECTED] or [EMAIL PROTECTED]
> IBM Linux Technology Center, Beaverton, Oregon
> "Nobody ever said I was charming, they said "Rimmer, you're a git!"" RD VI
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Fri, Dec 22, 2000 at 10:39:43PM +0100, Erik Mouw wrote:
> On Fri, Dec 22, 2000 at 02:54:50PM -0700, Jeff V. Merkey wrote:
> > Having a 1 gigabyte per second fat pipe that runs over a parallel bus
> > fabric with a standard PCI card that costs about $500 and can run LVS
> > and TUX at high speeds would be for the common good, particularly
> > since NT and W2K both have implementations of Dolphin SCI that allow
> > them to exploit this hardware.
>
> I'm just wondering how you are going to do 1 Gbyte per second when you
> still have to get the data through a PCI bus to that card. In theory,
> standard PCI can do 133 Mbyte/s, but only when you're very lucky to be
> able to burst large chunks of data. OK, 64 bit PCI at 66 MHz should
> quadruple the throughput, but that's still not enough for 1 Gbyte/s.

The fabric supports this data rate. PCI cards are limited to about
130 MB/s, but multiple nodes all running at the same time could generate
this much traffic.

Jeff

> Erik
>
> --
> J.A.K. (Erik) Mouw, Information and Communication Theory Group,
> Department of Electrical Engineering, Faculty of Information Technology
> and Systems, Delft University of Technology, PO BOX 5031, 2600 GA
> Delft, The Netherlands
> Phone: +31-15-2783635  Fax: +31-15-2781843
> Email: [EMAIL PROTECTED]  WWW: http://www-ict.its.tudelft.nl/~erik/
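Erik's figures follow directly from bus width times clock rate. A small
worked sketch (using the nominal 33 MHz clock, which gives 132 rather than
the usual 133 Mbyte/s quoted for the actual 33.33 MHz bus; theoretical
peaks only, before protocol overhead):

```c
/* Theoretical peak PCI throughput in MB/s: bus width in bytes times
 * clock in MHz.  Real transfers see considerably less, which is why
 * ~130 MB/s is the practical ceiling quoted for 32-bit/33 MHz PCI. */
unsigned long pci_peak_mbs(unsigned int bytes_wide, unsigned int mhz)
{
    return (unsigned long)bytes_wide * mhz;
}
```

Even the 64-bit/66 MHz case (8 bytes x 66 MHz = 528 MB/s peak) falls well
short of 1 GB/s per card, which is why the 1 GB/s number can only describe
the aggregate fabric, not any single PCI endpoint.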
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
> bigmem is 'last resort' stuff. I'd much rather it is as now a separate
> allocator so you actually have to sit and think and decide to give up
> on kmalloc/vmalloc/better algorithms and only use it when the hardware
> sucks

It isn't just for sucky hardware. It is for performance too.

1. Linux isn't known for cache coloring ability. Even if it were, users
want to take advantage of large pages or BAT registers to reduce TLB
miss costs. (That is, mapping such areas into a process is needed...
never mind security for now.)

2. Programming a DMA controller with multiple addresses isn't as fast as
programming it with one. Consider what happens when you have the ability
to make one compute node DMA directly into the physical memory of
another. With a large block of physical memory, you only need to have
the destination node give the writer a single physical memory address to
send the data to. With loose pages, the destination has to transmit a
great big list. That might be 30 thousand entries!

The point of all this is to crunch data as fast as possible, with Linux
mostly getting out of the way. Perhaps you want to generate real-time
high-resolution video of a human heart as it beats inside somebody. You
process raw data (audio, X-ray, magnetic resonance, or whatever) on one
group of processors, then hand off the data to another group of
processors for the rendering task. Actually there might be many stages.
Playing games with individual pages will cut into your performance. The
data stream is fat and relentless.
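The "30 thousand" figure is just the buffer size divided by the page size:
with 4 KB pages, a buffer in the 120 MB range needs on the order of 30,000
scatter/gather entries, versus a single base address for a physically
contiguous block. A sketch of the arithmetic (the 120 MB buffer size is an
illustrative assumption, as is the 4 KB page size):

```c
#define PAGE_SZ 4096UL

/* Number of scatter/gather entries needed to describe a buffer built
 * from loose pages: one entry per page, rounded up.  A physically
 * contiguous block needs exactly one (base address + length). */
unsigned long sg_entries(unsigned long buffer_bytes)
{
    return (buffer_bytes + PAGE_SZ - 1) / PAGE_SZ;
}
```

The receiving node also has to build, transmit, and keep coherent that
whole table before any transfer starts, which is where the per-page cost
bites in a pipeline that restarts transfers many times per second.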
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
Alan,

I am looking over the 2.4 bigphysarea patch, and I think I agree there
needs to be a better approach. It's a messy hack -- I agree.

:-)

Jeff
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
On Thu, Dec 21, 2000 at 09:32:46PM +0000, Alan Cox wrote:
> > A question related to bigphysarea support in the native Linux
> > 2.2.19 and 2.4.0 kernels.
> >
> > I know there are patches for this support, but is it planned for
> > rolling into the kernel by default to support Dolphin SCI and
> > some of the NUMA Clustering adapters? I see it there for some
> > of the video adapters.
>
> bigphysarea is the wrong model for 2.4. The bootmem allocator means
> that drivers could do early claims via the bootmem interface during
> boot up. That would avoid all the cruft.
>
> For 2.2 bigphysarea is a hack, but a necessary add-on patch and not
> one you can redo cleanly as we don't have bootmem.
>
> I believe Pauline Middelink had a patch implementing bigphysarea in
> terms of bootmem.
>
> Alan

Alan,

Thanks for the prompt response. I am merging the Dolphin SCI high speed
interconnect drivers into 2.2.18 and 2.4.0 for our M2FS project, and I
am reviewing the big ugly nasty patch they have, current as of 2.2.13
(really old). I will be looking over the 2.4 tree for a cleaner way to
do what they want. What's in the patch alters the /proc filesystem and
the VM code. I will submit a patch against 2.2.19 and 2.4.0 for this
support for their SCI adapters after I get a handle on it.

:-)

Jeff
Re: bigphysarea support in 2.2.19 and 2.4.0 kernels
> A question related to bigphysarea support in the native Linux
> 2.2.19 and 2.4.0 kernels.
>
> I know there are patches for this support, but is it planned for
> rolling into the kernel by default to support Dolphin SCI and
> some of the NUMA Clustering adapters? I see it there for some
> of the video adapters.

bigphysarea is the wrong model for 2.4. The bootmem allocator means that
drivers could do early claims via the bootmem interface during boot up.
That would avoid all the cruft.

For 2.2 bigphysarea is a hack, but a necessary add-on patch and not one
you can redo cleanly as we don't have bootmem.

I believe Pauline Middelink had a patch implementing bigphysarea in
terms of bootmem.

Alan
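The bootmem approach Alan describes claims memory once, early in boot,
before the buddy allocator takes over, so contiguity is trivially
guaranteed (in the 2.4 kernel this is alloc_bootmem() and friends). A
user-space model of the "reserve early, then refuse late claims" idea,
with all names and sizes hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

/* Model of a boot-time bump allocator: while "boot" lasts, every claim
 * is trivially contiguous because nothing has fragmented the pool yet.
 * After boot_end(), no further claims are allowed -- which is exactly
 * the limitation raised elsewhere in this thread for modules that load
 * and unload after boot. */
#define BOOT_POOL (1024 * 1024)

static uint8_t pool[BOOT_POOL];
static size_t  cursor;
static int     boot_done;

void *boot_alloc(size_t size)
{
    if (boot_done || cursor + size > BOOT_POOL)
        return NULL;              /* too late, or pool exhausted */
    void *p = &pool[cursor];
    cursor += size;               /* bump; no free list, no fragmentation */
    return p;
}

void boot_end(void) { boot_done = 1; }
```

This is why bigphysarea-style code survives as a wrapper: the claim still
happens at boot, and the wrapper only parcels out the claimed region later.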
bigphysarea support in 2.2.19 and 2.4.0 kernels
A question related to bigphysarea support in the native Linux 2.2.19 and
2.4.0 kernels.

I know there are patches for this support, but is it planned for rolling
into the kernel by default to support Dolphin SCI and some of the NUMA
Clustering adapters? I see it there for some of the video adapters.

Is this planned for the kernel proper, or will it remain a patch? At the
rate the VM and mm subsystems tend to get updated, I am wondering if
there's a current version out for this.

Jeff