Andi Kleen wrote:
On Tuesday 20 November 2007 04:50, Christoph Lameter wrote:
On Tue, 20 Nov 2007, Andi Kleen wrote:
You could in theory move the modules, but then you would need to implement
a full PIC dynamic linker for them first and also increase runtime overhead
for them because they
On Wed, 21 Nov 2007, Andi Kleen wrote:
> The whole mapping for all CPUs cannot fit into 2GB of course, but the
> reference
> linker managed range can.
Ok so you favor the solution where we subtract smp_processor_id() <<
shift?
> > The offset relative to %gs cannot be used if you have a loop a
> > All you need is a 2MB area (16MB is too large if you really
> > want 16k CPUs someday) somewhere in the -2GB or probably better
> > in +2GB. Then the linker puts stuff in there and you use
> > the offsets for referencing relative to %gs.
>
> 2MB * 16k = 32GB. Even with 4k cpus we will have 2MB * 4k = 8GB.
On Wed, 21 Nov 2007, Andi Kleen wrote:
> On Wednesday 21 November 2007 02:16:11 Christoph Lameter wrote:
> > But one can subtract too...
>
> The linker cannot subtract (unless you add a new relocation type)
The compiler knows and emits assembly to compensate.
On Wednesday 21 November 2007 02:16:11 Christoph Lameter wrote:
> But one can subtract too...
The linker cannot subtract (unless you add a new relocation type)
> Hmmm... So the cpu area 0 could be put at
> the beginning of the 2GB kernel area and then grow downwards from
> 0x8000
But one can subtract too... Hmmm... So the cpu area 0 could be put at
the beginning of the 2GB kernel area and then grow downwards from
0x8000. The cost in terms of code is one subtract
instruction for each per_cpu() or CPU_PTR()
The next thing downward from 0x8000 is the vm
On Tue, 20 Nov 2007, Christoph Lameter wrote:
> 32bit sign extension for what? Absolute data references? The addressing
> that I have seen was IP relative. Thus I thought that the kernel could be
> moved lower.
Argh. This all depends on a special gcc option to compile the
kernel and that
On Tue, 20 Nov 2007, H. Peter Anvin wrote:
> But you wouldn't actually *use* this address space. It's just for the linker
> to know what address to tag the references with; it gets relocated by gs_base
> down into proper kernel space. The linker can stash the initialized reference
> copy at any
On Tue, 20 Nov 2007, Andi Kleen wrote:
>
> >
> > Right so I could move the kernel to
> >
> > #define __PAGE_OFFSET _AC(0x8100, UL)
> > #define __START_KERNEL_map _AC(0xfff8, UL)
>
> That is -31GB unless I'm miscounting. But it needs to be >= -2GB
> (31bits)
The __S
>
> Right so I could move the kernel to
>
> #define __PAGE_OFFSET _AC(0x8100, UL)
> #define __START_KERNEL_map _AC(0xfff8, UL)
That is -31GB unless I'm miscounting. But it needs to be >= -2GB
(31bits)
Right now it is at -2GB + 2MB, because it is loaded at physical
On Tue, 20 Nov 2007, Andi Kleen wrote:
>
> > This limitation shouldn't apply to the percpu area, since gs_base can be
> > pointed anywhere in the address space -- in effect we're always indirect.
>
> The initial reference copy of the percpu area has to be addressed by
> the linker.
Right that
> This limitation shouldn't apply to the percpu area, since gs_base can be
> pointed anywhere in the address space -- in effect we're always indirect.
The initial reference copy of the percpu area has to be addressed by
the linker.
Hmm, in theory since it is not actually used by itself I suppose
On Tue, 20 Nov 2007, Andi Kleen wrote:
> > So I think we have a 2GB area right?
>
> For everything that needs the -31bit offsets; that is everything linked
Of course.
> > 1GB kernel
> > 1GB - 1x per cpu area (128M?) modules?
> > cpu area 0
> > 2GB limit
> > cpu area 1
> > cpu area 2
> > ..
On Tuesday 20 November 2007 04:50, Christoph Lameter wrote:
> On Tue, 20 Nov 2007, Andi Kleen wrote:
> > I might be pointing out the obvious, but on x86-64 there is definitely
> > not 256TB of VM available for this.
>
> Well maybe in the future.
That would either require more than 4 levels or larger pages
> > Yeah yea but the latencies are minimal making the NUMA logic too
> > expensive for most loads ... If you put a NUMA kernel onto those then
> > performance drops (I think someone measured 15-30%?)
>
> Small socket count systems are going to increasingly be NUMA in future.
> If CONFIG_NUMA hurts
On Tue, 20 Nov 2007, Andi Kleen wrote:
> I might be pointing out the obvious, but on x86-64 there is definitely not
> 256TB of VM available for this.
Well maybe in the future.
One of the issues that I ran into is that I had to place the cpu area
in between to make the offsets link right.
Howev
On Tuesday 20 November 2007 13:02, Christoph Lameter wrote:
> On Mon, 19 Nov 2007, H. Peter Anvin wrote:
> > You're making the assumption here that NUMA = large number of CPUs. This
> > assumption is flat-out wrong.
>
> Well maybe. Usually one gets to NUMA because the hardware gets too big to
> be
> 4k cpu configurations with 1k nodes:
>
> 4096 * 1k * 16MB = 64TB of virtual space.
>
> Maximum theoretical configuration 16384 processors 1k nodes:
>
> 16384 * 1k * 16MB = 256TB of virtual space.
>
> Both fit within the established limits.
I might be pointing out the obvious, but
On Mon, 19 Nov 2007, H. Peter Anvin wrote:
> You're making the assumption here that NUMA = large number of CPUs. This
> assumption is flat-out wrong.
Well maybe. Usually one gets to NUMA because the hardware gets too big to
be handled the UMA way.
> On x86-64, most two-socket systems are still
Christoph Lameter wrote:
For the UP and SMP case map the area using 4k ptes. Typical use of per cpu
data is around 16k for UP and SMP configurations. It goes up to 45k when the
per cpu area is managed by cpu_alloc (see special x86_64 patchset).
Allocating in 2M segments would be overkill.
For N
64 bit:
Set up a cpu area that allows the use of up to 16MB for each processor.
Cpu memory use can grow a bit. F.e. if we assume that a pageset
occupies 64 bytes of memory and we have 3 zones in each of 1024 nodes
then we need 3 * 1k * 16k = 50 million pagesets or 3072 pagesets per
processor. This r