Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Eric W. Biederman
"Joerg Roedel" <[EMAIL PROTECTED]> writes:

> On Tue, Feb 06, 2007 at 12:08:12PM -0700, [EMAIL PROTECTED] wrote:
>> "Andreas Herrmann" <[EMAIL PROTECTED]> writes:
>> > You are referring to current Linux implementation?
>> > The AMD64 architecture increased physical address size in PSE mode to
>> > 40 bits. So at least it would be possible to use more than 32 bits.
>> 
>> How do you get 40 physical bits in a 32bit page table entry? My memory
>> is that the low bits in the page table entry were well defined and
>> accounted for. I'm pretty certain I can account for 6 of the low bits
>> off the top of my head.  PSE is the page size extension allowing
>> 2MB/4MB pages.
>
> The access to 40 physical address bits is only possible using large pages
> (4MB on 32bit without PAE). In those page table entries you only use
> bits 22:31 for encoding the low part of the physical address. The bits
> 12:21 are not needed for a 4MB-aligned address. Bits 13:20 of them are
> reused to encode bits 32:39 of the 40 bit physical address.

Yep.  I missed that feature, and I do see it in AMD documentation now
that I look.

I'm not certain what that would be useful for though.

I'm pretty certain Linux doesn't use this feature; we just enable PAE mode.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Joerg Roedel
On Tue, Feb 06, 2007 at 12:08:12PM -0700, [EMAIL PROTECTED] wrote:
> "Andreas Herrmann" <[EMAIL PROTECTED]> writes:
> > You are referring to current Linux implementation?
> > The AMD64 architecture increased physical address size in PSE mode to
> > 40 bits. So at least it would be possible to use more than 32 bits.
> 
> How do you get 40 physical bits in a 32bit page table entry? My memory
> is that the low bits in the page table entry were well defined and
> accounted for. I'm pretty certain I can account for 6 of the low bits
> off the top of my head.  PSE is the page size extension allowing
> 2MB/4MB pages.

The access to 40 physical address bits is only possible using large pages
(4MB on 32bit without PAE). In those page table entries you only use
bits 22:31 for encoding the low part of the physical address. The bits
12:21 are not needed for a 4MB-aligned address. Bits 13:20 of them are
reused to encode bits 32:39 of the 40 bit physical address.
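
A minimal sketch of that encoding, assuming the legacy non-PAE 4MB PDE
layout from the AMD64 manual (PDE bits 22:31 carry address bits 22:31,
PDE bits 13:20 carry address bits 32:39). The helper name
pse_pde_to_phys() is hypothetical and is not taken from the kernel:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): rebuild the 40-bit
 * physical base of a 4MB page from a 32-bit non-PAE PDE with PS=1.
 */
static uint64_t pse_pde_to_phys(uint32_t pde)
{
    uint64_t low  = pde & 0xffc00000u;    /* PDE bits 31:22 -> PA bits 31:22 */
    uint64_t high = (pde >> 13) & 0xffu;  /* PDE bits 20:13 -> PA bits 39:32 */

    return (high << 32) | low;
}
```

For example, a PDE of 0x00d56083 (flags in the low byte, 0xab in bits
13:20, 0x00c00000 in the upper field) decodes to 0xab00c00000.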

Joerg

-- 
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG




Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Eric W. Biederman
"Andreas Herrmann" <[EMAIL PROTECTED]> writes:

> On Tue, Feb 06, 2007 at 10:54:23AM -0700, [EMAIL PROTECTED] wrote:
>> "Andreas Herrmann" <[EMAIL PROTECTED]> writes:
>> > On Mon, Feb 05, 2007 at 05:26:12PM -0700, [EMAIL PROTECTED] wrote:
>> >> "Andreas Herrmann" <[EMAIL PROTECTED]> writes:
>> >> >
>> >> The limit is per cpu not per architecture.  So if you run a
>> >> cpu that can run in 64bit mode in 32bit mode the limit
>> >> is not 36 bits.  Even PAE in 32bit mode doesn't have that limit.
>> >> 
>> > Good point.
>> >
>> > I totally ignored that on 64 bit cpus in legacy mode
>> > - PAE-paging means up to 52 physical address bits respectively
>> > "physical address size of the underlying implementation"
>> > - for non-PAE-paging with PSE enabled we have 40 bits for AMD and
>> > with PSE36 36 bits for Intel
>> 
>> For non PAE-paging you have 32bits.
>
> You are referring to current Linux implementation?
> The AMD64 architecture increased physical address size in PSE mode to
> 40 bits. So at least it would be possible to use more than 32 bits.

How do you get 40 physical bits in a 32bit page table entry? My memory
is that the low bits in the page table entry were well defined and
accounted for. I'm pretty certain I can account for 6 of the low bits
off the top of my head.  PSE is the page size extension allowing 2MB/4MB
pages.

PAE (physical address extension) gives you a 64bit page table entry,
which is where, as I recall, you have a place for all of those extra
physical bits.  The limit is 52 bits; current cpus talk about supporting
40 bits, with AMD in the process of going to 48 bits.

Is there a feature I have overlooked that would allow 40 bits with PSE?
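
For contrast, a minimal sketch of how a 64bit PAE entry leaves room for
the wider address, assuming the architectural layout where bits 12:51
hold the frame address and bit 63 is NX. Both helper names are
hypothetical:

```c
#include <stdint.h>

/* Architectural frame field of a PAE/long-mode entry: bits 51:12. */
#define PAE_ADDR_MASK 0x000ffffffffff000ULL

/* Hypothetical helper: physical frame address held in a PAE entry. */
static uint64_t pae_pte_to_phys(uint64_t pte)
{
    return pte & PAE_ADDR_MASK;
}

/*
 * Hypothetical helper: highest byte address a cpu with phys_bits
 * implemented address bits can generate (phys_bits assumed 1..63).
 */
static uint64_t phys_addr_limit(unsigned int phys_bits)
{
    return (1ULL << phys_bits) - 1;
}
```

With 40 implemented bits the limit is 0xffffffffff (1TB - 1), and with
the architectural maximum of 52 bits it is 4PB - 1.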

>> 
>> Yes.  So base needs to become a u64.
>
> I was afraid you'd say that.
>
>> So base = ((base_hi << 32) | base_lo) >> PAGE_SHIFT.
>> 
>> I see where the 44bit limit comes in.  Do you actually have boxes
>> with > 16TB?
>
> No, I don't have access to such a box. Would be nice though.
>
>> 
>> Regardless it looks like base and possibly size need to become
>> a u64.  At which time the extra >> PAGE_SHIFT could be meaningless.
>> Either that, or base and size need to be sized in something like
>> megabytes.
>> 
>> I suspect making it a u64 sized in bytes will get the job done and
>> result in simpler code.
>
> Right you are!

> Ok, it is best to do (3).
> I will come up with another patch asap.

Thanks.  Sorry for being a pain.

Eric


Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Andreas Herrmann
On Tue, Feb 06, 2007 at 10:54:23AM -0700, [EMAIL PROTECTED] wrote:
> "Andreas Herrmann" <[EMAIL PROTECTED]> writes:
> > On Mon, Feb 05, 2007 at 05:26:12PM -0700, [EMAIL PROTECTED] wrote:
> >> "Andreas Herrmann" <[EMAIL PROTECTED]> writes:
> >> >
> >> The limit is per cpu not per architecture.  So if you run a
> >> cpu that can run in 64bit mode in 32bit mode the limit
> >> is not 36 bits.  Even PAE in 32bit mode doesn't have that limit.
> >> 
> > Good point.
> >
> > I totally ignored that on 64 bit cpus in legacy mode
> > - PAE-paging means up to 52 physical address bits respectively
> > "physical address size of the underlying implementation"
> > - for non-PAE-paging with PSE enabled we have 40 bits for AMD and
> > with PSE36 36 bits for Intel
> 
> For non PAE-paging you have 32bits.

You are referring to current Linux implementation?
The AMD64 architecture increased physical address size in PSE mode to
40 bits. So at least it would be possible to use more than 32 bits.

> >> > diff --git a/arch/i386/kernel/cpu/mtrr/generic.c b/arch/i386/kernel/cpu/mtrr/generic.c
> >> > index f77fc53..aa21d15 100644
> >> > --- a/arch/i386/kernel/cpu/mtrr/generic.c
> >> > +++ b/arch/i386/kernel/cpu/mtrr/generic.c
> >> > @@ -172,7 +172,7 @@ int generic_get_free_region(unsigned long base, unsigned long size, int replace_
> >> >  static void generic_get_mtrr(unsigned int reg, unsigned long *base,
> >> >   unsigned long *size, mtrr_type *type)
> >> >  {
> >> > -unsigned int mask_lo, mask_hi, base_lo, base_hi;
> >> > +unsigned long mask_lo, mask_hi, base_lo, base_hi;
> >> 
> >> Why?  Given the low and the high I am assuming these are all implicitly
> >> 32bit quantities.  unsigned int is fine.
> >
> > It is not, please refer to the function body, e.g.
> >
> > *base = base_hi << (32 - PAGE_SHIFT) | base_lo >> PAGE_SHIFT;
> >
> > All leading 20 bits of "unsigned int" base_hi are snipped away. Thus
> > limiting base to use 44 bit instead of 52 bit in 64 bit mode. An
> > option would have been to use a type cast while shifting.
> >
> > (Hmm, having your first remark in mind I have to admit that my fix is
> > mainly focused on 64 bit mode not on 64 bit cpu running in 32 bit ...)
> 
> Yes.  So base needs to become a u64.

I was afraid you'd say that.

> So base = ((base_hi << 32) | base_lo) >> PAGE_SHIFT.
> 
> I see where the 44bit limit comes in.  Do you actually have boxes
> with > 16TB?

No, I don't have access to such a box. Would be nice though.
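
A small sketch of where the 44 bit limit comes from, assuming
PAGE_SHIFT is 12 and using fixed-width types to mimic a 32 bit build.
The function names are hypothetical. The first mirrors the existing
expression, the second widens before shifting:

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/*
 * Mirrors the old expression with 32-bit operands: the shift happens in
 * 32 bits, so only the low 12 bits of base_hi survive (44-bit addresses).
 */
static uint32_t base_pfn_32(uint32_t base_hi, uint32_t base_lo)
{
    return base_hi << (32 - PAGE_SHIFT) | base_lo >> PAGE_SHIFT;
}

/* Widen to 64 bits first, then shift: no address bits are lost. */
static uint64_t base_pfn_64(uint32_t base_hi, uint32_t base_lo)
{
    return (((uint64_t)base_hi << 32) | base_lo) >> PAGE_SHIFT;
}
```

With base_hi = 0x1234 and base_lo = 0x56789000 (a 45 bit address) the
first form yields pfn 0x23456789 and the second 0x123456789. For any
base_hi up to 0xfff the two agree, which is the 16TB boundary.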

> 
> Regardless it looks like base and possibly size need to become
> a u64.  At which time the extra >> PAGE_SHIFT could be meaningless.
> Either that, or base and size need to be sized in something like
> megabytes.
> 
> I suspect making it a u64 sized in bytes will get the job done and
> result in simpler code.

Right you are!
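
A rough sketch of what "u64 sized in bytes" could look like when
decoding one variable range, not the actual patch. It assumes the
usual register layout (type in PHYSBASE bits 0:7, valid flag in
PHYSMASK bit 11), a contiguous mask, and a phys_bits value between 13
and 63 obtained elsewhere. struct mtrr_range and decode_var_mtrr() are
hypothetical names:

```c
#include <stdint.h>

/* Hypothetical result type: base and size kept in bytes as 64-bit values. */
struct mtrr_range {
    uint64_t base;
    uint64_t size;
    unsigned int type;
    int valid;
};

/*
 * Hypothetical decoder for one variable-range MTRR pair.  phys_bits is
 * the number of implemented physical address bits (assumed 13..63).
 */
static struct mtrr_range decode_var_mtrr(uint32_t base_lo, uint32_t base_hi,
                                         uint32_t mask_lo, uint32_t mask_hi,
                                         unsigned int phys_bits)
{
    /* Valid address bits: phys_bits-1 down to 12. */
    uint64_t addr_mask = ((1ULL << phys_bits) - 1) & ~0xfffULL;
    uint64_t base = ((uint64_t)base_hi << 32) | base_lo;
    uint64_t mask = (((uint64_t)mask_hi << 32) | mask_lo) & addr_mask;
    struct mtrr_range r;

    r.valid = (mask_lo >> 11) & 1;   /* PHYSMASK bit 11: range enabled */
    r.type  = base_lo & 0xff;        /* PHYSBASE bits 7:0: memory type */
    r.base  = base & addr_mask;
    /* For a contiguous mask the size is its lowest set bit. */
    r.size  = mask & (~mask + 1);
    return r;
}
```

No >> PAGE_SHIFT is needed anywhere, and a base at or above 16TB
survives on a 32 bit build because every intermediate is 64 bits wide.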

> >> > diff --git a/arch/i386/kernel/cpu/mtrr/if.c b/arch/i386/kernel/cpu/mtrr/if.c
> >> > index 5ae1705..3abc3f1 100644
> >> > --- a/arch/i386/kernel/cpu/mtrr/if.c
> >> > +++ b/arch/i386/kernel/cpu/mtrr/if.c
> >> > @@ -137,6 +137,10 @@ mtrr_write(struct file *file, const char __user *buf, size_t len, loff_t * ppos)
> >> >  for (i = 0; i < MTRR_NUM_TYPES; ++i) {
> >> >  if (strcmp(ptr, mtrr_strings[i]))
> >> >  continue;
> >> > +#ifndef CONFIG_X86_64
> >> > +if (base > 0xfULL)
> >> > +return -EINVAL;
> >> > +#endif
> >> 
> >> That is just silly.  Whether or not the cpu is running in long mode
> >> should not affect this capability.



> > So I could do one of the following:
> > (1) prepare new patch omitting this silly hunk (-> old behaviour)
> > (2) check for 44 bit address size instead of 36 bit address size to
> > reflect the implicit truncation (-> avoid silent truncation)
> > (3) fix all mtrr code to be able to use up to 52 bit width physical
> > addresses instead of 44 bit ones if running in 32 bit mode on 64 bit
> > cpus.
> >
> > I prefer to do (2).
> > (IMHO those who have the need for n>44 bit width base address in an MTRR
> > should stick to 64 bit mode.)
> 
> I prefer (3).  Since the code is shared between 32 and 64bit mode it
> should behave the same in both.  I know there are people who regularly
> test 32bit kernels on boxes with 128 cpus and 128MB of ram.
> 
> People sometimes want crazy things and since it is just a matter of changing
> the type it should be no real work to get the code to work in 32bit mode.

Ok, it is best to do (3).
I will come up with another patch asap.


> Eric


Regards,

Andreas

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center





Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Andreas Herrmann
On Tue, Feb 06, 2007 at 11:54:57AM +0100, Andi Kleen wrote:
> On Tuesday 06 February 2007 10:53, Jan Beulich wrote:
> > >> I don't think I remember a restriction here, at least not below 44 bits
> > >> (that's where pfn-s would need to become 64-bit wide).
> > >
> > >The i386 mm code only supports 4 entries in the PGD, so more than 36bit
> > >cannot be mapped right now.
> > 
> > That has nothing to do with the number of physical address bits.
> 
> You couldn't use the memory in any way.
> 
> Anyways I give up -- the check is probably not needed, unless Andreas
> comes up with a good reason.

No, I haven't a good reason to restrict the base address to fewer
than 44 bits.

So the question is, should I completely remove that check or adapt it
to check for 44 bit instead of 36 bit?
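
For illustration only, the 44 bit variant of the check could look
roughly like this. It assumes base and size arrive in bytes as 64 bit
values and are later stored as page frame numbers in a 32 bit unsigned
long. check_mtrr_range() is a hypothetical name, not an existing
function:

```c
#include <errno.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/*
 * Hypothetical check: reject a request whose page frame numbers would
 * not fit in 32 bits, instead of truncating them silently.
 */
static int check_mtrr_range(uint64_t base, uint64_t size)
{
    if ((base >> PAGE_SHIFT) > UINT32_MAX ||
        (size >> PAGE_SHIFT) > UINT32_MAX)
        return -EINVAL;
    return 0;
}
```

A base of 0xfffffffffff (the last byte below 16TB) passes, while
1ULL << 44 is rejected.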


Regards,

Andreas

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center





Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Andreas Herrmann
On Tue, Feb 06, 2007 at 09:31:45AM +0000, Jan Beulich wrote:
> >>> Andi Kleen <[EMAIL PROTECTED]> 06.02.07 08:53 >>>
> >On Monday 05 February 2007 23:50, Siddha, Suresh B wrote:
> >> On Mon, Feb 05, 2007 at 06:19:59PM +0100, Andreas Herrmann wrote:
> >> > o added check to restrict base address to 36 bit on i386
> >> 
> >> Why is this? It can go upto implemented physical bits, right?
> >
> >In theory it can, but Linux doesn't support it.
> 
> I don't think I remember a restriction here, at least not below 44 bits
> (that's where pfn-s would need to become 64-bit wide).
> 
> Jan
> 

Hi all,

shame on me.
Wanted to fix an interface issue where base address is truncated
at 44 bit in mtrr_write().

(And just thought 36 bit would be more than enough for that 32 bit
Linux version :)


Regards,

Andreas

-- 
AMD Saxony, Dresden, Germany
Operating System Research Center





Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Andi Kleen
On Tuesday 06 February 2007 10:53, Jan Beulich wrote:
> >> I don't think I remember a restriction here, at least not below 44 bits
> >> (that's where pfn-s would need to become 64-bit wide).
> >
> >The i386 mm code only supports 4 entries in the PGD, so more than 36bit
> >cannot be mapped right now.
> 
> That has nothing to do with the number of physical address bits.

You couldn't use the memory in any way.

Anyways I give up -- the check is probably not needed, unless Andreas
comes up with a good reason.

> 
> >Also even 64GB barely works (many boxes don't boot), you would likely
> >need at least the 4:4 patch to go >64GB. Also we know there are tons
> >of possible deadlocks in various subsystems when the lowmem:highmem ratio 
> >gets so out of hand.
> >
> >Ok it could be probably all fixed with some work (at least the mm part,
> >the deadlocks would be more tricky), but would seem fairly 
> >pointless to me because all machines with >36bits support are 64bit capable.
> 
> That's a different story, and certainly a limiting factor. But this shouldn't
> e.g. disallow (hypothetical?) systems that have a very sparse memory map
> extending beyond 64G.

They would need a discontig kernel to boot most likely, otherwise
mem_map would fill up their memory. 
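
A back-of-the-envelope sketch of that cost, assuming a flat memory
model with one struct page per page frame across the whole spanned
range. The 32 byte struct page size used in the note below is an
illustrative assumption, and flat_mem_map_bytes() is a hypothetical
helper:

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/*
 * Hypothetical helper: bytes of mem_map needed to cover span_bytes of
 * physical address space when every page frame, hole or not, gets a
 * descriptor of page_struct_bytes.
 */
static uint64_t flat_mem_map_bytes(uint64_t span_bytes,
                                   uint64_t page_struct_bytes)
{
    return (span_bytes >> PAGE_SHIFT) * page_struct_bytes;
}
```

Under those assumptions a sparse map spanning 1TB needs 8GB of mem_map
no matter how little RAM is actually populated, and a 64GB span already
needs 512MB.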

And I was told Windows doesn't like that, so it's unlikely there will ever be
such x86 machines.

-Andi


Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Jan Beulich
>> I don't think I remember a restriction here, at least not below 44 bits
>> (that's where pfn-s would need to become 64-bit wide).
>
>The i386 mm code only supports 4 entries in the PGD, so more than 36bit cannot 
>be mapped right now.

That has nothing to do with the number of physical address bits.

>Also even 64GB barely works (many boxes don't boot), you would likely
>need at least the 4:4 patch to go >64GB. Also we know there are tons
>of possible deadlocks in various subsystems when the lowmem:highmem ratio 
>gets so out of hand.
>
>Ok it could be probably all fixed with some work (at least the mm part,
>the deadlocks would be more tricky), but would seem fairly 
>pointless to me because all machines with >36bits support are 64bit capable.

That's a different story, and certainly a limiting factor. But this shouldn't
e.g. disallow (hypothetical?) systems that have a very sparse memory map
extending beyond 64G.

Jan


Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Andi Kleen
On Tuesday 06 February 2007 10:31, Jan Beulich wrote:
> >>> Andi Kleen <[EMAIL PROTECTED]> 06.02.07 08:53 >>>
> >On Monday 05 February 2007 23:50, Siddha, Suresh B wrote:
> >> On Mon, Feb 05, 2007 at 06:19:59PM +0100, Andreas Herrmann wrote:
> >> > o added check to restrict base address to 36 bit on i386
> >> 
> >> Why is this? It can go upto implemented physical bits, right?
> >
> >In theory it can, but Linux doesn't support it.
> 
> I don't think I remember a restriction here, at least not below 44 bits
> (that's where pfn-s would need to become 64-bit wide).

The i386 mm code only supports 4 entries in the PGD, so more than 36bit cannot 
be mapped right now.

Also even 64GB barely works (many boxes don't boot), you would likely
need at least the 4:4 patch to go >64GB. Also we know there are tons
of possible deadlocks in various subsystems when the lowmem:highmem ratio 
gets so out of hand.

Ok it could be probably all fixed with some work (at least the mm part,
the deadlocks would be more tricky), but would seem fairly 
pointless to me because all machines with >36bits support are 64bit capable.

-Andi


Re: [discuss] [patch] mtrr: fix issues with large addresses

2007-02-06 Thread Jan Beulich
>>> Andi Kleen <[EMAIL PROTECTED]> 06.02.07 08:53 >>>
>On Monday 05 February 2007 23:50, Siddha, Suresh B wrote:
>> On Mon, Feb 05, 2007 at 06:19:59PM +0100, Andreas Herrmann wrote:
>> > o added check to restrict base address to 36 bit on i386
>> 
>> Why is this? It can go upto implemented physical bits, right?
>
>In theory it can, but Linux doesn't support it.

I don't think I remember a restriction here, at least not below 44 bits
(that's where pfn-s would need to become 64-bit wide).

Jan

