from:"Bob Picco"

Re: [PATCH v12 00/11] complete deferred page initialization

2017-10-13 Thread Bob Picco

Pavel Tatashin wrote:   [Fri Oct 13 2017, 01:32:03PM EDT]
> Changelog:
> v12 - v11
> - Improved comments for mm: zero reserved and unavailable struct pages
> - Added back patch: mm: deferred_init_memmap improvements
> - Added patch from Will Deacon: arm64: kasan: Avoid using
>   vmemmap_populate to initialise shadow
[...]
> Pavel Tatashin (10):
>   mm: deferred_init_memmap improvements
>   x86/mm: setting fields in deferred pages
>   sparc64/mm: setting fields in deferred pages
>   sparc64: simplify vmemmap_populate
>   mm: defining memblock_virt_alloc_try_nid_raw
>   mm: zero reserved and unavailable struct pages
>   x86/kasan: add and use kasan_map_populate()
>   arm64/kasan: add and use kasan_map_populate()
>   mm: stop zeroing memory during allocation in vmemmap
>   sparc64: optimized struct page zeroing
> 
> Will Deacon (1):
>   arm64: kasan: Avoid using vmemmap_populate to initialise shadow
> 
>  arch/arm64/Kconfig  |   2 +-
>  arch/arm64/mm/kasan_init.c  | 130 +
>  arch/sparc/include/asm/pgtable_64.h |  30 +
>  arch/sparc/mm/init_64.c |  32 +++---
>  arch/x86/mm/init_64.c   |  10 +-
>  arch/x86/mm/kasan_init_64.c |  75 +++-
>  include/linux/bootmem.h |  27 +
>  include/linux/memblock.h|  16 +++
>  include/linux/mm.h  |  26 +
>  mm/memblock.c   |  60 --
>  mm/page_alloc.c | 224 +---
>  mm/sparse-vmemmap.c |  15 ++-
>  mm/sparse.c |   6 +-
>  13 files changed, 469 insertions(+), 184 deletions(-)
> 
> -- 
> 2.14.2
> 
Boot tested on ThunderX2 VM.
Tested-by: Bob Picco <bob.pi...@oracle.com>

Re: 4.0.0-rc4: panic in free_block

2015-03-24 Thread Bob Picco

David Miller wrote: [Mon Mar 23 2015, 12:25:30PM EDT]
> From: David Miller 
> Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)
> 
> > I'll work on a fix.
> 
> Ok, here is what I committed.   David et al., let me know if you still
> see the crashes with this applied.
> 
> Of course, I'll queue this up for -stable as well.
> 
> Thanks!
> 
> 
> [PATCH] sparc64: Fix several bugs in memmove().
> 
> Firstly, handle zero length calls properly.  Believe it or not there
> are a few of these happening during early boot.
> 
> Next, we can't just drop to a memcpy() call in the forward copy case
> where dst <= src.  The reason is that the cache initializing stores
> used in the Niagara memcpy() implementations can end up clearing out
> cache lines before we've sourced their original contents completely.
> 
> For example, considering NG4memcpy, the main unrolled loop begins like
> this:
> 
>  load   src + 0x00
>  load   src + 0x08
>  load   src + 0x10
>  load   src + 0x18
>  load   src + 0x20
>  store  dst + 0x00
> 
> Assume dst is 64 byte aligned and let's say that dst is src - 8 for
> this memcpy() call.  That store at the end there is the one to the
> first line in the cache line, thus clearing the whole line, which thus
> clobbers "src + 0x28" before it even gets loaded.
> 
> To avoid this, just fall through to a simple copy only mildly
> optimized for the case where src and dst are 8 byte aligned and the
> length is a multiple of 8 as well.  We could get fancy and call
> GENmemcpy() but this is good enough for how this thing is actually
> used.
> 
> Reported-by: David Ahern 
> Reported-by: Bob Picco 
> Signed-off-by: David S. Miller 
> ---
Seems solid with 2.6.39 on M7-4. Jalapeño is happy with current sparc.git.
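
The forward-copy path described in the commit message above can be modelled in portable C. This is a minimal sketch under stated assumptions, not the sparc64 assembly that was committed: the function name forward_overlap_copy is hypothetical, and the 8-byte fast path only mirrors the "mildly optimized" case the message mentions. The point it illustrates is that each chunk is loaded completely before anything is stored, so a destination at or below the source never clobbers bytes that have not been read yet.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of the forward (dst <= src) path after the fix.
 * Safe for overlapping buffers as long as dst is not above src.
 */
void *forward_overlap_copy(void *dstp, const void *srcp, size_t len)
{
	unsigned char *dst = dstp;
	const unsigned char *src = srcp;

	/* Zero-length calls must be a no-op. */
	if (len == 0)
		return dstp;

	/* Fast path: src, dst and len are all multiples of 8. */
	if ((((uintptr_t)dst | (uintptr_t)src | (uintptr_t)len) & 7) == 0) {
		while (len != 0) {
			uint64_t word;

			/* Load the whole word first, then store it. */
			memcpy(&word, src, sizeof(word));
			memcpy(dst, &word, sizeof(word));
			src += sizeof(word);
			dst += sizeof(word);
			len -= sizeof(word);
		}
		return dstp;
	}

	/* Fallback: plain byte copy, low to high. */
	while (len != 0) {
		*dst++ = *src++;
		len--;
	}
	return dstp;
}
```

Calling it with dst = src - 8 on an 8-byte aligned buffer, the case used as the example in the commit message, shifts the data down by one word without losing any of it.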
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: 4.0.0-rc4: panic in free_block

2015-03-22 Thread Bob Picco

David Miller wrote: [Sun Mar 22 2015, 01:36:03PM EDT]
> From: Linus Torvalds 
> Date: Sat, 21 Mar 2015 11:49:12 -0700
> 
> > Davem? I don't read sparc assembly, so I'm *really* not going to try
> > to verify that (a) all the memcpy implementations always copy
> > low-to-high and (b) that I even read the address comparisons in
> > memmove.S right.
> 
> All of the sparc memcpy implementations copy from low to high.
> I'll eat my hat if they don't. :-)
> 
> The guard tests at the beginning of memmove() are saying:
> 
>   if (dst <= src)
>   memcpy(...);
>   if (src + len <= dst)
>   memcpy(...);
> 
> And then the reverse copy loop (and we do have to copy in reverse for
> correctness) is basically:
> 
>   src = (src + len - 1);
>   dst = (dst + len - 1);
> 
> 1:tmp = *(u8 *)src;
>   len -= 1;
>   src -= 1;
>   *(u8 *)dst = tmp;
>   dst -= 1;
>   if (len != 0)
>   goto 1b;
> 
> And then we return the original 'dst' pointer.
> 
> So at first glance it looks at least correct.
> 
> memmove() is a good idea to look into though, as SLAB and SLUB are the
> only really heavy users of it, and they do so with overlapping
> contents.
> 
> And they end up using that byte-at-a-time code, since SLAB and SLUB
> do memmove() calls of the form:
> 
>   memmove(X + N, X, LEN);
> 
> In which case neither of the memcpy() guard tests will pass.
> 
> Maybe there is some subtle bug in there I just don't see right now.
My original pursuit of this issue focused on transfers to and from the shared
array, basically substituting the memcpy-s with a primitive unsigned long memory
mover. This might have been incorrect.

There were substantial doubts because of the large modifications to 2.6.39 too.
Unstable hardware cause(d|s) issues too.

With the shared array eliminated, everything functions correctly, though this
removal changes performance and timing dramatically.

This afternoon I included modification of two memmove-s and no issue thus far.
The issue APPEARS to come from memmove-s within cache_flusharray() and/or
drain_array(). Now we are covering moves within an array_cache.

The above was done on 2.6.39.
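
For readers who do not read sparc assembly, the guard tests and reverse loop David describes above can be modelled in portable C. This is a sketch under assumptions, not the kernel's memmove.S: the names sketch_memmove and low_to_high_copy are hypothetical, low_to_high_copy stands in for the memcpy() calls (standard C memcpy() must not be given overlapping buffers), and the reverse loop is written with an index rather than two decrementing pointers.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for memcpy(): copies strictly from low to high addresses. */
static void *low_to_high_copy(void *dstp, const void *srcp, size_t len)
{
	unsigned char *dst = dstp;
	const unsigned char *src = srcp;
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = src[i];
	return dstp;
}

/* Hypothetical model of the control flow quoted in the mail. */
void *sketch_memmove(void *dstp, const void *srcp, size_t len)
{
	unsigned char *dst = dstp;
	const unsigned char *src = srcp;
	uintptr_t d = (uintptr_t)dstp;
	uintptr_t s = (uintptr_t)srcp;

	/* Guard 1: destination at or below source. */
	if (d <= s)
		return low_to_high_copy(dstp, srcp, len);

	/* Guard 2: destination starts at or after the end of the source. */
	if (s + len <= d)
		return low_to_high_copy(dstp, srcp, len);

	/*
	 * Overlap with dst above src, e.g. memmove(X + N, X, LEN):
	 * copy one byte at a time from high to low. len is non-zero
	 * here because guard 2 would otherwise have matched.
	 */
	while (len != 0) {
		len -= 1;
		dst[len] = src[len];
	}
	return dstp;
}
```

A call of the form sketch_memmove(X + N, X, LEN) with N smaller than LEN fails both guards and takes the byte-at-a-time loop, which is the path the SLAB array moves would exercise.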

Re: 2.6.23-rc7-mm1 ia64 build issue in efi.c

2007-09-24 Thread Bob Picco


There isn't a total_memory identifier within this function's scope. The
patch was compile/link tested.

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 arch/ia64/kernel/efi.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc7-mm1/arch/ia64/kernel/efi.c
===
--- linux-2.6.23-rc7-mm1.orig/arch/ia64/kernel/efi.c 2007-09-24 09:54:40.0 -0400
+++ linux-2.6.23-rc7-mm1/arch/ia64/kernel/efi.c 2007-09-24 10:50:51.0 -0400
@@ -1085,7 +1085,7 @@ efi_memmap_init(unsigned long *s, unsign
*s = (u64)kern_memmap;
*e = (u64)++k;
 
-   return total_memory;
+   return total_mem;
 }
 
 void

Re: [PATCH 2.6.23-rc5] driver/char/hpet.c: remove clocksourcewarning on !IA64

2007-09-03 Thread Bob Picco

luck wrote: [Sun Sep 02 2007, 03:13:45AM EDT]
> > This patch eliminates the warnings when the clocksource isn't used.
> > It also removes some other unused stuff that goes along with the
> > clocksource ..
> >
> > I don't have access to an ia64 machine, or even a compiler .. So this
> > one is untested for that. I'm hoping one of the ia64 guys can help
> > test this
> 
> Well, it does *compile* cleanly for ia64 [none of my test systems actually
> have a functional HPET ... the closest I get is one that fails the hd_nirqs
> part of the "if (!data.hd_address || !data.hd_nirqs) {" test in 
> hpet_acpi_add].
> 
> Bob: Does this work for you?
I boot tested on the HP internal platform simulator. Should it be
necessary, I could have it boot tested later in the week on real
hardware.

bob

Re: "double" hpet clocksource && hard freeze [bisected]

2007-08-24 Thread Bob Picco

john stultz wrote:  [Thu Aug 23 2007, 05:41:45PM EDT]
> On Thu, 2007-08-23 at 14:05 -0700, john stultz wrote:
> > On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > > I have a double "hpet" entry in "available_clocksource":
> > > > $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > > > tsc hpet hpet acpi_pm jiffies
> > > 
> > > Oops.  It seems that both drivers/char/hpet.c and 
> > > arch/x86_64/kernel/hpet.c
> > > both register a clocksource named "hpet".  Probably a result of bringing
> > > back to life a long lost patch, and having someone else (John Stultz, 
> > > according
> > > to git blame) make a similar change to a different file in the intervening
> > > time.
> > > 
> > > Presumably the thing to do would be merge the x86_64 specific version
> > > into the drivers/char/hpet.c version?
> > 
> > Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> > duplication, but at the moment I'm not comfortable that the
> > driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> > why the shift value is only 10?).
> > 
> > 
> > I'm a little surprised by this, as the clocksource code used to prevent
> > duplicate named clocksources from being registered, so I'm not sure how
> > that check got dropped.  Also I'm not quite sure I see where the hard
> > freeze is coming from.
> > 
> > My initial reaction would be to either ifdef ia64 implementation in
> > drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> > really usable by all arches.
> 
> Here is a possible quick fix. I'm open to other approaches, but I also
> want to avoid too much churn before 2.6.23 goes out.
> 
> Paolo, could you verify this fixes the issue for you?
> 
> thanks
> -john
> 
[snip]

I now see what I missed in my brief examination of this last night:
the platform registers the hpet clocksource too.

Instead of adding the config flag to the hpet driver, how about the patch
below? Since you already check for duplication by address, adding
a check by name too seems okay to me.

bob


Prevent duplicate names being registered with clocksource. This also
eliminates the duplication of hpet clock registration when the arch
uses the hpet timer and the hpet driver does too. The patch was
compile and link tested.


Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 kernel/time/clocksource.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc3/kernel/time/clocksource.c
===
--- linux-2.6.23-rc3.orig/kernel/time/clocksource.c 2007-08-23 16:44:03.0 -0400
+++ linux-2.6.23-rc3/kernel/time/clocksource.c  2007-08-24 08:36:41.0 -0400
@@ -281,7 +281,7 @@ static int clocksource_enqueue(struct cl
struct clocksource *cs;
 
cs = list_entry(tmp, struct clocksource, list);
-   if (cs == c)
+   if (cs == c || !strcmp(cs->name, c->name))
return -EBUSY;
/* Keep track of the place, where to insert */
if (cs->rating >= c->rating)
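
The effect of the one-line change above can be shown outside the kernel with a small model. This is a sketch under assumptions: struct clock_entry and clock_enqueue() are hypothetical stand-ins using a plain singly linked list instead of the kernel's list_head, and only the name and rating fields are modelled. An entry is rejected with -EBUSY when it is either the same object or carries the same name as one already queued; otherwise it is inserted after the last entry whose rating is at least as high.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct clocksource. */
struct clock_entry {
	const char *name;
	int rating;
	struct clock_entry *next;
};

/*
 * Insert c into a list kept in descending rating order.
 * Returns 0 on success, -EBUSY for a duplicate object or name.
 */
int clock_enqueue(struct clock_entry **head, struct clock_entry *c)
{
	struct clock_entry **insert_at = head;
	struct clock_entry *cs;

	for (cs = *head; cs != NULL; cs = cs->next) {
		/* Duplicate by address, or (the new check) by name. */
		if (cs == c || strcmp(cs->name, c->name) == 0)
			return -EBUSY;
		/* Keep track of the place where to insert. */
		if (cs->rating >= c->rating)
			insert_at = &cs->next;
	}

	c->next = *insert_at;
	*insert_at = c;
	return 0;
}
```

With this check, a second registration named "hpet" from a different object is refused, so the available list would show the name only once.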

Re: "double" hpet clocksource && hard freeze [bisected]

2007-08-23 Thread Bob Picco

john stultz wrote:  [Thu Aug 23 2007, 05:05:35PM EDT]
> On Thu, 2007-08-23 at 13:41 -0700, Luck, Tony wrote:
> > > I have a double "hpet" entry in "available_clocksource":
> > >   $ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
> > >   tsc hpet hpet acpi_pm jiffies
> > 
> > Oops.  It seems that both drivers/char/hpet.c and arch/x86_64/kernel/hpet.c
> > both register a clocksource named "hpet".  Probably a result of bringing
> > back to life a long lost patch, and having someone else (John Stultz, 
> > according
> > to git blame) make a similar change to a different file in the intervening
> > time.
> > 
> > Presumably the thing to do would be merge the x86_64 specific version
> > into the drivers/char/hpet.c version?
> 
> Ugh. Yea. i386 has an hpet clocksource as well. We should kill the
> duplication, but at the moment I'm not comfortable that the
> driver/char/hpet.c is ok to be used for i386/x86_64 (Bob: Do you know
> why the shift value is only 10?).
No I don't have a clue why Pete chose this value.
> 
> 
> I'm a little surprised by this, as the clocksource code used to prevent
> duplicate named clocksources from being registered, so I'm not sure how
> that check got dropped.  Also I'm not quite sure I see where the hard
> freeze is coming from.
> 
> My initial reaction would be to either ifdef ia64 implementation in
> drivers/char/hpet.c or move the code under the ia64 arch dir until it is
> really usable by all arches.
> 
> Bob, your thoughts?
It appears the ACPI for this platform might work. We don't know because
of a hpet driver probe error discussed below. I assume you're suggesting
the driver is only required by ia64? I think that might not be true.

Well I'm slightly confused. The fs_initcall was first into hpet_alloc.
It appears ACPI discovery failed during driver initialization because
of:
hpet_resources: 0xfed0 is busy
from dmesg. So why do we have a second hpet registered? Also hpet_alloc
is supposed to check for redundant registration. I need to look more
tomorrow.
> 
bob

Re: Linus 2.6.23-rc1

2007-07-23 Thread Bob Picco

Gabriel C wrote:[Sun Jul 22 2007, 10:00:39PM EDT]
> Linus Torvalds wrote:
> > Ok, right on time, two weeks after 2.6.22, there's a 2.6.23-rc1 out there.
> 
> 
> ...
> 
> drivers/char/hpet.c:76: warning: integer constant is too large for 'long' type
> 
> ...
> 
> Introduced by 0aa366f351d044703e25c8425e508170e80d83b1 
> 
> 
> 
> 

Sorry about that. I thought my review had caught all of these.

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 drivers/char/hpet.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.23-rc1/drivers/char/hpet.c
===
--- linux-2.6.23-rc1.orig/drivers/char/hpet.c   2007-07-23 10:08:58.0 -0400
+++ linux-2.6.23-rc1/drivers/char/hpet.c        2007-07-23 11:46:12.0 -0400
@@ -73,7 +73,7 @@ static struct clocksource clocksource_hp
 .name   = "hpet",
 .rating = 250,
 .read   = read_hpet,
-.mask   = 0x,
+.mask   = 0xULL,
 .mult   = 0, /*to be caluclated*/
 .shift  = 10,
 .flags  = CLOCK_SOURCE_IS_CONTINUOUS,
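
The warning being fixed above comes from writing a 64-bit constant without a suffix on a target where long is 32 bits wide. The archive has dropped the hex digits from both mask lines, so the all-ones value below is an assumption made purely for illustration, and struct clock_mask_demo and low_bits_mask() are hypothetical names. The sketch shows the ULL-suffixed literal next to a way of building the same mask without any wide literal.

```c
#include <stdint.h>

/* Hypothetical stand-in for the structure holding the mask field. */
struct clock_mask_demo {
	uint64_t mask;
};

/* Assumed value: the ULL suffix makes the literal unsigned long long. */
const struct clock_mask_demo demo_clock = {
	.mask = 0xffffffffffffffffULL,
};

/* Build a mask of the low 'bits' bits; 64 or more yields all ones. */
uint64_t low_bits_mask(unsigned int bits)
{
	return bits < 64 ? (UINT64_C(1) << bits) - 1 : ~UINT64_C(0);
}
```

Computing the mask from a bit count keeps the width explicit and avoids the suffix question altogether.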

Re: mm snapshot broken-out-2007-06-20-10-12.tar.gz uploaded

2007-06-20 Thread Bob Picco

Randy Dunlap wrote: [Wed Jun 20 2007, 09:07:11PM EDT]
> On Wed, 20 Jun 2007 20:51:22 -0400 Bob Picco wrote:
> 
> > [EMAIL PROTECTED] wrote:[Wed Jun 20 2007, 01:14:34PM EDT]
> > [snip]
> > 
> > Build breakage. pci_mmcfg_late_init is for i386.
> 
> then you want CONFIG_X86_32 instead of CONFIG_X86.
> CONFIG_X86 is set/true for both X86_32 and X86_64.
Then what I stated within the patch description is incorrect. pci.h, which is the
required include for the declaration, is conditional on CONFIG_X86. So it is
both, I guess?

bob
> 
> 
> > Signed-off-by: Bob Picco <[EMAIL PROTECTED]>
> > 
> >  drivers/acpi/bus.c |2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > Index: linux-2.6.22-rc5-mm1/drivers/acpi/bus.c
> > ===
> > --- linux-2.6.22-rc5-mm1.orig/drivers/acpi/bus.c 2007-06-20 14:09:07.0 -0400
> > +++ linux-2.6.22-rc5-mm1/drivers/acpi/bus.c 2007-06-20 20:32:00.0 -0400
> > @@ -757,7 +757,9 @@ static int __init acpi_init(void)
> > result = acpi_bus_init();
> >  
> > if (!result) {
> > +#ifdef CONFIG_X86
> > pci_mmcfg_late_init();
> > +#endif
> >  #ifdef CONFIG_PM_LEGACY
> > if (!PM_IS_ACTIVE())
> > pm_active = 1;
> 
> 
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
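
Randy's point about the configuration symbols can be illustrated with a small stand-alone model. This is a sketch under assumptions: the CONFIG_ symbols are defined by hand here to imitate a 64-bit x86 build (in a real kernel they come from the generated configuration), and pci_mmcfg_late_init_stub(), acpi_init_sketch() and built_for_32bit_only() are hypothetical names. A block guarded by CONFIG_X86 is compiled on both 32-bit and 64-bit x86, while a block guarded by CONFIG_X86_32 is compiled on 32-bit only.

```c
/* Assumed configuration: a 64-bit x86 build sets both of these. */
#define CONFIG_X86 1
#define CONFIG_X86_64 1

/* Counts calls so the effect of the guard can be observed. */
int mmcfg_calls;

#ifdef CONFIG_X86
/* Hypothetical stand-in for pci_mmcfg_late_init(). */
static void pci_mmcfg_late_init_stub(void)
{
	mmcfg_calls++;
}
#endif

/* Mirrors the shape of the guarded call in the quoted patch. */
int acpi_init_sketch(void)
{
#ifdef CONFIG_X86
	pci_mmcfg_late_init_stub();
#endif
	return 0;
}

/* Returns 1 only when the 32-bit-specific symbol is defined. */
int built_for_32bit_only(void)
{
#ifdef CONFIG_X86_32
	return 1;
#else
	return 0;
#endif
}
```

Under the assumed configuration the guarded call is compiled in and runs, while the CONFIG_X86_32 branch is not, which matches the conclusion that the call applies to both x86 variants.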

Re: mm snapshot broken-out-2007-06-20-10-12.tar.gz uploaded

2007-06-20 Thread Bob Picco

[EMAIL PROTECTED] wrote:[Wed Jun 20 2007, 01:14:34PM EDT]
[snip] 

More build breakage. efi_range_is_wc is referenced but not declared.


Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 include/asm-ia64/fb.h |1 +
 1 file changed, 1 insertion(+)

Index: linux-2.6.22-rc5-mm1/include/asm-ia64/fb.h
===
--- linux-2.6.22-rc5-mm1.orig/include/asm-ia64/fb.h 2007-06-20 14:09:18.0 -0400
+++ linux-2.6.22-rc5-mm1/include/asm-ia64/fb.h  2007-06-20 20:40:48.0 -0400
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 #include 
 
 static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,

Re: mm snapshot broken-out-2007-06-20-10-12.tar.gz uploaded

2007-06-20 Thread Bob Picco

[EMAIL PROTECTED] wrote:[Wed Jun 20 2007, 01:14:34PM EDT]
[snip]

Build breakage. pci_mmcfg_late_init is for i386.

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 drivers/acpi/bus.c |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6.22-rc5-mm1/drivers/acpi/bus.c
===
--- linux-2.6.22-rc5-mm1.orig/drivers/acpi/bus.c 2007-06-20 14:09:07.0 -0400
+++ linux-2.6.22-rc5-mm1/drivers/acpi/bus.c 2007-06-20 20:32:00.0 -0400
@@ -757,7 +757,9 @@ static int __init acpi_init(void)
result = acpi_bus_init();
 
if (!result) {
+#ifdef CONFIG_X86
pci_mmcfg_late_init();
+#endif
 #ifdef CONFIG_PM_LEGACY
if (!PM_IS_ACTIVE())
pm_active = 1;

Re: [31/37] Large blocksize support: Core piece

2007-06-20 Thread Bob Picco

Christoph Lameter wrote:[Wed Jun 20 2007, 02:29:38PM EDT]
> Provide an alternate definition for the page_cache_xxx(mapping, ...)
> functions that can determine the current page size from the mapping
> and generate the appropriate shifts, sizes and mask for the page cache
> operations. Change the basic functions that allocate pages for the
> page cache to be able to handle higher order allocations.
> 
> Provide a new function
> 
> mapping_setup(struct address_space *, gfp_t mask, int order)
> 
> that allows the setup of a mapping of any compound page order.
> 
> mapping_set_gfp_mask() is still provided but it sets mappings to order 0.
> Calls to mapping_set_gfp_mask() must be converted to mapping_setup() in
> order for the filesystem to be able to use larger pages. For some key block
> devices and filesystems the conversion is done here.
> 
> mapping_setup() for higher order is only allowed if the mapping does not
> use DMA mappings or HIGHMEM since we do not support bouncing at the moment.
> Thus we currently BUG() on DMA mappings and clear the highmem bit of higher
> order mappings.
> 
> Modify the set_blocksize() function so that an arbitrary blocksize can be set.
> Blocksizes up to MAX_ORDER can be set. This is typically 8MB on many
> platforms (order 11). Typically file systems are not only limited by the core
> VM but also by the structure of their internal data structures. The core VM
> limitations fall away with this patch. The functionality provided here
> can do nothing about the internal limitations of filesystems.
> 
> Known internal limitations:
> 
> Ext2  64k
> XFS   64k
> Reiserfs  8k
> Ext3  4k
> Ext4  4k
> 
> Signed-off-by: Christoph Lameter <[EMAIL PROTECTED]>
> 
> ---
>  block/Kconfig   |   17 ++
>  drivers/block/rd.c  |6 +-
>  fs/block_dev.c  |   29 +++
>  fs/buffer.c |2 
>  fs/inode.c  |7 +-
>  fs/xfs/linux-2.6/xfs_buf.c  |3 -
>  include/linux/buffer_head.h |   12 
>  include/linux/fs.h  |5 +
>  include/linux/pagemap.h |  116 +---
>  mm/filemap.c|   17 --
>  10 files changed, 186 insertions(+), 28 deletions(-)
> 
> Index: linux-2.6.22-rc4-mm2/include/linux/pagemap.h
> ===
> --- linux-2.6.22-rc4-mm2.orig/include/linux/pagemap.h 2007-06-19 23:33:44.0 -0700
> +++ linux-2.6.22-rc4-mm2/include/linux/pagemap.h 2007-06-19 23:50:55.0 -0700
> @@ -39,10 +39,30 @@ static inline gfp_t mapping_gfp_mask(str
>   * This is non-atomic.  Only to be used before the mapping is activated.
>   * Probably needs a barrier...
>   */
> -static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
> +static inline void mapping_setup(struct address_space *m,
> + gfp_t mask, int order)
>  {
>   m->flags = (m->flags & ~(__force unsigned long)__GFP_BITS_MASK) |
>   (__force unsigned long)mask;
> +
> +#ifdef CONFIG_LARGE_BLOCKSIZE
> + m->order = order;
> + m->shift = order + PAGE_SHIFT;
> + m->offset_mask = (PAGE_SIZE << order) - 1;
> + if (order) {
> + /*
> +  * Bouncing is not supported. Requests for DMA
> +  * memory will not work
> +  */
> + BUG_ON(m->flags & (__GFP_DMA|__GFP_DMA32));
> + /*
> +  * Bouncing not supported. We cannot use HIGHMEM
> +  */
> + m->flags &= ~__GFP_HIGHMEM;
> + m->flags |= __GFP_COMP;
> + raise_kswapd_order(order);
> + }
> +#endif
>  }
>  
>  /*
> @@ -62,6 +82,78 @@ static inline void mapping_set_gfp_mask(
>  #define PAGE_CACHE_ALIGN(addr) (((addr)+PAGE_CACHE_SIZE-1)&PAGE_CACHE_MASK)
>  
>  /*
> + * The next set of functions allow to write code that is capable of dealing
> + * with multiple page sizes.
> + */
> +#ifdef CONFIG_LARGE_BLOCKSIZE
> +/*
> + * Determine page order from the blkbits in the inode structure
> + */
> +static inline int page_cache_blkbits_to_order(int shift)
> +{
> + BUG_ON(shift < 9);
> +
> + if (shift < PAGE_SHIFT)
> + return 0;
> +
> + return shift - PAGE_SHIFT;
> +}
> +
> +/*
> + * Determine page order from a given blocksize
> + */
> +static inline int page_cache_blocksize_to_order(unsigned long size)
> +{
> + return page_cache_blkbits_to_order(ilog2(size));
> +}
> +
> +static inline int mapping_order(struct address_space *a)
> +{
> + return a->order;
> +}
> +
> +static inline int page_cache_shift(struct address_space *a)
> +{
> + return a->shift;
> +}
> +
> +static inline unsigned int page_cache_size(struct address_space *a)
> +{
> + return a->offset_mask + 1;
> +}
> +
> +static inline loff_t page_cache_mask(struct address_space *a)
> +{
> + return ~a->offset_mask;
> +}
> +
> +static inline unsigned int page_cache_offset(struct address_space *a,
> + loff_t pos)
> +{
> + return pos & a->offset_mask;
> +}
> +#else
> +/*
> + *

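The size bookkeeping described in the quoted patch (order, shift and offset
mask per mapping) can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the patch itself: PAGE_SHIFT is
fixed at 12 (4 KiB pages), struct mapping_sketch is a hypothetical cut-down
stand-in for the fields added to struct address_space, ilog2_sketch() stands
in for the kernel's ilog2(), and the gfp flag handling of mapping_setup() is
left out.

```c
#include <assert.h>

/* Assumption: 4 KiB base pages; PAGE_SHIFT is per-architecture in the kernel. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical stand-in for the fields the patch adds to struct address_space. */
struct mapping_sketch {
	int order;			/* compound page order */
	int shift;			/* order + PAGE_SHIFT */
	unsigned long offset_mask;	/* (PAGE_SIZE << order) - 1 */
};

/* Integer log2 of a power-of-two size; stands in for the kernel's ilog2(). */
static int ilog2_sketch(unsigned long size)
{
	int bits = 0;

	while (size > 1) {
		size >>= 1;
		bits++;
	}
	return bits;
}

/* Mirrors page_cache_blkbits_to_order(): sub-page block sizes use order 0. */
static int blkbits_to_order(int shift)
{
	assert(shift >= 9);	/* the patch uses BUG_ON(shift < 9) */
	if (shift < PAGE_SHIFT)
		return 0;
	return shift - PAGE_SHIFT;
}

/* Mirrors page_cache_blocksize_to_order(). */
static int blocksize_to_order(unsigned long size)
{
	return blkbits_to_order(ilog2_sketch(size));
}

/* Only the size bookkeeping of mapping_setup(); gfp flags are omitted. */
static void mapping_setup_sketch(struct mapping_sketch *m, int order)
{
	m->order = order;
	m->shift = order + PAGE_SHIFT;
	m->offset_mask = (PAGE_SIZE << order) - 1;
}

/* Mirrors page_cache_size(). */
static unsigned long cache_size(const struct mapping_sketch *m)
{
	return m->offset_mask + 1;
}

/* Mirrors page_cache_offset(): byte offset within one page cache unit. */
static unsigned long cache_offset(const struct mapping_sketch *m,
				  unsigned long long pos)
{
	return pos & m->offset_mask;
}
```

Under these assumptions a 64k block size maps to order 4, so the mapping gets
shift 16 and a page cache unit of 65536 bytes, and file position 70000 falls
at offset 4464 within its unit. Block sizes of 512 and 4096 bytes both stay at
order 0.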

Re: [PATCH] ia64: Scalability improvement of gettimeofday with jitter compensation

2007-06-13 Thread Bob Picco

Hidetoshi Seto wrote:   [Mon Jun 11 2007, 11:14:44PM EDT]
[snip]
I didn't examine your patch closely but did notice modification to the
time interpolator. I'll attempt to examine your patch with more scrutiny
in the next few days.

I will be pushing Peter Keilty's clocksource ia64 patches within the next
week or so.  At that time I'll ask for inclusion into -mm. Please see:
http://marc.info/?t=11788158551&r=1&w=2
You'll notice that the time interpolator is removed.

BTW. Peter is no longer with HP. He participated in the Employee Early
Retirement program and left May 31.

thanks,

bob

Re: [BUG] sysrq-m oops

2007-06-07 Thread Bob Picco

john stultz wrote:  [Thu Jun 07 2007, 03:54:41PM EDT]
[snip]
John, you are welcome.

show_mem() doesn't check for holes in memory. With SPARSEMEM it can hit a
section hole whose section map pointer is empty, and then it oopses. This issue
has been seen in 2.6.21, current git and current mm. The patch below
is for mainline and mm. It was boot tested for SPARSEMEM, current
VMEMMAP of Andy's in mm ml and DISCONTIGMEM. A slightly different patch
will be posted to stable for 2.6.21.

Previous to commit f0a5a58aa812b31fd9f197c4ba48245942364eae memory_present
was called for node_start_pfn to node_end_pfn. This would cover the hole(s)
with reserved pages and valid sections. Most SPARSEMEM supported arches
do a pfn_valid check in show_mem before computing the page structure address.

This issue was brought to my attention on IRC by Arnaldo Carvalho de Melo at
[EMAIL PROTECTED] Thanks to Arnaldo for testing.

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 arch/x86_64/mm/init.c |2 ++
 1 file changed, 2 insertions(+)

Index: linux-2.6.22-rc4-mm1/arch/x86_64/mm/init.c
===
--- linux-2.6.22-rc4-mm1.orig/arch/x86_64/mm/init.c 2007-06-06 12:59:26.0 -0400
+++ linux-2.6.22-rc4-mm1/arch/x86_64/mm/init.c 2007-06-07 11:14:31.0 -0400
@@ -79,6 +79,8 @@ void show_mem(void)
if (unlikely(i % MAX_ORDER_NR_PAGES == 0)) {
touch_nmi_watchdog();
}
+   if (!pfn_valid(pgdat->node_start_pfn + i))
+   continue;
page = pfn_to_page(pgdat->node_start_pfn + i);
total++;
if (PageReserved(page))
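
The hole check added by the patch can be illustrated with a toy userspace
model. This is a sketch under stated assumptions, not the kernel
implementation: the section size, the section_map[] array and every *_sketch
name are hypothetical, and a NULL section_map[] entry stands in for a
SPARSEMEM section that has no mem_map behind it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: 8 page frames per section, 4 sections; sizes are assumptions. */
#define PAGES_PER_SECTION 8
#define NR_SECTIONS 4

struct page_sketch {
	bool reserved;
};

/* A NULL entry models a section hole with no struct pages behind it. */
static struct page_sketch *section_map[NR_SECTIONS];

static bool pfn_valid_sketch(unsigned long pfn)
{
	unsigned long sec = pfn / PAGES_PER_SECTION;

	return sec < NR_SECTIONS && section_map[sec] != NULL;
}

/* Only safe to call after pfn_valid_sketch(pfn) returned true. */
static struct page_sketch *pfn_to_page_sketch(unsigned long pfn)
{
	return &section_map[pfn / PAGES_PER_SECTION][pfn % PAGES_PER_SECTION];
}

struct mem_counts {
	unsigned long total;
	unsigned long reserved;
};

/* Walk a spanned pfn range and skip holes, as the patched loop does. */
static struct mem_counts count_pages(unsigned long start_pfn,
				     unsigned long spanned)
{
	struct mem_counts c = { 0, 0 };
	unsigned long i;

	for (i = 0; i < spanned; i++) {
		struct page_sketch *page;

		if (!pfn_valid_sketch(start_pfn + i))
			continue;	/* hole: there is no page to inspect */
		page = pfn_to_page_sketch(start_pfn + i);
		c.total++;
		if (page->reserved)
			c.reserved++;
	}
	return c;
}
```

Without the validity check, the walk would index through a NULL section entry
for every pfn inside a hole, which is the toy equivalent of the bogus page
pointer reported in this thread.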

Re: [BUG] sysrq-m oops

2007-06-07 Thread Bob Picco

Chuck Ebbert wrote: [Thu Jun 07 2007, 11:42:38AM EDT]
> On 06/06/2007 08:27 PM, john stultz wrote:
> > Hey All,
> > With 2.6.21 and the current -git, we're seeing the following oops when
> > we try sysrq-m:
> > 
> 
> It's here in arch/x86_64/mm/init.c::show_mem():
> 
> for_each_online_pgdat(pgdat) {
>for (i = 0; i < pgdat->node_spanned_pages; ++i) {
This is probably with sparsemem? I'm working with [EMAIL PROTECTED] to
test a patch. Basically you need to validate the pfn because it
could be in a hole. Most arches which support sparsemem perform this
check.

if (!pfn_valid(pgdat->node_start_pfn + i))
continue;
bob
> page = pfn_to_page(pgdat->node_start_pfn + i);
> total++;
>  ==>if (PageReserved(page))
> reserved++;
> else if (PageSwapCache(page))
> cached++;
> else if (page_count(page))
> shared += page_count(page) - 1;
>}
> }
> 
> page is completely bogus (it's 0x0348)

Re: your mail

2007-05-16 Thread Bob Picco

Olof Johansson wrote:   [Wed May 16 2007, 01:11:00PM EDT]
> On Wed, May 16, 2007 at 11:43:41AM -0500, Linas Vepstas wrote:
> > On Wed, May 16, 2007 at 09:30:46AM -0400, Bob Picco wrote:
> > > Subject: Re: 2.6.22-rc1-mm1 powerpc build breakage
> > > 
> > > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: error: unknown field `subsys' specified in initializer
> > > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: warning: initialization from incompatible pointer type
> > > make[4]: *** [drivers/pci/hotplug/rpadlpar_sysfs.o] Error 1
> > > make[3]: *** [drivers/pci/hotplug] Error 2
> > > make[2]: *** [drivers/pci] Error 2
> > > make[1]: *** [drivers] Error 2
> > > make: *** [_all] Error 2
> > 
> > John Rose is working to fix this "real soon now".
> 
> Do you mean the fix Al Viro posted yesterday?
> 
> http://patchwork.ozlabs.org/linuxppc/patch?id=11177
> 
> 
> -Olof
Missed that patch.

thanks,

bob
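
The compiler error quoted in this thread is what GCC reports when a designated
initializer names a member that the structure type does not have. A minimal
sketch of that situation follows. The structure and member names are
hypothetical and are not the actual kset or rpadlpar definitions, and the
sketch does not reproduce the fix referenced above.

```c
/* Hypothetical structures; the outer type has no member named "subsys". */
struct kobj_sketch {
	const char *name;
};

struct kset_sketch {
	struct kobj_sketch kobj;
	int users;
};

/*
 * A stale initializer such as
 *	{ .subsys = &some_subsys, .kobj = { .name = "io" } }
 * would be rejected at compile time with an "unknown field" error,
 * because struct kset_sketch has no such member. Naming only members
 * that exist compiles cleanly.
 */
static struct kset_sketch io_kset = {
	.kobj = { .name = "io" },
	.users = 0,
};

/* Small accessor so the initialized object can be inspected. */
static const char *kset_name(const struct kset_sketch *k)
{
	return k->kobj.name;
}
```

The follow-on "initialization from incompatible pointer type" warning in the
quoted output is consistent with the compiler continuing past the unknown
member and matching the value against a different one.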

[no subject]

2007-05-16 Thread Bob Picco

[EMAIL PROTECTED]
Bcc: 
Subject: Re: 2.6.22-rc1-mm1 powerpc build breakage
Reply-To: 
In-Reply-To: <[EMAIL PROTECTED]>

/usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: error: unknown field `subsys' specified in initializer
/usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: warning: initialization from incompatible pointer type
make[4]: *** [drivers/pci/hotplug/rpadlpar_sysfs.o] Error 1
make[3]: *** [drivers/pci/hotplug] Error 2
make[2]: *** [drivers/pci] Error 2
make[1]: *** [drivers] Error 2
make: *** [_all] Error 2


#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.22-rc1-mm1
# Wed May 16 06:51:38 2007
#
CONFIG_PPC64=y
CONFIG_64BIT=y
CONFIG_PPC_MERGE=y
CONFIG_MMU=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_IRQ_PER_CPU=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_ARCH_HAS_ILOG2_U32=y
CONFIG_ARCH_HAS_ILOG2_U64=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_PPC=y
CONFIG_EARLY_PRINTK=y
CONFIG_COMPAT=y
CONFIG_SYSVIPC_COMPAT=y
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_PPC_OF=y
CONFIG_PPC_UDBG_16550=y
CONFIG_GENERIC_TBSYNC=y
CONFIG_AUDIT_ARCH=y
CONFIG_GENERIC_BUG=y
# CONFIG_DEFAULT_UIMAGE is not set

#
# Processor support
#
# CONFIG_POWER4_ONLY is not set
CONFIG_POWER3=y
CONFIG_POWER4=y
CONFIG_PPC_FPU=y
# CONFIG_PPC_DCR_NATIVE is not set
# CONFIG_PPC_DCR_MMIO is not set
# CONFIG_PPC_OF_PLATFORM_PCI is not set
# CONFIG_ALTIVEC is not set
CONFIG_PPC_STD_MMU=y
CONFIG_PPC_MM_SLICES=y
CONFIG_VIRT_CPU_ACCOUNTING=y
CONFIG_SMP=y
CONFIG_NR_CPUS=32
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32

#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SWAP_PREFETCH=y
CONFIG_SYSVIPC=y
# CONFIG_IPC_NS is not set
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
# CONFIG_TASKSTATS is not set
# CONFIG_UTS_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CPUSETS is not set
CONFIG_SYSFS_DEPRECATED=y
# CONFIG_RELAY is not set
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_TIMERFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLAB=y
# CONFIG_SLUB is not set
# CONFIG_SLOB is not set
CONFIG_PROC_SMAPS=y
CONFIG_PROC_CLEAR_REFS=y
CONFIG_PROC_PAGEMAP=y
CONFIG_PROC_KPAGEMAP=y
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
# CONFIG_MODULES is not set
CONFIG_BLOCK=y
# CONFIG_BLK_DEV_IO_TRACE is not set

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
CONFIG_DEFAULT_AS=y
# CONFIG_DEFAULT_DEADLINE is not set
# CONFIG_DEFAULT_CFQ is not set
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="anticipatory"

#
# Platform support
#
CONFIG_PPC_MULTIPLATFORM=y
# CONFIG_EMBEDDED6xx is not set
# CONFIG_APUS is not set
CONFIG_PPC_PSERIES=y
# CONFIG_PPC_SPLPAR is not set
CONFIG_EEH=y
CONFIG_SCANLOG=y
CONFIG_LPARCFG=y
# CONFIG_PPC_ISERIES is not set
# CONFIG_PPC_MPC52xx is not set
# CONFIG_PPC_MPC5200 is not set
# CONFIG_PPC_PMAC is not set
# CONFIG_PPC_MAPLE is not set
CONFIG_PPC_PASEMI=y

#
# PA Semi PWRficient options
#
# CONFIG_PPC_PASEMI_IOMMU is not set
CONFIG_PPC_PASEMI_MDIO=y
# CONFIG_PPC_CELLEB is not set
# CONFIG_PPC_PS3 is not set
# CONFIG_PPC_CELL is not set
# CONFIG_PPC_CELL_NATIVE is not set
# CONFIG_PPC_IBM_CELL_BLADE is not set
# CONFIG_PQ2ADS is not set
CONFIG_PPC_NATIVE=y
# CONFIG_UDBG_RTAS_CONSOLE is not set
CONFIG_XICS=y
CONFIG_MPIC=y
# CONFIG_MPIC_WEIRD is not set
CONFIG_PPC_I8259=y
# CONFIG_U3_DART is not set
CONFIG_PPC_RTAS=y
CONFIG_RTAS_ERROR_LOGGING=y
CONFIG_RTAS_PROC=y
CONFIG_RTAS_FLASH=y
# CONFIG_MMIO_NVRAM is not set
CONFIG_IBMVIO=y
CONFIG_IBMEBUS=y
# CONFIG_PPC_MPC106 is not set
# CONFIG_PPC_970_NAP is not set
# CONFIG_PPC_INDIRECT_IO is not set
# CONFIG_GENERIC_IOMAP is not set
# CONFIG_CPU_FREQ is not set
# CONFIG_CPM2 is not set

#
# Kernel options
#
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_BKL=y
CONFIG_BINFMT_ELF=y
# CONFIG_BINFMT_MISC is not set
CONFIG_FORCE_MAX_ZONEORDER=13
# CONFIG_IOMMU_VMERGE is not set
# CONFIG_HOTPLUG_CPU is not set
CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y
# CONFIG_KEXEC is not set
# CONFIG_CRASH_DUMP is not set
# CONFIG_IRQ_ALL_CPUS is not set
CONFIG_NUMA=y
CONFIG_NODES_SHIFT=4

Re: your mail

2007-05-16 Thread Bob Picco

Olof Johansson wrote:   [Wed May 16 2007, 01:11:00PM EDT]
> On Wed, May 16, 2007 at 11:43:41AM -0500, Linas Vepstas wrote:
> > On Wed, May 16, 2007 at 09:30:46AM -0400, Bob Picco wrote:
> > > Subject: Re: 2.6.22-rc1-mm1 powerpc build breakage
> > > 
> > > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: 
> > > error: unknown field `subsys' specified in initializer
> > > /usr/src/linux-2.6.22-rc1-mm1/drivers/pci/hotplug/rpadlpar_sysfs.c:132: 
> > > warning: initialization from incompatible pointer type
> > > make[4]: *** [drivers/pci/hotplug/rpadlpar_sysfs.o] Error 1
> > > make[3]: *** [drivers/pci/hotplug] Error 2
> > > make[2]: *** [drivers/pci] Error 2
> > > make[1]: *** [drivers] Error 2
> > > make: *** [_all] Error 2
> > 
> > John Rose is working to fix this real soon now.
> 
> Do you mean the fix Al Viro posted yesterday?
> 
> http://patchwork.ozlabs.org/linuxppc/patch?id=11177
> 
> 
> -Olof
Missed that patch.

thanks,

bob
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: hpet on hp proliant dl380 g5 4 cores

2007-05-10 Thread Bob Picco

Hi:
Raz Ben-Jehuda(caro) wrote: [Wed May 09 2007, 02:02:24PM EDT]
> Hello Robert.
> I have noticed that you are the writer of the hpet driver in linux.
> I have been running the test tool provided in
> linux/Documentation/hpet.txt  on an hp dl380. It runs for a few seconds
> and then the hpet timer clock stopped interrupting ( I looked at
> /proc/interrupts). It happens faster when i increased the frequency
> (starting from 1000 to 16000) . But it always happens.
> 
> Do you know why ?
I don't have a clue. None of my DLs has hpet enabled in the BIOS.

bob
> 
> Thank you
> -- 
> Raz

Re: CFS and suspend2: hang in atomic copy

2007-04-19 Thread Bob Picco

Ingo Molnar wrote:  [Thu Apr 19 2007, 02:29:36AM EDT]
> 
> * Bob Picco <[EMAIL PROTECTED]> wrote:
> 
> > I had hoped to collect more data with CFS V2. It crashes in 
> > scale_nice_down for s2ram when attempting to disable_nonboot_cpus. So 
> > part of traceback looks like (typed by hand with obvious omissions):
> > 
> > scale_nice_down
> > update_stats_wait_end - not shown in traceback because inlined
> > pick_next_task_fair
> > migration_call
> > task_rq_lock
> > notifier_call_chain
> > _cpu_down
> > disable_nonboot_cpus
> 
> ok, this looks similar to the jpeg Christian did. Does the patch below 
> fix the crash for you?
> 
>   Ingo
> 
> ---
>  kernel/sched.c |2 ++
>  1 file changed, 2 insertions(+)
> 
> Index: linux/kernel/sched.c
> ===
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -4425,6 +4425,8 @@ static void migrate_dead_tasks(unsigned 
>   struct task_struct *next;
>  
>   for (;;) {
> + if (!rq->nr_running)
> + break;
>   next = pick_next_task(rq, rq->curr);
>   if (!next)
>   break;
This patch repairs the s2ram issue.

Thanks.

bob
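
A minimal user-space sketch of the guard added by the patch above. It assumes, as the traceback suggests but does not prove, that the task picker cannot cope with an empty runqueue. The `toy_*` names are hypothetical stand-ins and not the kernel's definitions; in particular the toy picker also dequeues the task, which the real pick_next_task() does not do.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the scheduler types; NOT the kernel's definitions. */
struct toy_task { int id; };

struct toy_rq {
    unsigned int nr_running;     /* number of runnable tasks queued */
    struct toy_task *tasks[8];   /* simple LIFO of runnable tasks */
};

/* Hypothetical picker that assumes the queue is non-empty. */
static struct toy_task *toy_pick_next_task(struct toy_rq *rq)
{
    assert(rq->nr_running > 0);  /* stands in for the crash on an empty queue */
    return rq->tasks[--rq->nr_running];
}

/* Drain the queue and return how many tasks were handled. */
static unsigned int toy_migrate_dead_tasks(struct toy_rq *rq)
{
    unsigned int migrated = 0;

    for (;;) {
        struct toy_task *next;

        if (!rq->nr_running)     /* the guard: stop before picking from an empty queue */
            break;
        next = toy_pick_next_task(rq);
        if (!next)
            break;
        migrated++;
    }
    return migrated;
}
```

Calling toy_migrate_dead_tasks() on a queue with nr_running == 0 returns 0 without ever reaching the picker, which is the behaviour the two added lines aim for.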

Re: CFS and suspend2: hang in atomic copy

2007-04-18 Thread Bob Picco

Ingo Molnar wrote:  [Wed Apr 18 2007, 06:02:28PM EDT]
> 
> * Christian Hesse <[EMAIL PROTECTED]> wrote:
> 
> > > although probably your suspend2 problem is still not fixed, it's 
> > > worth a try nevertheless. Which suspend2 patch did you apply, and 
> > > was it against -rc6 or -rc7?
> > 
> > You are right again. ;-)
> > 
> > Linux 2.6.21-rc7
> > Suspend2 2.2.9.11 (applies cleanly to -rc7)
> > CFS v3 (without any additional patches)
> > 
> > And it still hangs on suspend.
> 
> what's the easiest way for me to try suspend2? Apply the patch, reboot 
> into the kernel, then execute what command to suspend? (there's a 
> confusing mishmash of initiators of all the suspend variants. Can i drive 
> this by echoing to /sys/power/state?)
> 
>   Ingo
I had hoped to collect more data with CFS V2. It crashes in
scale_nice_down for s2ram when attempting to disable_nonboot_cpus.
So part of the traceback looks like (typed by hand with obvious omissions):

scale_nice_down
update_stats_wait_end - not shown in traceback because inlined
pick_next_task_fair
migration_call
task_rq_lock
notifier_call_chain
_cpu_down
disable_nonboot_cpus
...

This is standard -rc7 with V2 CFS applied. It could be a completely
unrelated issue. I'll attempt to debug further tomorrow.

bob

Re: [PATCH 1/4] x86_64: Switch to SPARSE_VIRTUAL

2007-04-04 Thread Bob Picco

Christoph Lameter wrote:[Mon Apr 02 2007, 05:28:30PM EDT]
> On Mon, 2 Apr 2007, Dave Hansen wrote:
> 
> > On Mon, 2007-04-02 at 13:30 -0700, Christoph Lameter wrote:
> > > On Mon, 2 Apr 2007, Dave Hansen wrote:
> > > > I completely agree, it looks like it should be faster.  The code
> > > > certainly has potential benefits.  But, to add this neato, apparently
> > > > more performant feature, we unfortunately have to add code.  Adding the
> > > > code has a cost: code maintenance.  This isn't a runtime cost, but it is
> > > > a real, honest to goodness tradeoff.
> > > 
> > > It's just the opposite. The vmemmap code is so efficient that we can 
> > > remove lots of other code and gobs of these alternate implementations.
> > 
> > We do want to make sure that there isn't anyone relying on these.  Are
> > you thinking of simple sparsemem vs. extreme vs. sparsemem vmemmap?  Or,
> > are you thinking of sparsemem vs. discontig?
> 
> I am thinking sparsemem default and then get rid of discontig, flatmem etc.
> On many platforms this will work. Flatmem for embedded could just be a 
> variation on sparse_virtual.
> 
> > Amen, brother.  I'd love to see DISCONTIG die, with sufficient testing,
> > of course.  Andi, do you have any ideas on how to get sparsemem out of
> > the 'experimental' phase?
> 
> Note that these arguments on DISCONTIG are flame bait for many SGIers. 
> We usually see this as an attack on DISCONTIG/VMEMMAP which is the 
> existing best performing implementation for page_to_pfn and vice 
> versa. Please let's stop the polarization. We want one consistent scheme 
> to manage memory everywhere. I do not care what it's called as long as it 
> covers all the bases and is not a glaring performance regression (like 
> SPARSEMEM so far).
Well, you must have forgotten about these two postings regarding
performance numbers:
http://marc.info/?l=linux-ia64&m=111990276501051&w=2
and
http://marc.info/?l=linux-kernel&m=116664638611634&w=2
.

I took your first patchset and ran some numbers on an amd64 machine with
two dual-core sockets and 4GB of memory. More iterations should be done,
perhaps with a larger number of tasks. The aim7 numbers are below.

bob

2.6.21-rc5+sparsemem
Benchmark                            Version  Machine  Run Date
AIM Multiuser Benchmark - Suite VII  "1.1"    rcc5     Apr  2 05:04:33 2007

Tasks  Jobs/Min  JTI  Real    CPU    Jobs/sec/task
1      13.8      100  421.3   2.2    0.2303
101    527.8     97   1113.8  111.5  0.0871
201    565.0     97   2070.6  222.7  0.0468
301    570.9     96   3068.7  334.7  0.0316
401    573.0     97   4072.7  445.6  0.0238
501    583.3     99   4998.5  558.6  0.0194
601    583.8     99   5991.1  672.9  0.0162

2.6.21-rc5+sparsemem+patchset
Benchmark                            Version  Machine  Run Date
AIM Multiuser Benchmark - Suite VII  "1.1"    vmem     Apr  4 02:22:24 2007

Tasks  Jobs/Min  JTI  Real    CPU    Jobs/sec/task
1      13.7      100  424.0   2.1    0.2288
101    500.3     97   1175.0  112.0  0.0826
201    554.2     97   2111.0  223.6  0.0460
301    578.5     97   3028.3  334.9  0.0320
401    586.2     97   3981.3  448.1  0.0244
501    584.2     99   4990.8  561.8  0.0194
601    584.4     98   5985.2  675.5  0.0162
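
The two runs can be compared with a little arithmetic. The sketch below assumes that the Jobs/sec/task column is Jobs/Min divided by 60 and by the task count, which matches the printed rows but is inferred from the numbers rather than taken from the AIM7 documentation. The function names are hypothetical.

```c
#include <assert.h>

/* Jobs/sec/task as the AIM7 columns appear to define it:
 * Jobs/Min / 60 / Tasks. */
static double jobs_per_sec_per_task(double jobs_per_min, unsigned int tasks)
{
    return jobs_per_min / 60.0 / (double)tasks;
}

/* Relative change of `after` versus `before`, in percent. */
static double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

With the Jobs/Min figures above, percent_change(527.8, 500.3) is about -5.2 at 101 tasks and percent_change(573.0, 586.2) is about +2.3 at 401 tasks, so the difference between the two kernels changes sign across the task range in this single run.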

> 
> > I have noticed before that sparsemem should be able to cover the flatmem
> > case if we make MAX_PHYSMEM_BITS == SECTION_SIZE_BITS and massage from
> > there.  
> 
> Right. But for embedded the memorymap base cannot be constant because 
> they may not be able to have a fixed address in memory. So memory map 
> needs to become a variable.
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [EMAIL PROTECTED]  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"[EMAIL PROTECTED]"> [EMAIL PROTECTED] </a>

Re: [PATCH 0/3] pfn_valid_within() HOLES_WITHIN_ZONES helper

2007-03-21 Thread Bob Picco

Andy Whitcroft wrote:  [Wed Mar 21 2007, 02:14:55PM EST]
> The thought of having a helper for the holes within zones code
> has come up on two different threads in the last couple of days.
> So I took the pfn_valid_within() patch I had developed for the
> linear reclaim series and pulled it forward to 2.6.21-rc4-mm1.
> I have split it into a three patch series to better align with the
> affected patch sets within -mm.
> 
> add-pfn_valid_within-helper-for-sub-MAX_ORDER-hole-detection --
>   adds the base helper and utilises it within the buddy allocator,
> 
> anti-fragmentation-switch-over-to-pfn_valid_within() -- changes
>   references within Mel Gorman's anti-fragmentation patch series, and
> 
> lumpy-move-to-using-pfn_valid_within() -- changes references with
>   my lumpy reclaim patch series.
> 
> -apw
Andy,

Thanks for doing this.

Acked-by: Bob Picco <[EMAIL PROTECTED]>
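
A minimal sketch of the kind of helper the series describes, written as a user-space toy. It assumes the helper collapses to a constant when zones cannot contain holes and falls back to a full validity check otherwise. The `toy_*` names, the switch macro, and the memory layout are hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical switch standing in for a "holes within zones" config option. */
#define TOY_HOLES_IN_ZONE 1

/* Toy memory map: 16 page frames, with pfns 4 and 5 forming a hole. */
static bool toy_pfn_valid(unsigned long pfn)
{
    return pfn < 16 && !(pfn >= 4 && pfn <= 5);
}

#if TOY_HOLES_IN_ZONE
#define toy_pfn_valid_within(pfn) toy_pfn_valid(pfn)
#else
#define toy_pfn_valid_within(pfn) (true)   /* check compiles away */
#endif

/* Count usable pages in the block [start, start + 2^order). */
static unsigned long toy_count_valid(unsigned long start, unsigned int order)
{
    unsigned long n = 0;

    for (unsigned long pfn = start; pfn < start + (1UL << order); pfn++) {
        if (!toy_pfn_valid_within(pfn))
            continue;                      /* skip the hole */
        n++;
    }
    return n;
}
```

Callers walking a block smaller than the maximum allocation order can then use one spelling everywhere, and only configurations that need the check pay for it.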

Re: [RFC][PATCH] split file and anonymous page queues #2

2007-03-20 Thread Bob Picco

Rik van Riel wrote: [Mon Mar 19 2007, 07:52:34PM EST]
> Split the anonymous and file backed pages out onto their own pageout
> queues.  This way we do not unnecessarily churn through lots of anonymous
> pages when we do not want to swap them out anyway.
> 
> This should (with additional tuning) be a great step forward in
> scalability, allowing Linux to run well on very large systems where
> scanning through the anonymous memory (on our way to the page cache
> memory we do want to evict) is slowing systems down significantly.
> 
> This patch has been stress tested and seems to work, but has not
> been fine tuned or benchmarked yet.  For now the swappiness parameter
> can be used to tweak swap aggressiveness up and down as desired, but
> in the long run we may want to simply measure IO cost of page cache
> and anonymous memory and auto-adjust.
> 
> We apply pressure to each set of pageout queues based on:
> - the size of each queue
> - the fraction of recently referenced pages in each queue,
>   not counting used-once file pages
> - swappiness (file IO is more efficient than swap IO)
> 
> Please take this patch for a spin and let me know what goes well
> and what goes wrong.
> 
> More info on the patch can be found on:
> 
> http://linux-mm.org/PageReplacementDesign
> 
> Signed-off-by: Rik van Riel <[EMAIL PROTECTED]>
> 
> Changelog:
> - Fix page_anon() to put all the file pages really on the
>   file list.
> - Fix get_scan_ratio() to return more stable numbers, by
>   properly keeping track of the scanned anon and file pages.
> 
> -- 
> Politics is the struggle between those who want to make their country
> the best in the world, and those who believe it already is.  Each group
> calls the other unpatriotic.

> --- linux-2.6.20.x86_64/fs/proc/proc_misc.c.vmsplit   2007-03-19 12:00:11.0 -0400
> +++ linux-2.6.20.x86_64/fs/proc/proc_misc.c   2007-03-19 12:00:23.0 -0400
> @@ -147,43 +147,47 @@ static int meminfo_read_proc(char *page,
>* Tagged format, for easy grepping and expansion.
>*/
>   len = sprintf(page,
> - "MemTotal: %8lu kB\n"
> - "MemFree:  %8lu kB\n"
> - "Buffers:  %8lu kB\n"
> - "Cached:   %8lu kB\n"
> - "SwapCached:   %8lu kB\n"
> - "Active:   %8lu kB\n"
> - "Inactive: %8lu kB\n"
> + "MemTotal:   %8lu kB\n"
> + "MemFree:%8lu kB\n"
> + "Buffers:%8lu kB\n"
> + "Cached: %8lu kB\n"
> + "SwapCached: %8lu kB\n"
> + "Active(anon):   %8lu kB\n"
> + "Inactive(anon): %8lu kB\n"
> + "Active(file):   %8lu kB\n"
> + "Inactive(file): %8lu kB\n"
>  #ifdef CONFIG_HIGHMEM
> - "HighTotal:%8lu kB\n"
> - "HighFree: %8lu kB\n"
> - "LowTotal: %8lu kB\n"
> - "LowFree:  %8lu kB\n"
> -#endif
> - "SwapTotal:%8lu kB\n"
> - "SwapFree: %8lu kB\n"
> - "Dirty:%8lu kB\n"
> - "Writeback:%8lu kB\n"
> - "AnonPages:%8lu kB\n"
> - "Mapped:   %8lu kB\n"
> - "Slab: %8lu kB\n"
> - "SReclaimable: %8lu kB\n"
> - "SUnreclaim:   %8lu kB\n"
> - "PageTables:   %8lu kB\n"
> - "NFS_Unstable: %8lu kB\n"
> - "Bounce:   %8lu kB\n"
> - "CommitLimit:  %8lu kB\n"
> - "Committed_AS: %8lu kB\n"
> - "VmallocTotal: %8lu kB\n"
> - "VmallocUsed:  %8lu kB\n"
> - "VmallocChunk: %8lu kB\n",
> + "HighTotal:  %8lu kB\n"
> + "HighFree:   %8lu kB\n"
> + "LowTotal:   %8lu kB\n"
> + "LowFree:%8lu kB\n"
> +#endif
> + "SwapTotal:  %8lu kB\n"
> + "SwapFree:   %8lu kB\n"
> + "Dirty:  %8lu kB\n"
> + "Writeback:  %8lu kB\n"
> + "AnonPages:  %8lu kB\n"
> + "Mapped: %8lu kB\n"
> + "Slab:   %8lu kB\n"
> + "SReclaimable:   %8lu kB\n"
> + "SUnreclaim: %8lu kB\n"
> + "PageTables: %8lu kB\n"
> + "NFS_Unstable:   %8lu kB\n"
> + "Bounce: %8lu kB\n"
> + "CommitLimit:%8lu kB\n"
> + "Committed_AS:   %8lu kB\n"
> + "VmallocTotal:   %8lu kB\n"
> + "VmallocUsed:%8lu kB\n"
> + "VmallocChunk:   %8lu kB\n",
>   K(i.totalram),
>   K(i.freeram),
>   K(i.bufferram),
>   K(cached),
>   K(total_swapcache_pages),
> - K(global_page_state(NR_ACTIVE)),
> - K(global_page_state(NR_INACTIVE)),
> + K(global_page_state(NR_ACTIVE_ANON)),
> + K(global_page_state(NR_INACTIVE_ANON)),
> + K(global_page_state(NR_ACTIVE_FILE)),
> + K(global_page_state(NR_INACTIVE_FILE)),
>  #ifdef CONFIG_HIGHMEM
>   K(i.totalhigh),
>   K(i.freehigh),
> ---

Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread Bob Picco

Andi Kleen wrote:   [Tue Feb 13 2007, 01:18:45PM EST]
> 
> > I wasn't suggesting having NULL pointers for pgdats, if that's what you
> > mean. 
> 
> That is what started the original thread at least. Can happen on some
> ia64 platforms.
I don't believe there is a NULL pgdat. The code for memoryless nodes in
ia64 discontig.c allocates the memoryless node's pgdat from the best
memory node candidate. If there is a NULL pgdat, then it's a bug. Instead,
a memoryless node simply has no present pages.

I thought the bug was that the process wanted to bind to just one
memoryless node and MPOL_BIND didn't handle that correctly by returning
an error to the process.

bob
> 
> > Just nodes with no memory in them, the pgdat would still be there. 
> > pgdat = struct node, except everything's badly named.
> 
> Ok those can happen even on x86-64, mostly because it's possible
> to fill up a node early during boot up with bootmem and then
> it's effectively empty.
> 
> [there is even still a open bug when this happens on node 0]
>  
> Handling out of memory here of course has to be always done.
> 
> Just NULL pointers in core data structures are evil. But I'm glad we 
> agree here.
> 
> Now if it's better to set up an empty node or use a nearby node
> for a memory less cpu can be further discussed. I still think
> I lean towards the latter.
> 
> -Andi
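
The behaviour expected of a bind request in this thread can be sketched as a simple mask check. This is a user-space toy under stated assumptions: nodes are bits in a hypothetical `toy_nodemask_t`, and a bind request is rejected only when none of the requested nodes has memory. Whether the real policy code should reject on "any" or "all" memoryless nodes is not settled by the messages above.

```c
#include <assert.h>
#include <errno.h>

/* One bit per node; a hypothetical stand-in for a real node mask type. */
typedef unsigned int toy_nodemask_t;

/* Validate a bind request against the set of nodes that have memory.
 * Returns 0 if at least one requested node has memory, -EINVAL otherwise. */
static int toy_check_bind(toy_nodemask_t requested,
                          toy_nodemask_t nodes_with_memory)
{
    if ((requested & nodes_with_memory) == 0)
        return -EINVAL;   /* tell the caller instead of allocating elsewhere */
    return 0;
}
```

With node 0 holding memory and node 1 being CPU-only, toy_check_bind(0x2, 0x1) reports -EINVAL, while a request that includes node 0 is accepted.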

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-08 Thread Bob Picco

Hiroyuki KAMEZAWA wrote:[Wed Feb 07 2007, 03:36:47AM EST]
> On Wed, 7 Feb 2007 00:04:41 -0800 (PST)
> Christoph Lameter <[EMAIL PROTECTED]> wrote:
> 
> > On Wed, 7 Feb 2007, KAMEZAWA Hiroyuki wrote:
> > 
> > > > Hmmm... Remove the node from the node_online_map instead?
> > > > 
> > > Changing the definition of node_online_map is harmful (there are
> > > cpu-only nodes).
> > > How about adding a nodemask for nodes equipped with memory?
> > 
> > Ok that is better but...
> > 
> > Would it be possible to attach the cpus to the 
> > next nodes with memory and mark the node offline? That way we could avoid 
> > another mask that we constantly have to check?
> > 
> Added ia64 list to CC.
> I know the ia64 kernel did what you say in the old days (I know the
> RHEL4/2.6.9 kernel does it).
> Someone changed it and created cpu-only nodes for some purpose; I don't
> know why.
That was me. It will probably be later today or Friday before I've had
time to review the code.  For reference, look for the string memory_less in
arch/ia64/mm/discontig.c.

The short story is that HP ships NUMA boxes with interleaved memory only by
default, which is represented by a single memory-only node. Originally all
the CPU nodes were assigned to the memory node. The code was very
complicated and looked incorrect to me. Subsequently, and what we have now,
the CPU-only nodes are revealed and the memory-only node too. I do believe
that CPU-only nodes should be possible, but now there seems to be a new issue.

bob
> 
> 
> > Or fix the location where the error occurred to be able to tolerate a node
> > with no zones?
> > 
> Hmm,
> In this case, MPOL_BIND, the user requests to allocate memory from the
> specified nodes.
> I think it's better to tell him "you can't do that" than to silently
> allocate memory from other places.
> 
> -Kame

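A minimal sketch of the behaviour discussed in this thread, assuming a separate
mask of nodes that actually have memory (so that CPU-only nodes can stay in the
online map). The type and function names are hypothetical and are not the
kernel's nodemask API; the point is only that a bind request covering no node
with memory is rejected rather than silently satisfied elsewhere.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for a nodemask: bit n set means node n. */
typedef uint64_t nodemask_sketch_t;

/*
 * Return 0 if at least one requested node has memory, -EINVAL otherwise.
 * nodes_with_memory is an assumed extra mask, kept apart from the online
 * map so that memoryless (CPU-only) nodes remain online.
 */
static int bind_policy_check(nodemask_sketch_t requested,
                             nodemask_sketch_t nodes_with_memory)
{
    if ((requested & nodes_with_memory) == 0)
        return -EINVAL;   /* tell the caller "you can't do that" */
    return 0;
}
```

With node 0 as the only node holding memory, a request for node 1 alone fails,
while a request for nodes 0 and 1 is accepted.
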
Re: 2.6.20-rc6-mm[2-3] ACPI issues

2007-02-02 Thread Bob Picco

Alexey Starikovskiy wrote:  [Fri Feb 02 2007, 09:20:35AM EST]
> Bob Picco wrote:
> >BTW, this isn't specific to rx2600. Lee Schermerhorn reported same -mm3 
> >problem on rx8620. Stephane Eranian reported the -mm2 problem mentioned 
> >above on rx2620.
> >
> >The debug information you requested is below. 
> >
> >thanks,
> >
> >bob
> Bob, thanks for debug information.
> Could you please try following patch?
> 
> Thanks,
you're welcome,

It boots rx2600 and NUMA simulator successfully. The NUMA simulator has my M$
SRAT 1.0 hack applied.

thanks,

bob
>Alex.

> Copy space_id of GAS structure to newly created GAS.
> 
> From: Alexey Starikovskiy <[EMAIL PROTECTED]>
> 
> 
> ---
> 
>  drivers/acpi/tables/tbfadt.c |5 +
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/acpi/tables/tbfadt.c b/drivers/acpi/tables/tbfadt.c
> index 30350dd..807c711 100644
> --- a/drivers/acpi/tables/tbfadt.c
> +++ b/drivers/acpi/tables/tbfadt.c
> @@ -333,6 +333,8 @@ static void acpi_tb_convert_fadt(void)
>pm1_register_length,
>(acpi_gbl_FADT.xpm1a_event_block.address +
> pm1_register_length));
> + /* Don't forget to copy space_id of the GAS */
> + acpi_gbl_xpm1a_enable.space_id = acpi_gbl_FADT.xpm1a_event_block.space_id;
>  
>   /* The PM1B register block is optional, ignore if not present */
>  
> @@ -341,6 +343,9 @@ static void acpi_tb_convert_fadt(void)
>pm1_register_length,
>(acpi_gbl_FADT.xpm1b_event_block.
> address + pm1_register_length));
> + /* Don't forget to copy space_id of the GAS */
> + acpi_gbl_xpm1b_enable.space_id = acpi_gbl_FADT.xpm1a_event_block.space_id;
> +
>   }
>  }
>  


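A rough sketch of why the space_id copy in the patch above matters, under
stated assumptions: the struct and both helpers below are simplified,
hypothetical stand-ins (not the ACPICA definitions), and the initialiser is
assumed to default to system I/O space. If the event block lives in system
memory, as it may on ia64, the derived enable block has to inherit that
address space explicitly.

```c
#include <stdint.h>

/* Address space IDs as numbered by the ACPI specification. */
#define SPACE_SYSTEM_MEMORY 0
#define SPACE_SYSTEM_IO     1

/* Hypothetical, simplified stand-in for a Generic Address Structure. */
struct gas_sketch {
    uint8_t  space_id;
    uint8_t  bit_width;
    uint64_t address;
};

/* Assumed helper: fills in width and address, defaulting to system I/O. */
static void gas_init_sketch(struct gas_sketch *gas, uint8_t bit_width,
                            uint64_t address)
{
    gas->space_id = SPACE_SYSTEM_IO;
    gas->bit_width = bit_width;
    gas->address = address;
}

/* The enable block is the second half of the event register block. */
static struct gas_sketch make_enable_block(const struct gas_sketch *event,
                                           uint8_t reg_len_bytes)
{
    struct gas_sketch enable;

    gas_init_sketch(&enable, (uint8_t)(reg_len_bytes * 8),
                    event->address + reg_len_bytes);
    /* Don't forget to copy the address space of the source block. */
    enable.space_id = event->space_id;
    return enable;
}
```
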
Re: 2.6.20-rc6-mm[2-3] ACPI issues

2007-02-01 Thread Bob Picco

Len Brown wrote:[Thu Feb 01 2007, 01:55:01AM EST]
> On Wednesday 31 January 2007 15:25, Bob Picco wrote:
> > Hi Len,
> > 
> > On 2.6.20-rc6-mm2 my rx2600 wouldn't boot unless I reverted all changes
> > to drivers/acpi/tables.c.
> 
> 2.6.20-rc6-mm2 git-acpi.patch contained only the acpi_table_parse()
> fix for bugzilla-7465.  The version of the patch that made -mm2 had a flaw
> where if the HPET were not configured, it would BUG_ON() due to a NULL
> handler it didn't expect -- and this generally happened before VGA was
> enabled.
> That bug got fixed, and also bugzilla-7465 is not in -mm3 -- which explains
> why it went away.
> 
> > Nearly all major early boot detected ACPI tables 
> > weren't discovered.
> 
> That part I can't explain.
> 
> > I never had time to resolve because 2.6.20-rc6-mm3 
> > showed up. The -mm2 problem appears corrected but the machine now crashes in
> > acpi_init for -mm3. 2.6.20-rc6 and 2.6.20-rc6-mm3 boot logs are included
> > at end of email.
> > 
> > For 2.6.20-rc6-mm3 the HP internal simulator for a NUMA machine is
> > getting a preposterous pxm value and subsequently MCAs in pxm_to_node
> > because of large pxm. It seems like table parsing is being done
> > incorrectly.
> > 
> > Nope this is the issue:
> > 
> > Index: linux-2.6.20-rc6-mm3/include/acpi/actbl1.h
> > ===
> > --- linux-2.6.20-rc6-mm3.orig/include/acpi/actbl1.h 2007-01-30 
> > 09:27:44.0 -0500
> > +++ linux-2.6.20-rc6-mm3/include/acpi/actbl1.h  2007-01-31 
> > 14:41:32.0 -0500
> > @@ -654,8 +654,8 @@ struct acpi_srat_cpu_affinity {
> >  
> >  struct acpi_srat_mem_affinity {
> > struct acpi_subtable_header header;
> > -   u32 proximity_domain;
> > -   u16 reserved;   /* Reserved, must be zero */
> > +   u8  proximity_domain;
> > +   u8  reserved[5];/* Reserved, must be zero */
> > u64 base_address;
> > u64 length;
> > u32 memory_type;/* See acpi_address_range_id */
> > Index: linux-2.6.20-rc6-mm3/arch/ia64/kernel/acpi.c
> > ===
> > --- linux-2.6.20-rc6-mm3.orig/arch/ia64/kernel/acpi.c   2007-01-30 
> > 13:55:08.0 -0500
> > +++ linux-2.6.20-rc6-mm3/arch/ia64/kernel/acpi.c2007-01-31 
> > 14:49:26.0 -0500
> > @@ -423,7 +423,7 @@ int get_memory_proximity_domain(struct a
> >  
> > pxm = ma->proximity_domain;
> > if (ia64_platform_is("sn2"))
> > -   pxm += ma->reserved << 8;
> > +   pxm += ma->reserved[0] << 8;
> >  
> > return pxm;
> >  }
> > 
> > I doubt you'll want to apply this patch. It appears HP firmware has some
> > of the reserved field not initialized to zero. This results in the huge
> > pxm. Was the pxm size expanded with a recent ACPI spec revision? 
> 
> Yep.
> The original code was programmed to the Microsoft SRAT spec -- which
> identifies itself as version 1.  The new code is talking to ACPI 3.0 SRAT spec
> which identifies itself as version 2.
> 
> In the SRAT memory affinity structure, the difference is that the
> proximity_domain is now 4-bytes instead of 1.
> 
> We need to be checking for the SRAT revision and handling both revisions.
> 
> Might be safer to build w/o NUMA until we get the SRAT fixed.
> 
> > Well with this patch I can pursue the acpi_init panic on simulator.
> > 
> > rx2600 (2 CPU MP) and NUMA simulator (1 node and 4 cpus)  boot successfully
> > on 2.6.20-rc6.
> > 
> > bob
> > 
> > 
> > Linux version 2.6.20-rc6 ([EMAIL PROTECTED]) (gcc version 3.4.1) #1 SMP Mon 
> > Jan 29 14:40:17 EST 2007
> > EFI v1.10 by HP: SALsystab=0x3fb38000 ACPI 2.0=0x3fb2e000 SMBIOS=0x3fb3a000 
> > HCDP=0x3fb2c000
> > PCDP: v0 at 0x3fb2c000
> > Early serial console at MMIO 0xf803 (options '9600n8')
> > ACPI: RSDP (v002 HP) @ 
> > 0x3fb2e000
> > ACPI: XSDT (v001 HP   rx2600 0x HP 0x) @ 
> > 0x3fb2e02c
> > ACPI: FADT (v003 HP   rx2600 0x HP 0x) @ 
> > 0x3fb369e0
> > ACPI: SPCR (v001 HP   rx2600 0x HP 0x) @ 
> > 0x3fb36b18
> > ACPI: DBGP (v001 HP   rx2600 0x HP 0x) @ 
> > 0x3fb36b68
> > ACPI: MADT (v001 HP   rx2600 0x HP 0x) @ 
> > 0x3fb36c28
> > ACPI: SPMI (v004 HP   rx2600 0x

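A minimal sketch of the revision check suggested in the reply above, assuming
the table revision is available to the caller. The struct is a hypothetical
view of the six bytes that follow the subtable header, not the real
acpi_srat_mem_affinity layout from actbl1.h: a revision 1 table carries a
1-byte domain followed by reserved bytes that firmware may leave non-zero,
while a revision 2 table uses all four bytes.

```c
#include <stdint.h>

/* Hypothetical view of the 6 bytes after the subtable header. */
struct srat_mem_affinity_sketch {
    uint8_t pxm_bytes[4];   /* rev 1: byte 0 is the domain, 1..3 reserved */
    uint8_t reserved[2];
};

/*
 * Decode the proximity domain according to the SRAT revision.
 * ACPI tables are little-endian, so the value is assembled byte by byte.
 */
static uint32_t srat_mem_pxm(const struct srat_mem_affinity_sketch *ma,
                             uint8_t srat_revision)
{
    uint32_t pxm = ma->pxm_bytes[0];

    if (srat_revision >= 2) {
        pxm |= (uint32_t)ma->pxm_bytes[1] << 8;
        pxm |= (uint32_t)ma->pxm_bytes[2] << 16;
        pxm |= (uint32_t)ma->pxm_bytes[3] << 24;
    }
    return pxm;   /* rev 1: reserved bytes are ignored, garbage or not */
}
```

Reading all four bytes from a revision 1 table whose reserved bytes are not
zero is what would produce the preposterous pxm described in the report.
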
2.6.20-rc6-mm[2-3] ACPI issues

2007-01-31 Thread Bob Picco

Hi Len,

On 2.6.20-rc6-mm2 my rx2600 wouldn't boot unless I reverted all changes
to drivers/acpi/tables.c. Nearly all major early boot detected ACPI tables
weren't discovered.  I never had time to resolve because 2.6.20-rc6-mm3
showed up. The -mm2 problem appears corrected but the machine now crashes in
acpi_init for -mm3. 2.6.20-rc6 and 2.6.20-rc6-mm3 boot logs are included
at end of email.

For 2.6.20-rc6-mm3 the HP internal simulator for a NUMA machine is
getting a preposterous pxm value and subsequently MCAs in pxm_to_node
because of large pxm. It seems like table parsing is being done
incorrectly.

Nope this is the issue:

Index: linux-2.6.20-rc6-mm3/include/acpi/actbl1.h
===
--- linux-2.6.20-rc6-mm3.orig/include/acpi/actbl1.h 2007-01-30 
09:27:44.0 -0500
+++ linux-2.6.20-rc6-mm3/include/acpi/actbl1.h  2007-01-31 14:41:32.0 
-0500
@@ -654,8 +654,8 @@ struct acpi_srat_cpu_affinity {
 
 struct acpi_srat_mem_affinity {
struct acpi_subtable_header header;
-   u32 proximity_domain;
-   u16 reserved;   /* Reserved, must be zero */
+   u8  proximity_domain;
+   u8  reserved[5];/* Reserved, must be zero */
u64 base_address;
u64 length;
u32 memory_type;/* See acpi_address_range_id */
Index: linux-2.6.20-rc6-mm3/arch/ia64/kernel/acpi.c
===
--- linux-2.6.20-rc6-mm3.orig/arch/ia64/kernel/acpi.c   2007-01-30 
13:55:08.0 -0500
+++ linux-2.6.20-rc6-mm3/arch/ia64/kernel/acpi.c2007-01-31 
14:49:26.0 -0500
@@ -423,7 +423,7 @@ int get_memory_proximity_domain(struct a
 
pxm = ma->proximity_domain;
if (ia64_platform_is("sn2"))
-   pxm += ma->reserved << 8;
+   pxm += ma->reserved[0] << 8;
 
return pxm;
 }

I doubt you'll want to apply this patch. It appears HP firmware has some
of the reserved field not initialized to zero. This results in the huge
pxm. Was the pxm size expanded with a recent ACPI spec revision? 

Well with this patch I can pursue the acpi_init panic on simulator.

rx2600 (2 CPU MP) and NUMA simulator (1 node and 4 cpus)  boot successfully
on 2.6.20-rc6.

bob


Linux version 2.6.20-rc6 ([EMAIL PROTECTED]) (gcc version 3.4.1) #1 SMP Mon Jan 
29 14:40:17 EST 2007
EFI v1.10 by HP: SALsystab=0x3fb38000 ACPI 2.0=0x3fb2e000 SMBIOS=0x3fb3a000 
HCDP=0x3fb2c000
PCDP: v0 at 0x3fb2c000
Early serial console at MMIO 0xf803 (options '9600n8')
ACPI: RSDP (v002 HP) @ 0x3fb2e000
ACPI: XSDT (v001 HP   rx2600 0x HP 0x) @ 0x3fb2e02c
ACPI: FADT (v003 HP   rx2600 0x HP 0x) @ 0x3fb369e0
ACPI: SPCR (v001 HP   rx2600 0x HP 0x) @ 0x3fb36b18
ACPI: DBGP (v001 HP   rx2600 0x HP 0x) @ 0x3fb36b68
ACPI: MADT (v001 HP   rx2600 0x HP 0x) @ 0x3fb36c28
ACPI: SPMI (v004 HP   rx2600 0x HP 0x) @ 0x3fb36ba0
ACPI: CPEP (v001 HP   rx2600 0x HP 0x) @ 0x3fb36bf0
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb33870
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb33a50
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb33da0
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb347c0
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb351e0
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb35c00
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb36620
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb36800
ACPI: SSDT (v001 HP   rx2600 0x0006 INTL 0x02012044) @ 
0x3fb368f0
ACPI: DSDT (v001 HP   rx2600 0x0007 INTL 0x02012044) @ 
0x
SAL 3.1: HP version 2.21
SAL Platform features: None
SAL: AP wakeup using external interrupt vector 0xff
No logical to physical processor mapping available
ACPI: Local APIC address c000fee0
GSI 36 (level, low) -> CPU 0 (0x) vector 48
2 CPUs available, 2 CPUs total
MCA related initialization done
Entering add_active_range(0, 1025, 4096) 0 entries of 12800 used
Entering add_active_range(0, 4825, 64889) 1 entries of 12800 used
Entering add_active_range(0, 65216, 65227) 2 entries of 12800 used
Entering add_active_range(0, 16842752, 17038305) 3 entries of 12800 used
Entering add_active_range(0, 17038307, 17038312) 4 entries of 12800 used
Entering add_active_range(0, 17038313, 17039193) 5 entries of 12800 used
Entering add_active_range(0, 17039209, 17039236) 6 entries of 12800 used
Entering add_active_range(0, 17039264, 17039343) 7 entries of 12800 used
Zone PFN ranges:
  DMA  1025 ->   262144
  Normal 262144 -> 17039360

[PATCH] clean up sparsemem memory_present calls for ia64 and x86_64

2007-01-17 Thread Bob Picco


Eliminate arch specific memory_present calls for ia64 NUMA and x86_64 NUMA by
utilizing sparse_memory_present_with_active_regions. It was boot tested
for both arches.
 
Acked-by: Mel Gorman <[EMAIL PROTECTED]>
Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 arch/ia64/mm/discontig.c |   36 +++-
 arch/x86_64/mm/numa.c|   17 ++---
 2 files changed, 5 insertions(+), 48 deletions(-)

Index: linux-2.6.20-rc4/arch/ia64/mm/discontig.c
===
--- linux-2.6.20-rc4.orig/arch/ia64/mm/discontig.c  2007-01-11 
12:11:54.0 -0500
+++ linux-2.6.20-rc4/arch/ia64/mm/discontig.c   2007-01-11 12:12:38.0 
-0500
@@ -412,37 +412,6 @@ static void __init memory_less_nodes(voi
return;
 }
 
-#ifdef CONFIG_SPARSEMEM
-/**
- * register_sparse_mem - notify SPARSEMEM that this memory range exists.
- * @start: physical start of range
- * @end: physical end of range
- * @arg: unused
- *
- * Simply calls SPARSEMEM to register memory section(s).
- */
-static int __init register_sparse_mem(unsigned long start, unsigned long end,
-   void *arg)
-{
-   int nid;
-
-   start = __pa(start) >> PAGE_SHIFT;
-   end = __pa(end) >> PAGE_SHIFT;
-   nid = early_pfn_to_nid(start);
-   memory_present(nid, start, end);
-
-   return 0;
-}
-
-static void __init arch_sparse_init(void)
-{
-   efi_memmap_walk(register_sparse_mem, NULL);
-   sparse_init();
-}
-#else
-#define arch_sparse_init() do {} while (0)
-#endif
-
 /**
  * find_memory - walk the EFI memory map and setup the bootmem allocator
  *
@@ -688,10 +657,11 @@ void __init paging_init(void)
 
max_dma = virt_to_phys((void *) MAX_DMA_ADDRESS) >> PAGE_SHIFT;
 
-   arch_sparse_init();
-
efi_memmap_walk(filter_rsvd_memory, count_node_pages);
 
+   sparse_memory_present_with_active_regions(MAX_NUMNODES);
+   sparse_init();
+
 #ifdef CONFIG_VIRTUAL_MEM_MAP
vmalloc_end -= PAGE_ALIGN(ALIGN(max_low_pfn, MAX_ORDER_NR_PAGES) *
sizeof(struct page));
Index: linux-2.6.20-rc4/arch/x86_64/mm/numa.c
===
--- linux-2.6.20-rc4.orig/arch/x86_64/mm/numa.c 2007-01-11 12:11:08.0 
-0500
+++ linux-2.6.20-rc4/arch/x86_64/mm/numa.c  2007-01-11 12:11:57.0 
-0500
@@ -321,20 +321,6 @@ unsigned long __init numa_free_all_bootm
return pages;
 } 
 
-#ifdef CONFIG_SPARSEMEM
-static void __init arch_sparse_init(void)
-{
-   int i;
-
-   for_each_online_node(i)
-   memory_present(i, node_start_pfn(i), node_end_pfn(i));
-
-   sparse_init();
-}
-#else
-#define arch_sparse_init() do {} while (0)
-#endif
-
 void __init paging_init(void)
 { 
int i;
@@ -344,7 +330,8 @@ void __init paging_init(void)
max_zone_pfns[ZONE_DMA32] = MAX_DMA32_PFN;
max_zone_pfns[ZONE_NORMAL] = end_pfn;
 
-   arch_sparse_init();
+   sparse_memory_present_with_active_regions(MAX_NUMNODES);
+   sparse_init();
 
for_each_online_node(i) {
setup_node_zones(i); 

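A simplified sketch of the idea behind the patch above: rather than each
architecture walking its own memory map, one generic loop over the already
registered active ranges marks the sparsemem sections present. All names, the
section size, and the fixed-size table are assumptions made for illustration;
this is not the kernel implementation.

```c
#include <stddef.h>

#define PAGES_PER_SECTION_SHIFT 4      /* assumed: 16 pages per section */
#define NR_SECTIONS             64
#define ALL_NODES               (-1)   /* stand-in for MAX_NUMNODES */

struct active_range_sketch {
    int nid;
    unsigned long start_pfn;
    unsigned long end_pfn;             /* exclusive */
};

static unsigned char section_present[NR_SECTIONS];

/* Mark every section touched by [start_pfn, end_pfn) as present. */
static void memory_present_sketch(unsigned long start_pfn,
                                  unsigned long end_pfn)
{
    unsigned long sec;

    if (start_pfn >= end_pfn)
        return;
    for (sec = start_pfn >> PAGES_PER_SECTION_SHIFT;
         sec <= (end_pfn - 1) >> PAGES_PER_SECTION_SHIFT && sec < NR_SECTIONS;
         sec++)
        section_present[sec] = 1;
}

/* Generic replacement for the per-arch loops: filter by node or take all. */
static void present_with_active_ranges(const struct active_range_sketch *ranges,
                                       size_t count, int nid)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (nid == ALL_NODES || ranges[i].nid == nid)
            memory_present_sketch(ranges[i].start_pfn, ranges[i].end_pfn);
}
```
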
[PATCH] CPUSET related breakage of sys_mbind

2007-01-15 Thread Bob Picco


current->mems_allowed is defined only for CONFIG_CPUSETS. This broke the
!CPUSETS build. I compile- and link-tested both variants.

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 include/linux/cpuset.h |6 ++
 mm/mempolicy.c |2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

Index: linux-2.6.20-rc4-mm1/mm/mempolicy.c
===
--- linux-2.6.20-rc4-mm1.orig/mm/mempolicy.c    2007-01-15 09:21:58.0 -0500
+++ linux-2.6.20-rc4-mm1/mm/mempolicy.c 2007-01-15 17:51:15.0 -0500
@@ -882,9 +882,9 @@ asmlinkage long sys_mbind(unsigned long 
int err;
 
err = get_nodes(&nodes, nmask, maxnode);
-   nodes_and(nodes, nodes, current->mems_allowed);
if (err)
return err;
+   cpuset_nodes_allowed(&nodes);
return do_mbind(start, len, mode, &nodes, flags);
 }
 
Index: linux-2.6.20-rc4-mm1/include/linux/cpuset.h
===
--- linux-2.6.20-rc4-mm1.orig/include/linux/cpuset.h    2007-01-15 09:21:32.0 -0500
+++ linux-2.6.20-rc4-mm1/include/linux/cpuset.h 2007-01-15 14:01:30.0 -0500
@@ -75,6 +75,11 @@ static inline int cpuset_do_slab_mem_spr
 
 extern void cpuset_track_online_nodes(void);
 
+static inline void cpuset_nodes_allowed(nodemask_t *nodes)
+{
+   nodes_and(*nodes, *nodes, current->mems_allowed);
+}
+
 #else /* !CONFIG_CPUSETS */
 
 static inline int cpuset_init_early(void) { return 0; }
@@ -145,6 +150,7 @@ static inline int cpuset_do_slab_mem_spr
 }
 
 static inline void cpuset_track_online_nodes(void) {}
+static inline void cpuset_nodes_allowed(nodemask_t *nodes) {}
 
 #endif /* !CONFIG_CPUSETS */
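
The patch above relies on a common pattern: a helper that touches a
config-dependent field is defined twice, once for real and once as an empty
stub, so the caller compiles unchanged either way. The following is only a
userspace sketch of that idea. DEMO_CPUSETS, struct task, current_task and
the demo_* functions are hypothetical stand-ins, not kernel definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's nodemask_t. */
typedef struct { unsigned long bits; } nodemask_t;

struct task {
	int pid;
#ifdef DEMO_CPUSETS
	nodemask_t mems_allowed;	/* exists only when the feature is built in */
#endif
};

static struct task current_task = {
	.pid = 1,
#ifdef DEMO_CPUSETS
	.mems_allowed = { 0x3UL },
#endif
};

#ifdef DEMO_CPUSETS
/* Real helper: restrict the requested nodes to the allowed ones. */
static inline void demo_nodes_allowed(nodemask_t *nodes)
{
	assert(nodes);
	nodes->bits &= current_task.mems_allowed.bits;
}
#else
/* Stub: compiles to nothing and never names the missing field. */
static inline void demo_nodes_allowed(nodemask_t *nodes) { (void)nodes; }
#endif

/* The caller is identical in both configurations. */
unsigned long demo_bind(unsigned long requested)
{
	nodemask_t nodes = { requested };

	demo_nodes_allowed(&nodes);
	return nodes.bits;
}
```

Built without -DDEMO_CPUSETS, demo_bind() returns its argument untouched;
built with it, the result is masked by the allowed set. In neither case does
the caller need an #ifdef of its own.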
 

Re: [PATCH][2.6.20-rc1-mm1] sparsemem vmem_map optimzed pfn_valid() [0/2]

2006-12-20 Thread Bob Picco

Hiroyuki KAMEZAWA wrote:[Sat Dec 16 2006, 03:31:36AM EST]
> This patch implements pfn_valid() micro optimization.
> 
> This uses ia64_pfn_valid() idea to check mem_map is valid or not instead of
> sparsemem's logic.
> 
> By this, we'll not access mem_section[] in usual ops.
> 
> I attaches my easy test result with *micro* benchmark on SMP system.
> I'm glad if you give me an advice about testing.
Sorry, I was looking for AIM VII and/or reaim, which are multiuser loads.
The results (2.6.20-rc1-mm1) for EXTREME, SPARSEMEM+VMEMMAP and
SPARSEMEM+VMEMMAP+your+patch are below. Note that SPARSEMEM+VMEMMAP AIM VII
wasn't benchmarked to the higher load limit because of my time constraints.
The runs should be repeated more times.

Any difference between the three configurations looks insignificant and
within benchmark noise.

After tomorrow I'm on vacation until Jan 2.

bob
> 
> -Kame
> ==
> AIM Independent Resource Benchmark - Suite IX "1.1"
> 
> test on 
> CPU: Itanium2(madison) 1.3GHz x2, SMP
> Memory: memory 8G
> 2.6.20-rc1-m1 / 
>   extreme means  SPARSEMEM_VMEMMAP=n
>   vmem_map means SPARSEMEM_VMEMMAP=y + optimze pfn_valid patch.
> ==
>                 extreme   vmem_map
> creat-clo       136322    136989    File Creations and Closes per second
> page_test       1042187   1076976   System Allocations & Pages per second
> brk_test        2678559   2727286   System Memory Allocations per second
> signal_test     309525    321052    Signal Traps per second
> exec_test       803       801       Program Loads per second
> fork_test       9354      9679      Task Creations per second
> disk_rr         103766    103970    Random Disk Reads (K) per second
> disk_rw         82978     80244     Random Disk Writes (K) per second
> disk_rd         802548    872983    Sequential Disk Reads (K) per second
> disk_wrt        130342    131408    Sequential Disk Writes (K) per second
> disk_cp         107498    107823    Disk Copies (K) per second
> sync_disk_rw    800       752       Sync Random Disk Writes (K) per second
> sync_disk_wrt   81        78        Sync Sequential Disk Writes (K) per second
> sync_disk_cp    84        78        Sync Disk Copies (K) per second
> disk_src        44417     44379     Directory Searches per second
> mem_rtns_1      3239352   3222140   Dynamic Memory Operations per second
> mem_rtns_2      1157321   1155260   Block Memory Operations per second
> misc_rtns_1     10799     10993     Auxiliary Loops per second
> dir_rtns_1      1276159   1373725   Directory Operations per second
> shell_rtns_1    175       176       Shell Scripts per second
> shell_rtns_2    174       175       Shell Scripts per second
> shell_rtns_3    175       175       Shell Scripts per second
> shared_memory   646725    628769    Shared Memory Operations per second
> tcp_test        93258     94928     TCP/IP Messages per second
> udp_test        177984    177276    UDP/IP DataGrams per second
> fifo_test       362774    385434    FIFO Messages per second
> stream_pipe     320825    325931    Stream Pipe Messages per second
> dgram_pipe      300789    303339    DataGram Pipe Messages per second
> pipe_cpy        410539    449521    Pipe Messages per second
> 

EXTREME

AIM Multiuser Benchmark - Suite VII Run Beginning

Tasks   jobs/min  jti  jobs/min/task    real     cpu
    1     111.22  100       111.2215   52.33    0.88   Tue Dec 19 13:43:42 2006
  101    6896.87   96        68.2858   85.23   42.02   Tue Dec 19 13:49:31 2006
  201    7997.07   94        39.7864  146.28   83.69   Tue Dec 19 13:59:30 2006
  301    8580.37   95        28.5062  204.17  125.72   Tue Dec 19 14:13:27 2006
  401    8800.62   94        21.9467  265.19  167.80   Tue Dec 19 14:31:33 2006
  501    9445.73   91        18.8537  308.69  210.16   Tue Dec 19 14:52:38 2006
  601    9446.80   93        15.7185  370.26  252.50   Tue Dec 19 15:17:55 2006
  701    9353.27   92        13.3427  436.19  295.04   Tue Dec 19 15:47:42 2006
  918    9543.22   91        10.3957  559.85  387.02   Tue Dec 19 16:25:55 2006
 1000    9571.14   93         9.5711  608.08  421.95   Tue Dec 19 17:07:26 2006

AIM Multiuser Benchmark - Suite VII Run Beginning

Tasks   jobs/min  jti  jobs/min/task    real     cpu
    1     111.43  100       111.4281   52.23    0.88   Wed Dec 20 07:16:00 2006
  101    6940.84   95        68.7212   84.69   42.08   Wed Dec 20 07:21:47 2006
  201    8206.67   94        40.8292  142.54   83.68   Wed Dec 20 07:31:31 2006
  301    8692.77   94        28.8796  201.53  125.65   Wed Dec 20 07:45:16 2006
  401    8910.40   93        22.2204  261.92  167.79   Wed Dec 20 08:03:09 2006
  500    9149.02   93        18.2980  318.07  209.55   Wed Dec 20 08:24:52 2006
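
To put "within benchmark noise" into numbers, the two AIM VII runs above can
be compared row by row at equal task counts. The helpers below are a small
sketch for doing that by hand; the function names are made up for this note
and are not part of AIM or reaim.

```c
/* Relative difference of b against a, in percent. */
double rel_diff_pct(double a, double b)
{
	return (b - a) / a * 100.0;
}

/* Per-task throughput, i.e. the jobs/min/task column. */
double jobs_per_task(double jobs_per_min, unsigned int tasks)
{
	return jobs_per_min / tasks;
}
```

For the 101-task rows, rel_diff_pct(6896.87, 6940.84) comes out at roughly
0.6 percent, and jobs_per_task(6896.87, 101) reproduces the 68.2858 listed
in the first run.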

REAIM Workload
Times are in seconds - Child times from tms.cstime and tms.cutime

Num     Parent   Child   Child  Jobs per   Jobs/min/  Std_dev  Std_dev  JTI
Forked  Time     SysTime UTime  Minute     Child      Time     Percent

Re: [PATCH 2.6.13] x86_64: Make trap_init() happen earlier - dropped

2005-09-09 Thread Bob Picco

Andi Kleen wrote:   [Fri Sep 09 2005, 10:17:40AM EDT]
> On Thursday 08 September 2005 18:37, Tom Rini wrote:
> > It can be handy in some situations to have run trap_init() sooner than the
> > generic code does.  In order to do this on x86_64 we need to add a custom
> > early_setup_per_cpu_areas() call as well.
> 
> The patch is totally broken and causes crash even under light load
> (just found it after a lengthy binary search) 
> 
> >
> > +void __init early_setup_per_cpu_areas(void)
> > +{
> > +   static char cpu0[PERCPU_ENOUGH_ROOM] __cacheline_aligned
> > +   __attribute__ ((aligned (SMP_CACHE_BYTES)));
> 
> The original code does
> 
> /* Copy section for each CPU (we discard the original) */
> size = ALIGN(__per_cpu_end - __per_cpu_start, SMP_CACHE_BYTES);
> #ifdef CONFIG_MODULES
> if (size < PERCPU_ENOUGH_ROOM)
> size = PERCPU_ENOUGH_ROOM;
> #endif
> 
> 
> perhaps end-start is larger than PERCPU_ENOUGH_ROOM ? (using defconfig) 
> 
> Dropped from my tree for now.
> 
> -Andi
Andi:

Sorry about that.  I originally intended this for KGDB only. The hardcoded
PERCPU_ENOUGH_ROOM value is dangerous and could be the issue.  Let me take
a look at this.

bob
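
Andi only suspects that the per-cpu section outgrew the static buffer, so
the following is an illustration of that suspected failure mode, not a
confirmed diagnosis. It mirrors the sizing logic quoted above in userspace.
DEMO_CACHE_BYTES and DEMO_ENOUGH_ROOM are hypothetical values standing in
for SMP_CACHE_BYTES and PERCPU_ENOUGH_ROOM.

```c
#include <stddef.h>

#define DEMO_CACHE_BYTES 128	/* assumed cache line size */
#define DEMO_ENOUGH_ROOM 32768	/* assumed fixed buffer size */

/* Round x up to a multiple of a (a must be a power of two). */
#define DEMO_ALIGN(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Size the generic code would copy for each CPU. */
size_t percpu_area_size(size_t section_bytes, int modules)
{
	size_t size = DEMO_ALIGN(section_bytes, DEMO_CACHE_BYTES);

	/* With modules, the area is padded up to at least ENOUGH_ROOM. */
	if (modules && size < DEMO_ENOUGH_ROOM)
		size = DEMO_ENOUGH_ROOM;
	return size;
}

/* Would a static char buf[DEMO_ENOUGH_ROOM] hold that area? */
int fixed_buffer_fits(size_t section_bytes, int modules)
{
	return percpu_area_size(section_bytes, modules) <= DEMO_ENOUGH_ROOM;
}
```

The computed size has a lower bound but no upper bound, so a buffer
hardcoded to the lower bound overflows as soon as the section is larger.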

Re: HPET drift question

2005-08-28 Thread Bob Picco

Pallipadi, Venkatesh wrote: [Fri Aug 26 2005, 08:53:35PM EDT]
> 
> Yes. Looks like "ti->drift = HPET_DRIFT;" is right here. However, I would
> like to double check this with Bob.
> 
> Thanks,
> Venki
> 
> >-Original Message-
> >From: Alex Williamson [mailto:[EMAIL PROTECTED] 
> >Sent: Thursday, August 25, 2005 8:17 AM
> >To: Pallipadi, Venkatesh
> >Cc: [EMAIL PROTECTED]; linux-kernel@vger.kernel.org
> >Subject: HPET drift question
> >
> >Hi Venki,
> >
> >   I'm confused by the calculation of the drift value in the hpet
> >driver.  The specs defines the recommended minimum hardware
> >implementation is a frequency drift of 0.05% or 500ppm.  However, the
> >drift passed in when registering with the time interpolator is:
> >
> >ti->drift = ti->frequency * HPET_DRIFT / 100;
> >
> >Isn't that absolute number of ticks per second drift?  The time
> >interpolator defines the drift in parts per million.  Shouldn't this
> >simply be:
> >
> >ti->drift = HPET_DRIFT;
> >
> >The current code seems to greatly penalize any hpet timer with greater
> >than a 1MHz frequency.  Thanks,
> >
> > Alex
> >
> >-- 
> >Alex Williamson HP Linux & Open Source Lab
> >
> >
Hi Venki:

Alex and I had an earlier IRC discussion where we agreed that HPET_DRIFT
should be the value.  We were just verifying with you.

thanks,

bob
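
The unit mix-up Alex describes can be seen with a little arithmetic. The
sketch below converts a drift given in parts per million into an absolute
tick count per second. The divisor of 1,000,000 is simply the definition of
ppm and is an assumption of this note, as are the names; it is not a quote
of the driver source.

```c
#include <stdint.h>

#define DEMO_HPET_DRIFT_PPM 500u	/* 0.05% expressed in ppm */

/* Absolute drift in ticks per second for a timer running at freq_hz. */
uint64_t drift_ticks_per_sec(uint64_t freq_hz, uint64_t drift_ppm)
{
	return freq_hz * drift_ppm / 1000000u;
}
```

At exactly 1 MHz the result equals the ppm figure (500). At 10 MHz it is
5000, ten times larger, although the timer is no less accurate in relative
terms. A consumer that expects ppm would therefore rate any timer above
1 MHz worse than it deserves, which is why passing HPET_DRIFT directly is
the consistent choice.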

[PATCH] sparsemem fix for sparse_index_init

2005-08-22 Thread Bob Picco

Andrew:

After reviewing recent SPARSEMEM+EXTREME changes for -mm, I spotted a memory
leak issue.  In sparse_index_init we must check whether the root index is
already allocated before allocating a new one. The current code allocates,
acquires the lock and only then checks whether the root is already
allocated, so the fresh allocation is leaked when it is. An alternative
would be to call free_bootmem_node in the error path, but that seems the
more expensive method at boot time.

thanks,

bob

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>


 mm/sparse.c |3 +++
 1 files changed, 3 insertions(+)

Index: linux-2.6.13-rc6-mm1/mm/sparse.c
===
--- linux-2.6.13-rc6-mm1.orig/mm/sparse.c   2005-08-19 12:47:53.0 -0400
+++ linux-2.6.13-rc6-mm1/mm/sparse.c    2005-08-21 13:36:57.0 -0400
@@ -45,6 +45,9 @@ static int sparse_index_init(unsigned lo
struct mem_section *section;
int ret = 0;
 
+   if (mem_section[root])
+   return -EEXIST;
+
section = sparse_index_alloc(nid);
/*
 * This lock keeps two different sections from
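
The fix is an instance of "check before you allocate". A minimal userspace
sketch of the idea follows, assuming a small fixed table; demo_roots,
demo_index_alloc and demo_index_init are hypothetical names, and calloc
stands in for the boot-time allocator.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_NR_ROOTS 4

static void *demo_roots[DEMO_NR_ROOTS];
static int demo_allocs;		/* counts how often we really allocated */

static void *demo_index_alloc(void)
{
	demo_allocs++;
	return calloc(1, 64);
}

int demo_index_init(unsigned long root)
{
	void *section;

	if (root >= DEMO_NR_ROOTS)
		return -EINVAL;

	/* Early exit: nothing is allocated, so nothing can leak. */
	if (demo_roots[root])
		return -EEXIST;

	section = demo_index_alloc();
	if (!section)
		return -ENOMEM;

	/* The kernel takes a lock around the slot update; omitted here. */
	demo_roots[root] = section;
	return 0;
}

int demo_alloc_count(void)
{
	return demo_allocs;
}
```

Calling demo_index_init() twice for the same root allocates only once. The
second call returns -EEXIST before touching the allocator.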

[PATCH] Early kmalloc/kfree

2005-07-08 Thread Bob Picco

We have a requirement on IA64 to run the ACPI interpreter in the setup_arch
function before paging_init examines the maximum DMA physical address which
is limited by the IOMMU.  One obstacle is the use of kmalloc/kfree by
ACPI.  Using the bootmem allocator is unacceptable because > 20Mb of memory
is wastefully allocated.  As an alternative, I investigated what
would be required to optionally make the slab allocator available early in
boot and work in an almost seamless way.

The patch below is a solution for early kmalloc/kfree.  An architecture which 
requires kmalloc/kfree use before kmem_cache_init has normally completed can 
perform the initialization as early as pfn_to_page is a valid operation.  Like 
the bootmem allocator this point in execution is well known.  An arch that
requires early kmalloc/kfree chooses the CONFIG_EARLY_KMALLOC option and
must call kmem_cache_init at the appropriate place in setup_arch.

The known deficiencies of this solution are similar to those of the bootmem
allocator. The placement of the call to kmem_cache_init requires knowledge of
arch dependent code and possibly manipulation of arch dependent code for
enablement. kmalloc/kfree can't be called between when mem_init calls bootmem
to free pages and the second call to kmem_cache_init made from start_kernel.
A NUMA deficiency, like that of the bootmem allocator, exists for CPU-only
nodes.  The NUMA node distance information isn't interrogated by the bootmem
allocator for memoryless nodes.

The slab API hasn't been modified.  All hot code paths are untouched by
this patch.  The patch has been tested on a 2 CPU SMP box and on a simulated
two-node NUMA machine, with and without memoryless nodes. All testing has
been done on ia64 but nothing prevents other architectures from using the
patch.

Manfred provided valuable early review feedback.

Alex is responsible for the early ACPI changes and helping me test the patch.

thanks,

bob

Signed-off-by: Bob Picco <[EMAIL PROTECTED]>

 Kconfig  |6 +++
 page_alloc.c |4 ++
 slab.c   |   99 ---
 3 files changed, 105 insertions(+), 4 deletions(-)

Index: linux-2.6.13-rc2-mm1/mm/Kconfig
===
--- linux-2.6.13-rc2-mm1.orig/mm/Kconfig    2005-07-08 10:25:35.0 -0400
+++ linux-2.6.13-rc2-mm1/mm/Kconfig 2005-07-08 10:42:07.0 -0400
@@ -98,3 +98,9 @@ config HAVE_MEMORY_PRESENT
 config ARCH_SPARSEMEM_EXTREME
def_bool n
depends on SPARSEMEM && 64BIT
+#
+# For early use of kmalloc and kfree.  This requires architecture dependent
+# code calling kmem_cache_init when pfn_to_page is safe to call. 
+#
+config EARLY_KMALLOC
+   def_bool n
Index: linux-2.6.13-rc2-mm1/mm/page_alloc.c
===
--- linux-2.6.13-rc2-mm1.orig/mm/page_alloc.c   2005-07-08 10:25:35.0 -0400
+++ linux-2.6.13-rc2-mm1/mm/page_alloc.c    2005-07-08 10:42:07.0 -0400
@@ -1712,6 +1712,10 @@ void __init memmap_init_zone(unsigned lo
continue;
page = pfn_to_page(pfn);
set_page_links(page, zone, nid, pfn);
+#ifdef CONFIG_EARLY_KMALLOC
+   if (PageSlab(page))
+   continue;
+#endif
set_page_count(page, 0);
reset_page_mapcount(page);
SetPageReserved(page);
Index: linux-2.6.13-rc2-mm1/mm/slab.c
===
--- linux-2.6.13-rc2-mm1.orig/mm/slab.c 2005-07-08 10:25:35.0 -0400
+++ linux-2.6.13-rc2-mm1/mm/slab.c  2005-07-08 10:42:07.0 -0400
@@ -103,6 +103,7 @@
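 #include   <linux/rcupdate.h>
 #include   <linux/string.h>
 #include   <linux/nodemask.h>
+#include   <linux/bootmem.h>
 
 #include   <asm/uaccess.h>
 #include   <asm/cacheflush.h>
@@ -130,6 +131,28 @@
 #define FORCED_DEBUG    0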
 #endif
 
+#ifdef CONFIG_EARLY_KMALLOC
+static int slab_init;
+static inline int early_kmalloc_init(void)
+{
+   return slab_init++;
+}
+
+static inline int in_early_kmalloc(void)
+{
+   return slab_init == 1;
+}
+#else
+static inline int early_kmalloc_init(void)
+{
+   return 0;
+}
+
+static inline int in_early_kmalloc(void)
+{
+   return 0;
+}
+#endif
 
 /* Shouldn't this be in a header file somewhere? */
 #define BYTES_PER_WORD  sizeof(void *)
@@ -997,6 +1020,9 @@ void __init kmem_cache_init(void)
struct cache_names *names;
int i;
 
+   if (early_kmalloc_init())
+   return;
+
for (i = 0; i < NUM_INIT_LISTS; i++) {
LIST3_INIT(&initkmem_list3[i]);
if (i < MAX_NUMNODES)
@@ -1177,6 +1203,35 @@ static int __init cpucache_init(void)
 
 __initcall(cpucache_init);
 
+#ifdef CONFIG_EARLY_KMALLOC
+static __init struct page *early_alloc_pages(unsigned int order, int nodeid)
+{
+   void *ptr;
+   struct page *page;
+   int i = 1 << order, j;
+   unsigned long si
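
The slab_init counter in the hunks above acts as a small state machine: the
first call to kmem_cache_init does the work, later calls return at once, and
the "early" window is the time during which the counter equals one. The
sketch below replays that logic in userspace, based only on the hunks shown
here. All demo_* names are hypothetical and the real setup work is reduced
to a counter.

```c
static int demo_slab_init;	/* 0: not started, 1: early phase, 2+: done */
static int demo_setup_runs;	/* how often the real setup executed */

/* Returns 0 on the first call only, like early_kmalloc_init(). */
static int demo_early_init(void)
{
	return demo_slab_init++;
}

/* True between the first and the second init call. */
int demo_in_early(void)
{
	return demo_slab_init == 1;
}

void demo_cache_init(void)
{
	if (demo_early_init())
		return;		/* second call: setup already happened */
	demo_setup_runs++;	/* stands in for the actual cache setup */
}

int demo_setup_count(void)
{
	return demo_setup_runs;
}
```

An architecture calls demo_cache_init() once from its early setup and the
generic boot code calls it again later. The post-increment keeps both call
sites identical while ensuring the setup body runs exactly once.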
