Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-09-02 Thread Nathan Fontenot
On 08/31/2010 04:57 PM, Anton Blanchard wrote:
 
 Hi Nathan,
 
 This set of patches de-couples the idea that there is a single
 directory in sysfs for each memory section.  The intent of the
 patches is to reduce the number of sysfs directories created to
 resolve a boot-time performance issue.  On very large systems
 boot times are getting very long (as seen on powerpc hardware)
 due to the enormous number of sysfs directories being created.
 On a system with 1 TB of memory we create ~63,000 directories.
 For even larger systems boot times are being measured in hours.

 This set of patches allows for each directory created in sysfs
 to cover more than one memory section.  The default behavior for
 sysfs directory creation is the same, in that each directory
 represents a single memory section.  A new file 'end_phys_index'
 in each directory contains the physical_id of the last memory
 section covered by the directory so that users can easily
 determine the memory section range of a directory.
 
 I tested this on a POWER7 with 2TB memory and the boot time improved from
 greater than 6 hours (I gave up), to under 5 minutes. Nice!

Thanks for testing this out.  I was able to test this on a 1 TB system
and saw memory sysfs creation times go from 10 minutes to a few seconds.
It's good to see the difference for a 2 TB system.

-Nathan

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-31 Thread Dave Hansen
On Mon, 2010-08-16 at 09:34 -0500, Nathan Fontenot wrote:
  It's not an unresolvable issue, as this is a must-fix problem.  But you
  should tell us what your proposal is to prevent breakage of existing
  installations.  A Kconfig option would be good, but a boot-time kernel
  command line option which selects the new format would be much better.
 
 This shouldn't break existing installations, unless an architecture chooses
 to do so.  With my patch only the powerpc/pseries arch is updated such that
 what is seen in userspace is different. 

Even if an arch defines the override for the sysfs dir size, I still
don't think this breaks anything (it shouldn't).  We move _all_ of the
directories over, all at once, to a single, uniform size.  The only
apparent change to a user moving kernels would be a larger
block_size_bytes (which is certainly not changing the ABI) and a new
sysfs file for the end of the section.  The new sysfs file is
_completely_ redundant at this point.

The architecture is only supposed to bump up the directory size when it
*KNOWS* that all operations will be done at the larger section size,
such as if the specific hardware has physical DIMMs which are much
larger than SECTION_SIZE.

Let's say we have a system with 20MB of memory, SECTION_SIZE of 1MB and
a sysfs dir size of 4MB.  

Before the patch, we have 20 directories: one for each section.  After
this patch, we have 5 directories.  
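
As a rough illustration of the arithmetic above, here is a minimal Python
sketch.  The helper name group_sections and its parameters are hypothetical
and do not come from the patch set; the sketch only models how consecutive
sections would be grouped under one sysfs directory.

```python
MB = 1024 * 1024


def group_sections(total_bytes, section_bytes, dir_bytes):
    """Return a (first_section, last_section) pair for each sysfs directory.

    Hypothetical helper, not part of the patch set.  It assumes the
    directory size is a whole multiple of the section size.
    """
    if dir_bytes % section_bytes != 0:
        raise ValueError("directory size must be a multiple of section size")
    sections_per_dir = dir_bytes // section_bytes
    total_sections = total_bytes // section_bytes
    ranges = []
    for first in range(0, total_sections, sections_per_dir):
        # The last directory may cover fewer sections than the others.
        last = min(first + sections_per_dir, total_sections) - 1
        ranges.append((first, last))
    return ranges


# 20MB of memory, 1MB sections: one directory per section vs. 4MB directories.
before = group_sections(20 * MB, 1 * MB, 1 * MB)
after = group_sections(20 * MB, 1 * MB, 4 * MB)
print(len(before), len(after))
print(after)
```

With the numbers from the example, the first call yields 20 single-section
ranges and the second yields 5 ranges of 4 sections each.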

The next step, I think, and one that we _will_ probably need eventually,
is to take the 5 sysfs dirs in the above case:

0-3, 4-7, 8-11, 12-15, 16-19

and turn that into a single one:

0-19

*That* will require changing the ABI, but we could certainly have some
bloated and slow, but backward-compatible mode.  

-- Dave



Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-31 Thread Anton Blanchard

Hi Nathan,

 This set of patches de-couples the idea that there is a single
 directory in sysfs for each memory section.  The intent of the
 patches is to reduce the number of sysfs directories created to
 resolve a boot-time performance issue.  On very large systems
 boot times are getting very long (as seen on powerpc hardware)
 due to the enormous number of sysfs directories being created.
 On a system with 1 TB of memory we create ~63,000 directories.
 For even larger systems boot times are being measured in hours.
 
 This set of patches allows for each directory created in sysfs
 to cover more than one memory section.  The default behavior for
 sysfs directory creation is the same, in that each directory
 represents a single memory section.  A new file 'end_phys_index'
 in each directory contains the physical_id of the last memory
 section covered by the directory so that users can easily
 determine the memory section range of a directory.

I tested this on a POWER7 with 2TB memory and the boot time improved from
greater than 6 hours (I gave up), to under 5 minutes. Nice!

Tested-by: Anton Blanchard an...@samba.org

Anton


Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-16 Thread Nathan Fontenot
On 08/12/2010 02:08 PM, Andrew Morton wrote:
 On Mon, 09 Aug 2010 12:53:00 -0500
 Nathan Fontenot nf...@austin.ibm.com wrote:
 
 This set of patches de-couples the idea that there is a single
 directory in sysfs for each memory section.  The intent of the
 patches is to reduce the number of sysfs directories created to
 resolve a boot-time performance issue.  On very large systems
 boot times are getting very long (as seen on powerpc hardware)
 due to the enormous number of sysfs directories being created.
 On a system with 1 TB of memory we create ~63,000 directories.
 For even larger systems boot times are being measured in hours.
 
 And those hours are mainly due to this problem, I assume.

Yes, those hours are spent creating the sysfs directories for each
of the memory sections.

 
 This set of patches allows for each directory created in sysfs
 to cover more than one memory section.  The default behavior for
 sysfs directory creation is the same, in that each directory
 represents a single memory section.  A new file 'end_phys_index'
 in each directory contains the physical_id of the last memory
 section covered by the directory so that users can easily
 determine the memory section range of a directory.
 
 What you're proposing appears to be a non-back-compatible
 userspace-visible change.  This is a big issue!
 
 It's not an unresolvable issue, as this is a must-fix problem.  But you
 should tell us what your proposal is to prevent breakage of existing
 installations.  A Kconfig option would be good, but a boot-time kernel
 command line option which selects the new format would be much better.

This shouldn't break existing installations, unless an architecture chooses
to do so.  With my patch only the powerpc/pseries arch is updated such that
what is seen in userspace is different.

The default behavior is maintained for all architectures unless they define
their own version of memory_block_size_bytes().  The default definition of
this routine (defined as __weak in Patch 5/8) sets the memory block size
to the same size it currently is, thus preserving the existing one sysfs
directory per memory section.  The only change that will be seen is a new
property for each memory section, end_phys_index, which will have the same
value as the existing 'phys_index' property.
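
The default-versus-override behavior described above can be modeled with a
short Python sketch.  This is only an analogy: the real mechanism is a C
function declared __weak, and here a dictionary lookup stands in for the
linker picking an architecture's definition.  The section size, the 256 MB
pseries figure, and every name other than memory_block_size_bytes are
assumptions made up for illustration.

```python
SECTION_SIZE_BYTES = 16 * 1024 * 1024  # assumed section size, illustration only


def default_memory_block_size_bytes():
    """Models the generic default: one memory block equals one section."""
    return SECTION_SIZE_BYTES


# Hypothetical per-architecture overrides.  The 256 MB value is invented
# for this sketch and is not taken from the patch set.
ARCH_OVERRIDES = {
    "pseries": lambda: 256 * 1024 * 1024,
}


def memory_block_size_bytes(arch):
    """Use the architecture's override if it has one, else the default."""
    return ARCH_OVERRIDES.get(arch, default_memory_block_size_bytes)()


def sections_per_block(arch):
    """Number of memory sections covered by one sysfs directory."""
    return memory_block_size_bytes(arch) // SECTION_SIZE_BYTES


for arch in ("x86", "pseries"):
    print(arch, sections_per_block(arch))
```

An architecture with no override keeps one section per directory, so its
userspace view is unchanged; only an architecture that supplies its own
definition sees fewer, larger directories.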

 
 However you didn't mention this issue at all, and it's the most
 important one.
 
 
 Updates for version 5 of the patchset include the following:

 Patch 4/8 Add mutex for add/remove of memory blocks
 - Define the mutex using DEFINE_MUTEX macro.

 Patch 8/8 Update memory-hotplug documentation
 - Add information concerning memory holes in phys_index..end_phys_index.
 
 And you forgot to tell us how long those machines take to boot with the
 patchset applied, which is the entire point of the patchset!

Yes,  I am working on getting more time on our large systems to get
performance numbers with this patch.  I'll post them when I get them.

-Nathan


Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-12 Thread Andrew Morton
On Mon, 09 Aug 2010 12:53:00 -0500
Nathan Fontenot nf...@austin.ibm.com wrote:

 This set of patches de-couples the idea that there is a single
 directory in sysfs for each memory section.  The intent of the
 patches is to reduce the number of sysfs directories created to
 resolve a boot-time performance issue.  On very large systems
 boot times are getting very long (as seen on powerpc hardware)
 due to the enormous number of sysfs directories being created.
 On a system with 1 TB of memory we create ~63,000 directories.
 For even larger systems boot times are being measured in hours.

And those hours are mainly due to this problem, I assume.

 This set of patches allows for each directory created in sysfs
 to cover more than one memory section.  The default behavior for
 sysfs directory creation is the same, in that each directory
 represents a single memory section.  A new file 'end_phys_index'
 in each directory contains the physical_id of the last memory
 section covered by the directory so that users can easily
 determine the memory section range of a directory.

What you're proposing appears to be a non-back-compatible
userspace-visible change.  This is a big issue!

It's not an unresolvable issue, as this is a must-fix problem.  But you
should tell us what your proposal is to prevent breakage of existing
installations.  A Kconfig option would be good, but a boot-time kernel
command line option which selects the new format would be much better.

However you didn't mention this issue at all, and it's the most
important one.


 Updates for version 5 of the patchset include the following:
 
 Patch 4/8 Add mutex for add/remove of memory blocks
 - Define the mutex using DEFINE_MUTEX macro.
 
 Patch 8/8 Update memory-hotplug documentation
 - Add information concerning memory holes in phys_index..end_phys_index.

And you forgot to tell us how long those machines take to boot with the
patchset applied, which is the entire point of the patchset!



Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-12 Thread Dave Hansen
On Thu, 2010-08-12 at 12:08 -0700, Andrew Morton wrote:
  This set of patches allows for each directory created in sysfs
  to cover more than one memory section.  The default behavior for
  sysfs directory creation is the same, in that each directory
  represents a single memory section.  A new file 'end_phys_index'
  in each directory contains the physical_id of the last memory
  section covered by the directory so that users can easily
  determine the memory section range of a directory.
 
 What you're proposing appears to be a non-back-compatible
 userspace-visible change.  This is a big issue! 

Nathan, one thought to get around this at the moment would be to bump up
the size that we export in /sys/devices/system/memory/block_size_bytes.
I think you have already done most of the hard work to accomplish
this.  

You can still add the end_phys_index stuff.  But, for now, it would
always be equal to phys_index.

-- Dave



Re: [PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-11 Thread Dave Hansen
On Mon, 2010-08-09 at 12:53 -0500, Nathan Fontenot wrote:
 This set of patches de-couples the idea that there is a single
 directory in sysfs for each memory section.  The intent of the
 patches is to reduce the number of sysfs directories created to
 resolve a boot-time performance issue.  On very large systems
 boot times are getting very long (as seen on powerpc hardware)
 due to the enormous number of sysfs directories being created.
 On a system with 1 TB of memory we create ~63,000 directories.
 For even larger systems boot times are being measured in hours. 

Hi Nathan,

The set is looking pretty good to me.  We _might_ want to up the ante in
the future and allow it to be even more dynamic than this, but this
looks like a good start to me.

BTW, have you taken a look at what the hotplug events look like if only
a single section (not filling up a whole block) is added?  

Feel free to add my:

Acked-by: Dave Hansen d...@linux.vnet.ibm.com

-- Dave



[PATCH 0/8] v5 De-couple sysfs memory directories from memory sections

2010-08-09 Thread Nathan Fontenot
This set of patches de-couples the idea that there is a single
directory in sysfs for each memory section.  The intent of the
patches is to reduce the number of sysfs directories created to
resolve a boot-time performance issue.  On very large systems
boot times are getting very long (as seen on powerpc hardware)
due to the enormous number of sysfs directories being created.
On a system with 1 TB of memory we create ~63,000 directories.
For even larger systems boot times are being measured in hours.

This set of patches allows for each directory created in sysfs
to cover more than one memory section.  The default behavior for
sysfs directory creation is the same, in that each directory
represents a single memory section.  A new file 'end_phys_index'
in each directory contains the physical_id of the last memory
section covered by the directory so that users can easily
determine the memory section range of a directory.
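
To show how a user might read that range, here is a minimal Python sketch.
The helper name read_section_range is hypothetical, and the sketch assumes
that both phys_index and end_phys_index hold a single hexadecimal number;
adjust the base if a given kernel formats the files differently.

```python
import os


def read_section_range(block_dir):
    """Return (first, last) section index covered by one memory directory.

    Hypothetical helper.  Assumes 'phys_index' and 'end_phys_index' each
    contain one hexadecimal value, which is an assumption of this sketch.
    """
    def read_hex(name):
        with open(os.path.join(block_dir, name)) as f:
            return int(f.read().strip(), 16)

    return read_hex("phys_index"), read_hex("end_phys_index")
```

On a system with the patches applied, a call such as
read_section_range("/sys/devices/system/memory/memory0") would return the
first and last section covered by that directory; the memory0 directory
name is an assumption here.  With the default behavior the two values are
equal.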

Updates for version 5 of the patchset include the following:

Patch 4/8 Add mutex for add/remove of memory blocks
- Define the mutex using DEFINE_MUTEX macro.

Patch 8/8 Update memory-hotplug documentation
- Add information concerning memory holes in phys_index..end_phys_index.
 
Thanks,

Nathan Fontenot