Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-11-01 Thread Mel Gorman
On Sun, Oct 23, 2011 at 09:05:05PM -0700, Michal Nazarewicz wrote:
  On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
  This commit introduces alloc_contig_freed_pages() function
  which allocates (ie. removes from buddy system) free pages
  in range. Caller has to guarantee that all pages in range
  are in buddy system.
 
 On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman m...@csn.ul.ie wrote:
  Straight away, I'm wondering why you didn't use
  mm/compaction.c#isolate_freepages()
 
 Does the below look like a step in the right direction?
 
 It basically moves isolate_freepages_block() to page_alloc.c (changing

For the purposes of review, have a separate patch for moving
isolate_freepages_block to another file that does not alter the
function in any way. When the function is updated in a follow-on patch,
it'll be far easier to see what has changed.

page_isolation.c may also be a better fit than page_alloc.c

As it is, the patch for isolate_freepages_block is almost impossible
to read because it's getting munged with existing code that is already
in page_alloc.c. About all I caught from it is that scannedp does
not have a type. It defaults to unsigned int but it's unnecessarily
obscure.

 its name to isolate_freepages_range()) and changes it so that, depending
 on its arguments, it treats holes (either an invalid PFN or a non-free page)
 as errors, so that CMA can use it.
 

I haven't actually read the function because it's too badly mixed with
page_alloc.c code but this description fits what I'm looking for.

 It also accepts a range rather than just assuming a single pageblock,
 thus the change moves the range calculation in compaction.c from
 isolate_freepages_block() up to isolate_freepages().
 
 The change also modifies split_free_page() so that it does not try to
 change a pageblock's migrate type if the current migrate type is ISOLATE
 or CMA.
 

This is fine. Later, the flags that determine what happens to pageblocks
may be placed in compact_control.

 ---
  include/linux/mm.h |1 -
  include/linux/page-isolation.h |4 +-
  mm/compaction.c|   73 +++
  mm/internal.h  |5 ++
 mm/page_alloc.c|  128 +---
  5 files changed, 95 insertions(+), 116 deletions(-)
 

I confess I didn't read closely because of the mess in page_alloc.c but
the intent seems fine. Hopefully there will be a new version of CMA
posted that will be easier to review.

Thanks

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-11-01 Thread Michal Nazarewicz

On Tue, 01 Nov 2011 16:04:48 +0100, Mel Gorman m...@csn.ul.ie wrote:

For the purposes of review, have a separate patch for moving
isolate_freepages_block to another file that does not alter the
function in any way. When the function is updated in a follow-on patch,
it'll be far easier to see what has changed.


Will do.


page_isolation.c may also be a better fit than page_alloc.c


Since isolate_freepages_block() is the only user of split_free_page(),
would it make sense to move split_free_page() to page_isolation.c as
well?  I sort of like the idea of making it static and removing it from
the header file.


I confess I didn't read closely because of the mess in page_alloc.c but
the intent seems fine.


No worries.  I just needed a quick comment on whether I'm headed in the
right direction. :)


Hopefully there will be a new version of CMA posted that will be easier
to review.


I'll try to write the code no later than this weekend, so hopefully
the new version will be sent next week.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-11-01 Thread Mel Gorman
On Tue, Nov 01, 2011 at 07:06:56PM +0100, Michal Nazarewicz wrote:
 page_isolation.c may also be a better fit than page_alloc.c
 
 Since isolate_freepages_block() is the only user of split_free_page(),
 would it make sense to move split_free_page() to page_isolation.c as
 well?  I sort of like the idea of making it static and removing it from
 the header file.
 

I see no problem with that. It'll be separate from split_page() but that
is not earth shattering.

Thanks.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-23 Thread Michal Nazarewicz

On Fri, 21 Oct 2011 03:06:24 -0700, Mel Gorman m...@csn.ul.ie wrote:


On Tue, Oct 18, 2011 at 10:26:37AM -0700, Michal Nazarewicz wrote:

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman m...@csn.ul.ie wrote:

At this point, I'm going to apologise for not reviewing this a long long
time ago.

On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com

This commit introduces alloc_contig_freed_pages() function
which allocates (ie. removes from buddy system) free pages
in range. Caller has to guarantee that all pages in range
are in buddy system.


Straight away, I'm wondering why you didn't use

mm/compaction.c#isolate_freepages()

It knows how to isolate pages within ranges. All its control information
is passed via struct compact_control() which I recognise may be awkward
for CMA but compaction.c knows how to manage all the isolated pages and
pass them to migrate.c appropriately.

It is something to consider.  At first glance, I see that isolate_freepages
seems to operate on pageblocks, which is not desired for CMA.



isolate_freepages_block operates on a range of pages that happens to be
hard-coded to a pageblock because that was the requirement. It calculates
end_pfn and it is possible to make that a function parameter.


Yes, this seems doable.  I'll try and rewrite the patches to use it.

The biggest difference is in how CMA and compaction treat pages which are not
free.  CMA treats them as an error while compaction just skips them.  This is
solvable with an argument though, along the lines of the sketch below.
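
A minimal sketch of what that could look like (the name and the "strict"
argument are hypothetical, not from the posted patch; the
split_free_page()/freelist handling mirrors what isolate_freepages_block()
already does, and the caller is assumed to hold zone->lock):

static unsigned long isolate_freepages_range(unsigned long start_pfn,
		unsigned long end_pfn, struct list_head *freelist,
		bool strict)
{
	unsigned long pfn = start_pfn, total_isolated = 0;

	while (pfn < end_pfn) {
		struct page *page;
		unsigned long isolated, i;

		if (!pfn_valid(pfn) || !PageBuddy(pfn_to_page(pfn))) {
			if (strict)	/* CMA: a hole is an error */
				return 0;
			pfn++;		/* compaction: just skip it */
			continue;
		}

		page = pfn_to_page(pfn);
		/* Break the buddy page into order-0 pages; 0 on failure */
		isolated = split_free_page(page);
		if (!isolated)
			return strict ? 0 : total_isolated;

		for (i = 0; i < isolated; i++)
			list_add(&page[i].lru, freelist);

		total_isolated += isolated;
		pfn += isolated;
	}

	return total_isolated;
}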

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-21 Thread Mel Gorman
On Tue, Oct 18, 2011 at 10:26:37AM -0700, Michal Nazarewicz wrote:
 On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman m...@csn.ul.ie wrote:
 
 At this point, I'm going to apologise for not reviewing this a long long
 time ago.
 
 On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
 From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
 
 This commit introduces alloc_contig_freed_pages() function
 which allocates (ie. removes from buddy system) free pages
 in range. Caller has to guarantee that all pages in range
 are in buddy system.
 
 
 Straight away, I'm wondering why you didn't use
 
 mm/compaction.c#isolate_freepages()
 
 It knows how to isolate pages within ranges. All its control information
 is passed via struct compact_control() which I recognise may be awkward
 for CMA but compaction.c knows how to manage all the isolated pages and
 pass them to migrate.c appropriately.
 
 It is something to consider.  At first glance, I see that isolate_freepages
 seems to operate on pageblocks, which is not desired for CMA.
 

isolate_freepages_block operates on a range of pages that happens to be
hard-coded to a pageblock because that was the requirement. It calculates
end_pfn and it is possible to make that a function parameter.

 I haven't read all the patches yet but isolate_freepages() does break
 everything up into order-0 pages. This may not be to your liking but it
 would not be possible to change.
 
 Splitting everything into order-0 pages is desired behaviour.
 

Great.

 Along with this function, a free_contig_pages() function is
 provided which frees all (or a subset of) pages allocated
 with alloc_contig_freed_pages().
 
 mm/compaction.c#release_freepages()
 
 It sort of does the same thing but release_freepages() assumes that pages
 that are being freed are non-contiguous and that they need to be on the lru
 list.  With free_contig_pages(), we can assume all pages are contiguous.
 

Ok, I jumped the gun here. release_freepages() may not be a perfect fit.
release_freepages() is also used when finishing compaction, whereas it is
a real free function that is required here.

 You can do this in a more general fashion by checking the
 zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
 That will not be SPARSEMEM specific.
 
 I've tried doing stuff that way but it ended up with much more code.
 
 Dave suggested the above function to check if pointer arithmetic is valid.
 
 Please see also https://lkml.org/lkml/2011/9/21/220.
 

Ok, I'm still not fully convinced but I confess I'm not thinking about this
particular function too deeply because I am expecting the problem would
go away if compaction and CMA shared common code for freeing contiguous
regions via page migration.

  <SNIP>
 +   if (zone_pfn_same_memmap(pfn - count, pfn))
 +   page += count;
 +   else
 +   page = pfn_to_page(pfn);
 +   }
 +
 +   spin_unlock_irq(&zone->lock);
 +
 +   /* After this, pages in the range can be freed one by one */
 +   count = pfn - start;
 +   pfn = start;
 +   for (page = pfn_to_page(pfn); count; --count) {
 +   prep_new_page(page, 0, flag);
 +   ++pfn;
 +   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
 +   ++page;
 +   else
 +   page = pfn_to_page(pfn);
 +   }
 +
 
 Here it looks like you have implemented something like split_free_page().
 
 split_free_page() takes a single page, removes it from the buddy system,
 and finally splits it.

I'm referring to just this chunk.

split_free_page takes a page, checks the watermarks and performs similar
operations to prep_new_page(). There should be no need to introduce a
new similar function. split_free_page() does affect the pageblock
migratetype and that is undesirable but that part could be taken out and
moved to compaction.c if necessary.
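
For reference, split_free_page() at the time looked roughly like the
following (an abridged sketch from memory, not verbatim); the
set_pageblock_migratetype() call at the end is the side effect being
discussed:

int split_free_page(struct page *page)
{
	unsigned int order = page_order(page);
	struct zone *zone = page_zone(page);

	/* Obey watermarks as if the page was being allocated */
	if (!zone_watermark_ok(zone, 0,
			       low_wmark_pages(zone) + (1 << order), 0, 0))
		return 0;

	/* Remove the page from the buddy free lists */
	list_del(&page->lru);
	zone->free_area[order].nr_free--;
	rmv_page_order(page);
	__mod_zone_page_state(zone, NR_FREE_PAGES, -(1UL << order));

	/* Split into individual order-0 pages */
	set_page_refcounted(page);
	split_page(page, order);

	/* The pageblock migratetype side effect mentioned above */
	if (order >= pageblock_order - 1) {
		struct page *endpage = page + (1 << order) - 1;
		for (; page < endpage; page += pageblock_nr_pages)
			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
	}

	return 1 << order;
}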

On the watermarks thing, CMA does not pay much attention to them. I have
a strong feeling that it is easy to deadlock a system by using CMA while
under memory pressure. Compaction had the same problem early in its
development FWIW. This is partially why I'd prefer to see CMA and
compaction sharing as much code as possible because compaction gets
continual testing.

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Mel Gorman
At this point, I'm going to apologise for not reviewing this a long long
time ago.

On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
 From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
 
 This commit introduces alloc_contig_freed_pages() function
 which allocates (ie. removes from buddy system) free pages
 in range. Caller has to guarantee that all pages in range
 are in buddy system.
 

Straight away, I'm wondering why you didn't use

mm/compaction.c#isolate_freepages()

It knows how to isolate pages within ranges. All its control information
is passed via struct compact_control() which I recognise may be awkward
for CMA but compaction.c knows how to manage all the isolated pages and
pass them to migrate.c appropriately.

I haven't read all the patches yet but isolate_freepages() does break
everything up into order-0 pages. This may not be to your liking but it
would not be possible to change.

 Along with this function, a free_contig_pages() function is
 provided which frees all (or a subset of) pages allocated
 with alloc_contig_freed_pages().
 

mm/compaction.c#release_freepages()

 Michal Nazarewicz has modified the function to make it easier
 to allocate pages that are not MAX_ORDER_NR_PAGES-aligned by making it
 return the pfn of one-past-the-last allocated page.
 
 Signed-off-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
 Signed-off-by: Michal Nazarewicz m.nazarew...@samsung.com
 [m.nazarewicz: added checks if all allocated pages comes from the
 same memory zone]
 Signed-off-by: Michal Nazarewicz min...@mina86.com
 Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
 [m.szyprowski: fixed wrong condition in VM_BUG_ON assert]
 Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
 Acked-by: Arnd Bergmann a...@arndb.de
 ---
  include/linux/mmzone.h |   16 +
  include/linux/page-isolation.h |5 +++
 mm/page_alloc.c|   67 
  3 files changed, 88 insertions(+), 0 deletions(-)
 
 diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
 index a2760bb..862a834 100644
 --- a/include/linux/mmzone.h
 +++ b/include/linux/mmzone.h
 @@ -1168,6 +1168,22 @@ static inline int memmap_valid_within(unsigned long pfn,
  }
  #endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
  
 +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 +/*
 + * Both PFNs must be from the same zone!  If this function returns
 + * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
 + */
 +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
 +{
 + return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
 +}
 +

Why do you care what section the page is in? The zone is important all
right, but not the section. Also, offhand I'm unsure if being in the
same section guarantees the same zone. sections are ordinarily fully
populated (except on ARM but hey) but I can't remember anything
enforcing that zones be section-aligned.

Later I think I see that the intention was to reduce the use of
pfn_to_page(). You can do this in a more general fashion by checking the
zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
That will not be SPARSEMEM specific.
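
A hypothetical sketch of that suggestion (variable names and the
placeholder work are illustrative only; it assumes [start, end) has
already been clamped to a single zone):

	unsigned long pfn;
	struct page *page = NULL;

	for (pfn = start; pfn < end; pfn++, page++) {
		/* Re-resolve on the first pass and at MAX_ORDER boundaries */
		if (!page || !(pfn & (MAX_ORDER_NR_PAGES - 1)))
			page = pfn_to_page(pfn);
		/* ... operate on page ... */
	}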

 +#else
 +
 +#define zone_pfn_same_memmap(pfn1, pfn2) (true)
 +
 +#endif
 +
  #endif /* !__GENERATING_BOUNDS.H */
  #endif /* !__ASSEMBLY__ */
  #endif /* _LINUX_MMZONE_H */
 diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
 index 58cdbac..b9fc428 100644
 --- a/include/linux/page-isolation.h
 +++ b/include/linux/page-isolation.h
 @@ -33,6 +33,11 @@ test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
  extern int set_migratetype_isolate(struct page *page);
  extern void unset_migratetype_isolate(struct page *page);
  
 +/* The below functions must be run on a range from a single zone. */
 +extern unsigned long alloc_contig_freed_pages(unsigned long start,
 +   unsigned long end, gfp_t flag);
 +extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 +
  /*
   * For migration.
   */
 diff --git a/mm/page_alloc.c b/mm/page_alloc.c
 index bf4399a..fbfb920 100644
 --- a/mm/page_alloc.c
 +++ b/mm/page_alloc.c
 @@ -5706,6 +5706,73 @@ out:
  spin_unlock_irqrestore(&zone->lock, flags);
  }
  
 +unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
 +gfp_t flag)
 +{
 + unsigned long pfn = start, count;
 + struct page *page;
 + struct zone *zone;
 + int order;
 +
 + VM_BUG_ON(!pfn_valid(start));

VM_BUG_ON seems very harsh here. WARN_ON_ONCE and returning 0 to the
caller seems reasonable.
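
Something along these lines, say (a sketch of the suggested alternative,
not code from the patch):

	if (WARN_ON_ONCE(!pfn_valid(start)))
		return 0;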

 + page = pfn_to_page(start);
 + zone = page_zone(page);
 +
 + spin_lock_irq(&zone->lock);
 +
 + for (;;) {
 + VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
 +   page_zone(page) != zone);
 +

Here you will VM_BUG_ON with the zone lock held, leading to the system
halting very shortly.

Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Michal Nazarewicz

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman m...@csn.ul.ie wrote:


At this point, I'm going to apologise for not reviewing this a long long
time ago.

On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:

From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com

This commit introduces alloc_contig_freed_pages() function
which allocates (ie. removes from buddy system) free pages
in range. Caller has to guarantee that all pages in range
are in buddy system.



Straight away, I'm wondering why you didn't use

mm/compaction.c#isolate_freepages()

It knows how to isolate pages within ranges. All its control information
is passed via struct compact_control() which I recognise may be awkward
for CMA but compaction.c knows how to manage all the isolated pages and
pass them to migrate.c appropriately.


It is something to consider.  At first glance, I see that isolate_freepages
seems to operate on pageblocks, which is not desired for CMA.


I haven't read all the patches yet but isolate_freepages() does break
everything up into order-0 pages. This may not be to your liking but it
would not be possible to change.


Splitting everything into order-0 pages is desired behaviour.


Along with this function, a free_contig_pages() function is
provided which frees all (or a subset of) pages allocated
with alloc_contig_freed_pages().



mm/compaction.c#release_freepages()


It sort of does the same thing but release_freepages() assumes that pages
that are being freed are non-contiguous and that they need to be on the lru
list.  With free_contig_pages(), we can assume all pages are contiguous.


+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+/*
+ * Both PFNs must be from the same zone!  If this function returns
+ * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
+ */
+static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
+{
+   return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
+}
+


Why do you care what section the page is in? The zone is important all
right, but not the section. Also, offhand I'm unsure if being in the
same section guarantees the same zone. sections are ordinarily fully
populated (except on ARM but hey) but I can't remember anything
enforcing that zones be section-aligned.

Later I think I see that the intention was to reduce the use of
pfn_to_page().


That is correct.


You can do this in a more general fashion by checking the
zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
That will not be SPARSEMEM specific.


I've tried doing stuff that way but it ended up with much more code.

Dave suggested the above function to check if pointer arithmetic is valid.

Please see also https://lkml.org/lkml/2011/9/21/220.




+#else
+
+#define zone_pfn_same_memmap(pfn1, pfn2) (true)
+
+#endif
+
 #endif /* !__GENERATING_BOUNDS.H */
 #endif /* !__ASSEMBLY__ */
 #endif /* _LINUX_MMZONE_H */




@@ -5706,6 +5706,73 @@ out:
spin_unlock_irqrestore(&zone->lock, flags);
 }

+unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
+  gfp_t flag)
+{
+   unsigned long pfn = start, count;
+   struct page *page;
+   struct zone *zone;
+   int order;
+
+   VM_BUG_ON(!pfn_valid(start));


VM_BUG_ON seems very harsh here. WARN_ON_ONCE and returning 0 to the
caller seems reasonable.


+   page = pfn_to_page(start);
+   zone = page_zone(page);
+
+   spin_lock_irq(&zone->lock);
+
+   for (;;) {
+   VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
+ page_zone(page) != zone);
+


Here you will VM_BUG_ON with the zone lock held, leading to the system
halting very shortly.


+   list_del(&page->lru);
+   order = page_order(page);
+   count = 1UL << order;
+   zone->free_area[order].nr_free--;
+   rmv_page_order(page);
+   __mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+


The callers need to check in advance if watermarks are sufficient for
this. In compaction, it happens in compaction_suitable() because it only
needed to be checked once. Your requirements might be different.
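
A hypothetical pre-check modeled on that idea (the exact mark and return
value are illustrative only):

	/* Refuse if pulling nr_pages out would dip below the low watermark */
	if (!zone_watermark_ok(zone, 0,
			       low_wmark_pages(zone) + nr_pages, 0, 0))
		return 0;	/* let the caller back off */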


+   pfn += count;
+   if (pfn >= end)
+   break;
+   VM_BUG_ON(!pfn_valid(pfn));
+


On ARM, it's possible to encounter invalid pages. VM_BUG_ON is serious
overkill.


+   if (zone_pfn_same_memmap(pfn - count, pfn))
+   page += count;
+   else
+   page = pfn_to_page(pfn);
+   }
+
+   spin_unlock_irq(&zone->lock);
+
+   /* After this, pages in the range can be freed one by one */
+   count = pfn - start;
+   pfn = start;
+   for (page = pfn_to_page(pfn); count; --count) {
+   prep_new_page(page, 0, flag);
+   ++pfn;
+   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
+   ++page;
+   else
+   page = pfn_to_page(pfn);

Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Dave Hansen
On Tue, 2011-10-18 at 10:26 -0700, Michal Nazarewicz wrote:
  You can do this in a more general fashion by checking the
  zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
  That will not be SPARSEMEM specific.
 
 I've tried doing stuff that way but it ended up with much more code.

I guess instead of:

 +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
 +{
 +return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
 +}

You could do:

static inline bool zone_pfn_same_maxorder(unsigned long pfn1, unsigned long pfn2)
{
unsigned long mask = MAX_ORDER_NR_PAGES-1;
return (pfn1 & ~mask) == (pfn2 & ~mask);
}

I think that works.  Should be the same code you have now, basically.
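
Concretely, assuming the default MAX_ORDER of 11, i.e.
MAX_ORDER_NR_PAGES == 1024 and mask == 0x3ff:

	/* pfn1 = 1023, pfn2 = 1024:
	 *   1023 & ~0x3ff == 0x000, 1024 & ~0x3ff == 0x400  -> different
	 * pfn1 = 1024, pfn2 = 2047:
	 *   both & ~0x3ff == 0x400                           -> same block
	 */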

-- Dave



Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Michal Nazarewicz

On Tue, 18 Oct 2011 10:48:46 -0700, Dave Hansen d...@linux.vnet.ibm.com wrote:


On Tue, 2011-10-18 at 10:26 -0700, Michal Nazarewicz wrote:

 You can do this in a more general fashion by checking the
 zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
 That will not be SPARSEMEM specific.

I've tried doing stuff that way but it ended up with much more code.


I guess instead of:


+static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
+{
+return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
+}


You could do:

static inline bool zone_pfn_same_maxorder(unsigned long pfn1, unsigned long pfn2)
{
unsigned long mask = MAX_ORDER_NR_PAGES-1;
return (pfn1 & ~mask) == (pfn2 & ~mask);
}

I think that works.  Should be the same code you have now, basically.


Makes sense.  It'd require calling pfn_to_page() every MAX_ORDER_NR_PAGES even
in memory models that have linear mapping of struct page, but I guess that's
not that bad.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--


RE: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-17 Thread Marek Szyprowski
Hello Andrew,

Thanks for your comments. I will try to address them in the next round of
CMA patches.

On Saturday, October 15, 2011 1:30 AM Andrew Morton wrote:

(snipped)

  +
  +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
  +{
  +   struct page *page = pfn_to_page(pfn);
  +
  +   while (nr_pages--) {
  +   __free_page(page);
  +   ++pfn;
  +   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
  +   ++page;
  +   else
  +   page = pfn_to_page(pfn);
  +   }
  +}
 
 You're sure these functions don't need EXPORT_SYMBOL()?  Maybe the
 design is that only DMA core calls into here (if so, that's good).

Drivers should not call it; it is intended to be used by low-level DMA
code. Do you think that a comment about the missing EXPORT_SYMBOL is
required?

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center




Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-17 Thread Andrew Morton
On Mon, 17 Oct 2011 14:21:07 +0200
Marek Szyprowski m.szyprow...@samsung.com wrote:

   +
   +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
   +{
   + struct page *page = pfn_to_page(pfn);
   +
   + while (nr_pages--) {
   + __free_page(page);
   + ++pfn;
   + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
   + ++page;
   + else
   + page = pfn_to_page(pfn);
   + }
   +}
  
  You're sure these functions don't need EXPORT_SYMBOL()?  Maybe the
  design is that only DMA core calls into here (if so, that's good).
 
 Drivers should not call it, it is intended to be used by low-level DMA
 code.

OK, thanks for checking.

 Do you think that a comment about the missing EXPORT_SYMBOL is required?

No.  If someone later wants to use these from a module then we can look
at their reasons and make a decision at that time.



Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-16 Thread Michal Nazarewicz

On Sat, 15 Oct 2011 01:29:33 +0200, Andrew Morton a...@linux-foundation.org 
wrote:


On Thu, 06 Oct 2011 15:54:42 +0200
Marek Szyprowski m.szyprow...@samsung.com wrote:


From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com

This commit introduces alloc_contig_freed_pages() function


The freed seems redundant to me.  Wouldn't alloc_contig_pages be a
better name?


The “freed” is there because the function operates on pages that are in
the buddy system, ie. it is given a range of PFNs that are to be removed
from the buddy system.

There's also an alloc_contig_range() function (added by the next patch)
which frees pages in the given range and then calls
alloc_contig_freed_pages() to allocate them.

IMO, if there was an alloc_contig_pages() function, it would have to
be one level up (ie. it would figure out where to allocate memory and
then call alloc_contig_range()).  (That's really what CMA is doing).

Still, as I think of it now, maybe alloc_contig_free_range() would be
better?


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-16 Thread Andrew Morton
On Sun, 16 Oct 2011 10:01:36 +0200 Michal Nazarewicz min...@mina86.com 
wrote:

 Still, as I think of it now, maybe alloc_contig_free_range() would be
 better?

Nope.  Of *course* the pages were free.  Otherwise we couldn't
(re)allocate them.  I still think the free part is redundant.

What could be improved is the alloc part.  This really isn't an
allocation operation.  The pages are being removed from buddy then
moved into the free arena of a different memory manager from where they
will _later_ be allocated.

So we should move away from the alloc/free naming altogether for this
operation and think up new terms.  How about claim and release? 
claim_contig_pages, claim_contig_range, release_contig_pages, etc?
Or we could use take/return.

Also, if we have no expectation that anything apart from CMA will use
these interfaces (?), the names could/should be prefixed with cma_.



Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-16 Thread Michal Nazarewicz

On Sun, 16 Oct 2011 10:01:36 +0200 Michal Nazarewicz wrote:

Still, as I think of it now, maybe alloc_contig_free_range() would be
better?


On Sun, 16 Oct 2011 10:31:16 +0200, Andrew Morton wrote:

Nope.  Of *course* the pages were free.  Otherwise we couldn't
(re)allocate them.  I still think the free part is redundant.


Makes sense.


What could be improved is the alloc part.  This really isn't an
allocation operation.  The pages are being removed from buddy then
moved into the free arena of a different memory manager from where they
will _later_ be allocated.


Not quite.  After alloc_contig_range() returns, the pages are passed with
no further processing to the caller.  Ie. the area is not later split into
several parts nor kept in CMA's pool unused.

alloc_contig_freed_pages() is a little different since it must be called on
a buddy page boundary and may return more than requested (because of the way
the buddy system merges buddies) so there is a little processing after it
returns (namely freeing of the excess pages).
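
A hypothetical caller pattern for that, using the signatures from this
patch:

	unsigned long last = alloc_contig_freed_pages(start, end, GFP_KERNEL);

	/* Buddy merging may have taken us past 'end'; free the excess */
	if (last > end)
		free_contig_pages(end, last - end);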


So we should move away from the alloc/free naming altogether for this
operation and think up new terms.  How about claim and release?
claim_contig_pages, claim_contig_range, release_contig_pages, etc?
Or we could use take/return.


Personally, I'm not convinced about changing the names of alloc_contig_range()
and free_contig_pages() but I see merit in changing alloc_contig_freed_pages()
to something else.

Since at the moment, it's used only by alloc_contig_range(), I'd lean
towards removing it from page-isolation.h, marking it static and renaming
to __alloc_contig_range().


Also, if we have no expectation that anything apart from CMA will use
these interfaces (?), the names could/should be prefixed with cma_.


In Kamezawa's original patchset, he used those for a somewhat different
approach (IIRC, Kamezawa's patchset introduced a function that scanned memory
and tried to allocate contiguous memory where it could), so I can imagine that
someone will make use of those functions.  It may be used in any situation
where a range of pages is either free (ie. in buddy system) or movable and
one wants to allocate them for some reason.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michal mina86 Nazarewicz(o o)
ooo +--min...@mina86.com---min...@jabber.org---ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-14 Thread Andrew Morton
On Thu, 06 Oct 2011 15:54:42 +0200
Marek Szyprowski m.szyprow...@samsung.com wrote:

 From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
 
 This commit introduces alloc_contig_freed_pages() function

The freed seems redundant to me.  Wouldn't alloc_contig_pages be a
better name?

 which allocates (ie. removes from buddy system) free pages
 in range.  Caller has to guarantee that all pages in range
 are in buddy system.
 
 Along with this function, a free_contig_pages() function is
 provided which frees all (or a subset of) pages allocated
 with alloc_contig_freed_pages().
 
 Michal Nazarewicz has modified the function to make it easier
 to allocate pages that are not MAX_ORDER_NR_PAGES-aligned by making it
 return the pfn of one-past-the-last allocated page.
 

 ...

 +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 +/*
 + * Both PFNs must be from the same zone!  If this function returns
 + * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
 + */
 +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
 +{
 + return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
 +}
 +
 +#else
 +
 +#define zone_pfn_same_memmap(pfn1, pfn2) (true)

Do this in C, please.  It's nicer and can prevent unused-var warnings.

 +#endif
 +

 ...

 +unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
 +gfp_t flag)
 +{
 + unsigned long pfn = start, count;
 + struct page *page;
 + struct zone *zone;
 + int order;
 +
 + VM_BUG_ON(!pfn_valid(start));
 + page = pfn_to_page(start);
 + zone = page_zone(page);
 +
 + spin_lock_irq(&zone->lock);
 +
 + for (;;) {
 + VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
 +   page_zone(page) != zone);
 +
 + list_del(&page->lru);
 + order = page_order(page);
 + count = 1UL << order;
 + zone->free_area[order].nr_free--;
 + rmv_page_order(page);
 + __mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);

__mod_zone_page_state() generally shouldn't be used - it bypasses the
per-cpu magazines and can introduce high lock contention.

That's hopefully not an issue on this callpath but it is still a red
flag.  I'd suggest at least the addition of a suitably apologetic code
comment here - we don't want people to naively copy this code.

Plus such a comment would let me know why this was done ;)
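
A sketch of what such a comment might say (wording hypothetical, the
justification taken from this thread):

	/*
	 * XXX: calling __mod_zone_page_state() directly is normally
	 * discouraged as it bypasses the per-cpu magazines; it is
	 * tolerated here because this path is rare and runs under
	 * zone->lock.  Do not copy this pattern into hot paths.
	 */
	__mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);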

 + pfn += count;
 + if (pfn >= end)
 + break;
 + VM_BUG_ON(!pfn_valid(pfn));
 +
 + if (zone_pfn_same_memmap(pfn - count, pfn))
 + page += count;
 + else
 + page = pfn_to_page(pfn);
 + }
 +
 + spin_unlock_irq(&zone->lock);
 +
 + /* After this, pages in the range can be freed one by one */
 + count = pfn - start;
 + pfn = start;
 + for (page = pfn_to_page(pfn); count; --count) {
 + prep_new_page(page, 0, flag);
 + ++pfn;
 + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
 + ++page;
 + else
 + page = pfn_to_page(pfn);
 + }
 +
 + return pfn;
 +}
 +
 +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
 +{
 + struct page *page = pfn_to_page(pfn);
 +
 + while (nr_pages--) {
 + __free_page(page);
 + ++pfn;
 + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
 + ++page;
 + else
 + page = pfn_to_page(pfn);
 + }
 +}

You're sure these functions don't need EXPORT_SYMBOL()?  Maybe the
design is that only DMA core calls into here (if so, that's good).


  #ifdef CONFIG_MEMORY_HOTREMOVE
  /*
   * All pages in the range must be isolated before calling this.
 -- 
 1.7.1.569.g6f426


[PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-06 Thread Marek Szyprowski
From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com

This commit introduces alloc_contig_freed_pages() function
which allocates (ie. removes from buddy system) free pages
in range.  Caller has to guarantee that all pages in range
are in buddy system.

Along with this function, a free_contig_pages() function is
provided which frees all (or a subset of) pages allocated
with alloc_contig_freed_pages().

Michal Nazarewicz has modified the function to make it easier
to allocate pages that are not MAX_ORDER_NR_PAGES-aligned by making it
return the pfn of one-past-the-last allocated page.

Signed-off-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
Signed-off-by: Michal Nazarewicz m.nazarew...@samsung.com
[m.nazarewicz: added checks if all allocated pages comes from the
same memory zone]
Signed-off-by: Michal Nazarewicz min...@mina86.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
[m.szyprowski: fixed wrong condition in VM_BUG_ON assert]
Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
Acked-by: Arnd Bergmann a...@arndb.de
---
 include/linux/mmzone.h |   16 +
 include/linux/page-isolation.h |5 +++
 mm/page_alloc.c|   67 
 3 files changed, 88 insertions(+), 0 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index a2760bb..862a834 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1168,6 +1168,22 @@ static inline int memmap_valid_within(unsigned long pfn,
 }
 #endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
 
+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+/*
+ * Both PFNs must be from the same zone!  If this function returns
+ * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
+ */
+static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
+{
+   return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
+}
+
+#else
+
+#define zone_pfn_same_memmap(pfn1, pfn2) (true)
+
+#endif
+
 #endif /* !__GENERATING_BOUNDS.H */
 #endif /* !__ASSEMBLY__ */
 #endif /* _LINUX_MMZONE_H */
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 58cdbac..b9fc428 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -33,6 +33,11 @@ test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
 extern int set_migratetype_isolate(struct page *page);
 extern void unset_migratetype_isolate(struct page *page);
 
+/* The below functions must be run on a range from a single zone. */
+extern unsigned long alloc_contig_freed_pages(unsigned long start,
+ unsigned long end, gfp_t flag);
+extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
+
 /*
  * For migration.
  */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index bf4399a..fbfb920 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5706,6 +5706,73 @@ out:
spin_unlock_irqrestore(&zone->lock, flags);
 }
 
+unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
+  gfp_t flag)
+{
+   unsigned long pfn = start, count;
+   struct page *page;
+   struct zone *zone;
+   int order;
+
+   VM_BUG_ON(!pfn_valid(start));
+   page = pfn_to_page(start);
+   zone = page_zone(page);
+
+   spin_lock_irq(&zone->lock);
+
+   for (;;) {
+   VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
+ page_zone(page) != zone);
+
+   list_del(&page->lru);
+   order = page_order(page);
+   count = 1UL << order;
+   zone->free_area[order].nr_free--;
+   rmv_page_order(page);
+   __mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+
+   pfn += count;
+   if (pfn >= end)
+   break;
+   VM_BUG_ON(!pfn_valid(pfn));
+
+   if (zone_pfn_same_memmap(pfn - count, pfn))
+   page += count;
+   else
+   page = pfn_to_page(pfn);
+   }
+
+   spin_unlock_irq(&zone->lock);
+
+   /* After this, pages in the range can be freed one by one */
+   count = pfn - start;
+   pfn = start;
+   for (page = pfn_to_page(pfn); count; --count) {
+   prep_new_page(page, 0, flag);
+   ++pfn;
+   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
+   ++page;
+   else
+   page = pfn_to_page(pfn);
+   }
+
+   return pfn;
+}
+
+void free_contig_pages(unsigned long pfn, unsigned nr_pages)
+{
+   struct page *page = pfn_to_page(pfn);
+
+   while (nr_pages--) {
+   __free_page(page);
+   ++pfn;
+   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
+   ++page;
+   else
+   page = pfn_to_page(pfn);
+   }
+}
+
 #ifdef CONFIG_MEMORY_HOTREMOVE

[PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-08-12 Thread Marek Szyprowski
From: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com

This commit introduces alloc_contig_freed_pages() function
which allocates (ie. removes from buddy system) free pages
in range.  Caller has to guarantee that all pages in range
are in buddy system.

Along with this function, a free_contig_pages() function is
provided which frees all (or a subset of) pages allocated
with alloc_contig_freed_pages().

Michal Nazarewicz has modified the function to make it easier
to allocate pages that are not MAX_ORDER_NR_PAGES-aligned by making it
return the pfn of one-past-the-last allocated page.

Signed-off-by: KAMEZAWA Hiroyuki kamezawa.hir...@jp.fujitsu.com
Signed-off-by: Michal Nazarewicz m.nazarew...@samsung.com
Signed-off-by: Kyungmin Park kyungmin.p...@samsung.com
Signed-off-by: Marek Szyprowski m.szyprow...@samsung.com
CC: Michal Nazarewicz min...@mina86.com
Acked-by: Arnd Bergmann a...@arndb.de
---
 include/linux/page-isolation.h |3 ++
 mm/page_alloc.c|   44 
 2 files changed, 47 insertions(+), 0 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 58cdbac..f1417ed 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -32,6 +32,9 @@ test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
  */
 extern int set_migratetype_isolate(struct page *page);
 extern void unset_migratetype_isolate(struct page *page);
+extern unsigned long alloc_contig_freed_pages(unsigned long start,
+ unsigned long end, gfp_t flag);
+extern void free_contig_pages(struct page *page, int nr_pages);
 
 /*
  * For migration.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6e8ecb6..ad6ae3f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5668,6 +5668,50 @@ out:
spin_unlock_irqrestore(&zone->lock, flags);
 }
 
+unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
+  gfp_t flag)
+{
+   unsigned long pfn = start, count;
+   struct page *page;
+   struct zone *zone;
+   int order;
+
+   VM_BUG_ON(!pfn_valid(start));
+   zone = page_zone(pfn_to_page(start));
+
+   spin_lock_irq(&zone->lock);
+
+   page = pfn_to_page(pfn);
+   for (;;) {
+   VM_BUG_ON(page_count(page) || !PageBuddy(page));
+   list_del(&page->lru);
+   order = page_order(page);
+   zone->free_area[order].nr_free--;
+   rmv_page_order(page);
+   __mod_zone_page_state(zone, NR_FREE_PAGES, -(1UL << order));
+   pfn += 1 << order;
+   if (pfn >= end)
+   break;
+   VM_BUG_ON(!pfn_valid(pfn));
+   page += 1 << order;
+   }
+
+   spin_unlock_irq(&zone->lock);
+
+   /* After this, pages in the range can be freed one by one */
+   page = pfn_to_page(start);
+   for (count = pfn - start; count; --count, ++page)
+   prep_new_page(page, 0, flag);
+
+   return pfn;
+}
+
+void free_contig_pages(struct page *page, int nr_pages)
+{
+   for (; nr_pages; --nr_pages, ++page)
+   __free_page(page);
+}
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
  * All pages in the range must be isolated before calling this.
-- 
1.7.1.569.g6f426
