Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-11-01 Thread Mel Gorman
On Tue, Nov 01, 2011 at 07:06:56PM +0100, Michal Nazarewicz wrote:
> >page_isolation.c may also be a better fit than page_alloc.c
> 
> Since isolate_freepages_block() is the only user of split_free_page(),
> would it make sense to move split_free_page() to page_isolation.c as
> well?  I sort of like the idea of making it static and removing from
> header file.
> 

I see no problem with that. It'll be separate from split_page() but that
is not earth shattering.

Thanks.

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-11-01 Thread Michal Nazarewicz

On Tue, 01 Nov 2011 16:04:48 +0100, Mel Gorman  wrote:

For the purposes of review, have a separate patch for moving
isolate_freepages_block to another file that does not alter the
function in any way. When the function is updated in a follow-on patch,
it'll be far easier to see what has changed.


Will do.


page_isolation.c may also be a better fit than page_alloc.c


Since isolate_freepages_block() is the only user of split_free_page(),
would it make sense to move split_free_page() to page_isolation.c as
well?  I sort of like the idea of making it static and removing from
header file.


I confess I didn't read closely because of the mess in page_alloc.c but
the intent seems fine.


No worries.  I just needed a quick comment on whether I'm headed in the
right direction. :)


Hopefully there will be a new version of CMA posted that will be easier
to review.


I'll try and create the code no later than the weekend, so hopefully
the new version will be sent next week.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-11-01 Thread Mel Gorman
On Sun, Oct 23, 2011 at 09:05:05PM -0700, Michal Nazarewicz wrote:
> > On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
> >> This commit introduces alloc_contig_freed_pages() function
> >> which allocates (ie. removes from buddy system) free pages
> >> in range. Caller has to guarantee that all pages in range
> >> are in buddy system.
> 
> On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman  wrote:
> > Straight away, I'm wondering why you didn't use
> > mm/compaction.c#isolate_freepages()
> 
> Does the below look like a step in the right direction?
> 
> It basically moves isolate_freepages_block() to page_alloc.c (changing

For the purposes of review, have a separate patch for moving
isolate_freepages_block to another file that does not alter the
function in any way. When the function is updated in a follow-on patch,
it'll be far easier to see what has changed.

page_isolation.c may also be a better fit than page_alloc.c

As it is, the patch for isolate_freepages_block is almost impossible
to read because it's getting munged with existing code that is already
in page_alloc.c . About all I caught from it is that scannedp does
not have a type. It defaults to unsigned int but it's unnecessarily
obscure.

> its name to isolate_freepages_range()) and changes it so that depending
> on arguments it treats holes (either invalid PFN or non-free page) as
> errors so that CMA can use it.
> 

I haven't actually read the function because it's too badly mixed with
page_alloc.c code but this description fits what I'm looking for.

> It also accepts a range rather than just assuming a single pageblock
> thus the change moves range calculation in compaction.c from
> isolate_freepages_block() up to isolate_freepages().
> 
> The change also modifies split_free_page() so that it does not try to
> change pageblock's migrate type if current migrate type is ISOLATE or
> CMA.
> 

This is fine. Later, the flags that determine what happens to pageblocks
may be placed in compact_control.

> ---
>  include/linux/mm.h |1 -
>  include/linux/page-isolation.h |4 +-
>  mm/compaction.c|   73 +++
>  mm/internal.h  |5 ++
>  mm/page_alloc.c|  128 +---
>  5 files changed, 95 insertions(+), 116 deletions(-)
> 

I confess I didn't read closely because of the mess in page_alloc.c but
the intent seems fine. Hopefully there will be a new version of CMA
posted that will be easier to review.

Thanks

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-23 Thread Michal Nazarewicz
> On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>> This commit introduces alloc_contig_freed_pages() function
>> which allocates (ie. removes from buddy system) free pages
>> in range. Caller has to guarantee that all pages in range
>> are in buddy system.

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman  wrote:
> Straight away, I'm wondering why you didn't use
> mm/compaction.c#isolate_freepages()

Does the below look like a step in the right direction?

It basically moves isolate_freepages_block() to page_alloc.c (changing
its name to isolate_freepages_range()) and changes it so that depending
on arguments it treats holes (either invalid PFN or non-free page) as
errors so that CMA can use it.

It also accepts a range rather than just assuming a single pageblock
thus the change moves range calculation in compaction.c from
isolate_freepages_block() up to isolate_freepages().

The change also modifies split_free_page() so that it does not try to
change pageblock's migrate type if current migrate type is ISOLATE or
CMA.
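The change described here can be sketched as follows (a minimal userspace model; the enum values and the helper name are illustrative stand-ins, not the kernel's definitions):

```c
/* Model of the described change: when splitting a free page, only
 * convert the surrounding pageblock to MIGRATE_MOVABLE if its current
 * migrate type is neither ISOLATE nor CMA.  Everything here is an
 * illustrative stand-in, not a kernel definition. */
enum migratetype {
    MIGRATE_UNMOVABLE,
    MIGRATE_MOVABLE,
    MIGRATE_ISOLATE,
    MIGRATE_CMA,
};

static enum migratetype pageblock_type;  /* stand-in for pageblock state */

static void split_free_page_set_type(void)
{
    /* Leave ISOLATE and CMA pageblocks untouched so that isolation
     * and CMA's bookkeeping survive the split. */
    if (pageblock_type != MIGRATE_ISOLATE &&
        pageblock_type != MIGRATE_CMA)
        pageblock_type = MIGRATE_MOVABLE;
}
```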

---
 include/linux/mm.h |1 -
 include/linux/page-isolation.h |4 +-
 mm/compaction.c|   73 +++
 mm/internal.h  |5 ++
 mm/page_alloc.c|  128 +---
 5 files changed, 95 insertions(+), 116 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index fd599f4..98c99c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -435,7 +435,6 @@ void put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 003c52f..6becc74 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -48,10 +48,8 @@ static inline void unset_migratetype_isolate(struct page *page)
 }
 
 /* The below functions must be run on a range from a single zone. */
-extern unsigned long alloc_contig_freed_pages(unsigned long start,
- unsigned long end, gfp_t flag);
 extern int alloc_contig_range(unsigned long start, unsigned long end,
- gfp_t flags, unsigned migratetype);
+ unsigned migratetype);
 extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
 
 /*
diff --git a/mm/compaction.c b/mm/compaction.c
index 9e5cc59..685a19e 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -58,77 +58,15 @@ static unsigned long release_freepages(struct list_head *freelist)
return count;
 }
 
-/* Isolate free pages onto a private freelist. Must hold zone->lock */
-static unsigned long isolate_freepages_block(struct zone *zone,
-   unsigned long blockpfn,
-   struct list_head *freelist)
-{
-   unsigned long zone_end_pfn, end_pfn;
-   int nr_scanned = 0, total_isolated = 0;
-   struct page *cursor;
-
-   /* Get the last PFN we should scan for free pages at */
-   zone_end_pfn = zone->zone_start_pfn + zone->spanned_pages;
-   end_pfn = min(blockpfn + pageblock_nr_pages, zone_end_pfn);
-
-   /* Find the first usable PFN in the block to initialse page cursor */
-   for (; blockpfn < end_pfn; blockpfn++) {
-   if (pfn_valid_within(blockpfn))
-   break;
-   }
-   cursor = pfn_to_page(blockpfn);
-
-   /* Isolate free pages. This assumes the block is valid */
-   for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-   int isolated, i;
-   struct page *page = cursor;
-
-   if (!pfn_valid_within(blockpfn))
-   continue;
-   nr_scanned++;
-
-   if (!PageBuddy(page))
-   continue;
-
-   /* Found a free page, break it into order-0 pages */
-   isolated = split_free_page(page);
-   total_isolated += isolated;
-   for (i = 0; i < isolated; i++) {
-   list_add(&page->lru, freelist);
-   page++;
-   }
-
-   /* If a page was split, advance to the end of it */
-   if (isolated) {
-   blockpfn += isolated - 1;
-   cursor += isolated - 1;
-   }
-   }
-
-   trace_mm_compaction_isolate_freepages(nr_scanned, total_isolated);
-   return total_isolated;
-}
-
 /* Returns true if the page is within a block suitable for migration to */
 static bool suitable_migration_target(struct page *page)
 {
-
int migratetype = get_pageblock_migratetype(page);
 
	/* Don't interfere with memory hot-remove or the min_free_kbytes blocks */
if (migratetype == MIGRATE_ISOLATE || migratetype =

Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-23 Thread Michal Nazarewicz

On Fri, 21 Oct 2011 03:06:24 -0700, Mel Gorman  wrote:


On Tue, Oct 18, 2011 at 10:26:37AM -0700, Michal Nazarewicz wrote:

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman  wrote:

>At this point, I'm going to apologise for not reviewing this a long long
>time ago.
>
>On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
>>From: KAMEZAWA Hiroyuki 
>>
>>This commit introduces alloc_contig_freed_pages() function
>>which allocates (ie. removes from buddy system) free pages
>>in range. Caller has to guarantee that all pages in range
>>are in buddy system.
>>
>
>Straight away, I'm wondering why you didn't use
>
>mm/compaction.c#isolate_freepages()
>
>It knows how to isolate pages within ranges. All its control information
>is passed via struct compact_control() which I recognise may be awkward
>for CMA but compaction.c knows how to manage all the isolated pages and
>pass them to migrate.c appropriately.

It is something to consider.  At first glance, I see that isolate_freepages
seems to operate on pageblocks, which is not desired for CMA.



isolate_freepages_block operates on a range of pages that happens to be
hard-coded to be a pageblock because that was the requirement. It calculates
end_pfn and it is possible to make that a function parameter.


Yes, this seems doable.  I'll try and rewrite the patches to use it.

The biggest difference is in how CMA and compaction treat pages which are not
free.  CMA treats them as an error while compaction just skips them.  This is
solvable by an argument though.
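The "solvable by an argument" point can be sketched like this (userspace model; the function name, the `strict` flag, and the free-map representation are all illustrative, not from the patches):

```c
#include <stddef.h>

/* One scanner serving both callers: in strict mode (CMA) any non-free
 * page is an error; in lenient mode (compaction) it is simply skipped.
 * free_map[i] == 1 stands in for "page i is free in the buddy system". */
static long scan_range(const int *free_map, size_t n, int strict)
{
    long isolated = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!free_map[i]) {
            if (strict)
                return -1;  /* CMA: a hole fails the whole range */
            continue;       /* compaction: skip and carry on */
        }
        isolated++;
    }
    return isolated;
}
```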

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-21 Thread Mel Gorman
On Tue, Oct 18, 2011 at 10:26:37AM -0700, Michal Nazarewicz wrote:
> On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman  wrote:
> 
> >At this point, I'm going to apologise for not reviewing this a long long
> >time ago.
> >
> >On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
> >>From: KAMEZAWA Hiroyuki 
> >>
> >>This commit introduces alloc_contig_freed_pages() function
> >>which allocates (ie. removes from buddy system) free pages
> >>in range. Caller has to guarantee that all pages in range
> >>are in buddy system.
> >>
> >
> >Straight away, I'm wondering why you didn't use
> >
> >mm/compaction.c#isolate_freepages()
> >
> >It knows how to isolate pages within ranges. All its control information
> >is passed via struct compact_control() which I recognise may be awkward
> >for CMA but compaction.c knows how to manage all the isolated pages and
> >pass them to migrate.c appropriately.
> 
> It is something to consider.  At first glance, I see that isolate_freepages
> seems to operate on pageblocks, which is not desired for CMA.
> 

isolate_freepages_block operates on a range of pages that happens to be
hard-coded to be a pageblock because that was the requirement. It calculates
end_pfn and it is possible to make that a function parameter.

> >I haven't read all the patches yet but isolate_freepages() does break
> >everything up into order-0 pages. This may not be to your liking but it
> >would not be possible to change.
> 
> Splitting everything into order-0 pages is desired behaviour.
> 

Great.

> >>Along with this function, a free_contig_pages() function is
> >>provided which frees all (or a subset of) pages allocated
> >>with alloc_contig_free_pages().
> 
> >mm/compaction.c#release_freepages()
> 
> It sort of does the same thing but release_freepages() assumes that pages
> that are being freed are non-contiguous and that they need to be on the lru list.
> With free_contig_pages(), we can assume all pages are contiguous.
> 

Ok, I jumped the gun here. release_freepages() may not be a perfect fit.
release_freepages() is also used when finishing compaction, whereas it
is a real free function that is required here.

> >You can do this in a more general fashion by checking the
> >zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
> >That will not be SPARSEMEM specific.
> 
> I've tried doing stuff that way but it ended up with much more code.
> 
> Dave suggested the above function to check if pointer arithmetic is valid.
> 
> Please see also .
> 

Ok, I'm still not fully convinced but I confess I'm not thinking about this
particular function too deeply because I am expecting the problem would
go away if compaction and CMA shared common code for freeing contiguous
regions via page migration.

> >> 
> >>+   if (zone_pfn_same_memmap(pfn - count, pfn))
> >>+   page += count;
> >>+   else
> >>+   page = pfn_to_page(pfn);
> >>+   }
> >>+
> >>+   spin_unlock_irq(&zone->lock);
> >>+
> >>+   /* After this, pages in the range can be freed one be one */
> >>+   count = pfn - start;
> >>+   pfn = start;
> >>+   for (page = pfn_to_page(pfn); count; --count) {
> >>+   prep_new_page(page, 0, flag);
> >>+   ++pfn;
> >>+   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> >>+   ++page;
> >>+   else
> >>+   page = pfn_to_page(pfn);
> >>+   }
> >>+
> >
> >Here it looks like you have implemented something like split_free_page().
> 
> split_free_page() takes a single page, removes it from buddy system, and 
> finally
> splits it. 

I'm referring to just this chunk.

split_free_page takes a page, checks the watermarks and performs similar
operations to prep_new_page(). There should be no need to introduce a
new similar function. split_free_page() does affect the pageblock
migratetype and that is undesirable but that part could be taken out and
moved to compaction.c if necessary.

On the watermarks thing, CMA does not pay much attention to them. I have
a strong feeling that it is easy to deadlock a system by using CMA while
under memory pressure. Compaction had the same problem early in its
development FWIW. This is partially why I'd prefer to see CMA and
compaction sharing as much code as possible because compaction gets
continual testing.
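The watermark concern can be made concrete with a toy check (a sketch only; real code would consult zone_watermark_ok() against the zone's actual counters, and the name and parameters here are made up):

```c
/* Before pulling `want` pages out of the allocator, verify that the
 * zone would stay above its low watermark -- the check CMA was said to
 * be missing.  Plain longs stand in for zone state. */
static int enough_headroom(long free_pages, long low_wmark, long want)
{
    return free_pages - want >= low_wmark;
}
```

Without such a gate, a large contiguous allocation under memory pressure can push a zone below its reserves, which is the deadlock scenario described above.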

-- 
Mel Gorman
SUSE Labs


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Michal Nazarewicz

On Tue, 18 Oct 2011 10:48:46 -0700, Dave Hansen  wrote:


On Tue, 2011-10-18 at 10:26 -0700, Michal Nazarewicz wrote:

> You can do this in a more general fashion by checking the
> zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
> That will not be SPARSEMEM specific.

I've tried doing stuff that way but it ended up with much more code.


I guess instead of:


+static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
+{
+return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
+}


You could do:

static inline bool zone_pfn_same_maxorder(unsigned long pfn1, unsigned long pfn2)
{
unsigned long mask = MAX_ORDER_NR_PAGES-1;
return (pfn1 & mask) == (pfn2 & mask);
}

I think that works.  Should be the same code you have now, basically.


Makes sense.  It'd require calling pfn_to_page() every MAX_ORDER_NR_PAGES even
in memory models that have linear mapping of struct page, but I guess that's
not that bad.
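Mel's MAX_ORDER-boundary alternative might look roughly like this (userspace model with a flat fake memmap and an illustrative block size; demo_pfn_to_page() stands in for the real pfn_to_page()):

```c
#define DEMO_MAX_ORDER_NR_PAGES 1024UL  /* illustrative, not the kernel's value */

struct demo_page { int touched; };

/* Stand-in for pfn_to_page(); with a flat memmap it is just indexing. */
static struct demo_page *demo_pfn_to_page(struct demo_page *memmap,
                                          unsigned long pfn)
{
    return &memmap[pfn];
}

/* Walk [start, end), re-resolving the struct page pointer only when a
 * MAX_ORDER-sized boundary is crossed, instead of testing sections. */
static void walk_range(struct demo_page *memmap,
                       unsigned long start, unsigned long end)
{
    struct demo_page *page = demo_pfn_to_page(memmap, start);
    unsigned long pfn;

    for (pfn = start; pfn < end; pfn++) {
        if (pfn != start && (pfn & (DEMO_MAX_ORDER_NR_PAGES - 1)) == 0)
            page = demo_pfn_to_page(memmap, pfn);  /* crossed a block */
        page->touched = 1;
        page++;
    }
}
```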

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Dave Hansen
On Tue, 2011-10-18 at 10:26 -0700, Michal Nazarewicz wrote:
> > You can do this in a more general fashion by checking the
> > zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
> > That will not be SPARSEMEM specific.
> 
> I've tried doing stuff that way but it ended up with much more code.

I guess instead of:

>> +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
>> +{
>> +return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
>> +}

You could do:

static inline bool zone_pfn_same_maxorder(unsigned long pfn1, unsigned long pfn2)
{
unsigned long mask = MAX_ORDER_NR_PAGES-1;
return (pfn1 & mask) == (pfn2 & mask);
}

I think that works.  Should be the same code you have now, basically.
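One caveat about the snippet above (an observation, using an illustrative block size): masking with MAX_ORDER_NR_PAGES-1 keeps a PFN's offset *within* its block, so comparing the masked values tests for equal offsets. Testing that two PFNs share a MAX_ORDER block means comparing the bits the mask removes:

```c
#define DEMO_MAX_ORDER_NR_PAGES 1024UL  /* illustrative value */

static int zone_pfn_same_maxorder(unsigned long pfn1, unsigned long pfn2)
{
    unsigned long mask = DEMO_MAX_ORDER_NR_PAGES - 1;

    /* Same MAX_ORDER block <=> same high bits (offset bits masked off) */
    return (pfn1 & ~mask) == (pfn2 & ~mask);
}
```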

-- Dave



Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Michal Nazarewicz

On Tue, 18 Oct 2011 05:21:09 -0700, Mel Gorman  wrote:


At this point, I'm going to apologise for not reviewing this a long long
time ago.

On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:

From: KAMEZAWA Hiroyuki 

This commit introduces alloc_contig_freed_pages() function
which allocates (ie. removes from buddy system) free pages
in range. Caller has to guarantee that all pages in range
are in buddy system.



Straight away, I'm wondering why you didn't use

mm/compaction.c#isolate_freepages()

It knows how to isolate pages within ranges. All its control information
is passed via struct compact_control() which I recognise may be awkward
for CMA but compaction.c knows how to manage all the isolated pages and
pass them to migrate.c appropriately.


It is something to consider.  At first glance, I see that isolate_freepages
seems to operate on pageblocks, which is not desired for CMA.


I haven't read all the patches yet but isolate_freepages() does break
everything up into order-0 pages. This may not be to your liking but it
would not be possible to change.


Splitting everything into order-0 pages is desired behaviour.


Along with this function, a free_contig_pages() function is
provided which frees all (or a subset of) pages allocated
with alloc_contig_free_pages().



mm/compaction.c#release_freepages()


It sort of does the same thing but release_freepages() assumes that pages
that are being freed are non-contiguous and that they need to be on the lru list.
With free_contig_pages(), we can assume all pages are contiguous.


+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+/*
+ * Both PFNs must be from the same zone!  If this function returns
+ * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
+ */
+static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
+{
+   return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
+}
+


Why do you care what section the page is in? The zone is important all
right, but not the section. Also, offhand I'm unsure if being in the
same section guarantees the same zone. sections are ordinarily fully
populated (except on ARM but hey) but I can't remember anything
enforcing that zones be section-aligned.

Later I think I see that the intention was to reduce the use of
pfn_to_page().


That is correct.


You can do this in a more general fashion by checking the
zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
That will not be SPARSEMEM specific.


I've tried doing stuff that way but it ended up with much more code.

Dave suggested the above function to check if pointer arithmetic is valid.

Please see also .




+#else
+
+#define zone_pfn_same_memmap(pfn1, pfn2) (true)
+
+#endif
+
 #endif /* !__GENERATING_BOUNDS.H */
 #endif /* !__ASSEMBLY__ */
 #endif /* _LINUX_MMZONE_H */




@@ -5706,6 +5706,73 @@ out:
spin_unlock_irqrestore(&zone->lock, flags);
 }

+unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
+  gfp_t flag)
+{
+   unsigned long pfn = start, count;
+   struct page *page;
+   struct zone *zone;
+   int order;
+
+   VM_BUG_ON(!pfn_valid(start));


VM_BUG_ON seems very harsh here. WARN_ON_ONCE and returning 0 to the
caller seems reasonable.


+   page = pfn_to_page(start);
+   zone = page_zone(page);
+
+   spin_lock_irq(&zone->lock);
+
+   for (;;) {
+   VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
+ page_zone(page) != zone);
+


Here you will VM_BUG_ON with the zone lock held leading to system
halting very shortly.


+   list_del(&page->lru);
+   order = page_order(page);
+   count = 1UL << order;
+   zone->free_area[order].nr_free--;
+   rmv_page_order(page);
+   __mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);
+


The callers need to check in advance if watermarks are sufficient for
this. In compaction, it happens in compaction_suitable() because it only
needed to be checked once. Your requirements might be different.


+   pfn += count;
+   if (pfn >= end)
+   break;
+   VM_BUG_ON(!pfn_valid(pfn));
+


On ARM, it's possible to encounter invalid pages. VM_BUG_ON is serious
overkill.


+   if (zone_pfn_same_memmap(pfn - count, pfn))
+   page += count;
+   else
+   page = pfn_to_page(pfn);
+   }
+
+   spin_unlock_irq(&zone->lock);
+
+   /* After this, pages in the range can be freed one be one */
+   count = pfn - start;
+   pfn = start;
+   for (page = pfn_to_page(pfn); count; --count) {
+   prep_new_page(page, 0, flag);
+   ++pfn;
+   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
+   ++page;
+   else

Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-18 Thread Mel Gorman
At this point, I'm going to apologise for not reviewing this a long long
time ago.

On Thu, Oct 06, 2011 at 03:54:42PM +0200, Marek Szyprowski wrote:
> From: KAMEZAWA Hiroyuki 
> 
> This commit introduces alloc_contig_freed_pages() function
> which allocates (ie. removes from buddy system) free pages
> in range. Caller has to guarantee that all pages in range
> are in buddy system.
> 

Straight away, I'm wondering why you didn't use

mm/compaction.c#isolate_freepages()

It knows how to isolate pages within ranges. All its control information
is passed via struct compact_control() which I recognise may be awkward
for CMA but compaction.c knows how to manage all the isolated pages and
pass them to migrate.c appropriately.

I haven't read all the patches yet but isolate_freepages() does break
everything up into order-0 pages. This may not be to your liking but it
would not be possible to change.

> Along with this function, a free_contig_pages() function is
> provided which frees all (or a subset of) pages allocated
> with alloc_contig_free_pages().
> 

mm/compaction.c#release_freepages()

> Michal Nazarewicz has modified the function to make it easier
> to allocate not MAX_ORDER_NR_PAGES aligned pages by making it
> return pfn of one-past-the-last allocated page.
> 
> Signed-off-by: KAMEZAWA Hiroyuki 
> Signed-off-by: Michal Nazarewicz 
> [m.nazarewicz: added checks if all allocated pages comes from the
> same memory zone]
> Signed-off-by: Michal Nazarewicz 
> Signed-off-by: Kyungmin Park 
> [m.szyprowski: fixed wrong condition in VM_BUG_ON assert]
> Signed-off-by: Marek Szyprowski 
> Acked-by: Arnd Bergmann 
> ---
>  include/linux/mmzone.h |   16 +
>  include/linux/page-isolation.h |5 +++
>  mm/page_alloc.c|   67 
> 
>  3 files changed, 88 insertions(+), 0 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index a2760bb..862a834 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1168,6 +1168,22 @@ static inline int memmap_valid_within(unsigned long pfn,
>  }
>  #endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
>  
> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> +/*
> + * Both PFNs must be from the same zone!  If this function returns
> + * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
> + */
> +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
> +{
> + return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
> +}
> +

Why do you care what section the page is in? The zone is important all
right, but not the section. Also, offhand I'm unsure if being in the
same section guarantees the same zone. sections are ordinarily fully
populated (except on ARM but hey) but I can't remember anything
enforcing that zones be section-aligned.

Later I think I see that the intention was to reduce the use of
pfn_to_page(). You can do this in a more general fashion by checking the
zone boundaries and resolving the pfn->page every MAX_ORDER_NR_PAGES.
That will not be SPARSEMEM specific.

> +#else
> +
> +#define zone_pfn_same_memmap(pfn1, pfn2) (true)
> +
> +#endif
> +
>  #endif /* !__GENERATING_BOUNDS.H */
>  #endif /* !__ASSEMBLY__ */
>  #endif /* _LINUX_MMZONE_H */
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 58cdbac..b9fc428 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -33,6 +33,11 @@ test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn);
>  extern int set_migratetype_isolate(struct page *page);
>  extern void unset_migratetype_isolate(struct page *page);
>  
> +/* The below functions must be run on a range from a single zone. */
> +extern unsigned long alloc_contig_freed_pages(unsigned long start,
> +   unsigned long end, gfp_t flag);
> +extern void free_contig_pages(unsigned long pfn, unsigned nr_pages);
> +
>  /*
>   * For migration.
>   */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index bf4399a..fbfb920 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -5706,6 +5706,73 @@ out:
>   spin_unlock_irqrestore(&zone->lock, flags);
>  }
>  
> +unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
> +gfp_t flag)
> +{
> + unsigned long pfn = start, count;
> + struct page *page;
> + struct zone *zone;
> + int order;
> +
> + VM_BUG_ON(!pfn_valid(start));

VM_BUG_ON seems very harsh here. WARN_ON_ONCE and returning 0 to the
caller seems reasonable.

> + page = pfn_to_page(start);
> + zone = page_zone(page);
> +
> + spin_lock_irq(&zone->lock);
> +
> + for (;;) {
> + VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
> +   page_zone(page) != zone);
> +

Here you will VM_BUG_ON with the zone lock held leading to system
halting very shortly.

Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-17 Thread Andrew Morton
On Mon, 17 Oct 2011 14:21:07 +0200
Marek Szyprowski  wrote:

> > > +
> > > +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
> > > +{
> > > + struct page *page = pfn_to_page(pfn);
> > > +
> > > + while (nr_pages--) {
> > > + __free_page(page);
> > > + ++pfn;
> > > + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> > > + ++page;
> > > + else
> > > + page = pfn_to_page(pfn);
> > > + }
> > > +}
> > 
> > You're sure these functions don't need EXPORT_SYMBOL()?  Maybe the
> > design is that only DMA core calls into here (if so, that's good).
> 
> Drivers should not call it, it is intended to be used by low-level DMA
> code.

OK, thanks for checking.

> Do you think that a comment about missing EXPORT_SYMBOL is 
> required?

No.  If someone later wants to use these from a module then we can look
at their reasons and make a decision at that time.



RE: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-17 Thread Marek Szyprowski
Hello Andrew,

Thanks for your comments. I will try to address them in the next round of
CMA patches.

On Saturday, October 15, 2011 1:30 AM Andrew Morton wrote:

(snipped)

> > +
> > +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
> > +{
> > +   struct page *page = pfn_to_page(pfn);
> > +
> > +   while (nr_pages--) {
> > +   __free_page(page);
> > +   ++pfn;
> > +   if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> > +   ++page;
> > +   else
> > +   page = pfn_to_page(pfn);
> > +   }
> > +}
> 
> You're sure these functions don't need EXPORT_SYMBOL()?  Maybe the
> design is that only DMA core calls into here (if so, that's good).

Drivers should not call it, it is intended to be used by low-level DMA
code. Do you think that a comment about missing EXPORT_SYMBOL is 
required?

Best regards
-- 
Marek Szyprowski
Samsung Poland R&D Center




Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-16 Thread Michal Nazarewicz

On Sun, 16 Oct 2011 10:01:36 +0200 "Michal Nazarewicz" wrote:

Still, as I think of it now, maybe alloc_contig_free_range() would be
better?


On Sun, 16 Oct 2011 10:31:16 +0200, Andrew Morton wrote:

Nope.  Of *course* the pages were free.  Otherwise we couldn't
(re)allocate them.  I still think the "free" part is redundant.


Makes sense.


What could be improved is the "alloc" part.  This really isn't an
allocation operation.  The pages are being removed from buddy then
moved into the free arena of a different memory manager from where they
will _later_ be "allocated".


Not quite.  After alloc_contig_range() returns, the pages are passed with
no further processing to the caller.  Ie. the area is not later split into
several parts nor kept in CMA's pool unused.

alloc_contig_freed_pages() is a little different since it must be called on
a buddy page boundary and may return more than requested (because of the way
the buddy system merges buddies), so there is a little processing after it
returns (namely freeing of the excess pages).
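That post-processing step can be sketched as follows (userspace model; free_one() and the counter are illustrative stand-ins for returning single pages to the allocator):

```c
static unsigned long pages_freed;  /* demo counter */

static void free_one(unsigned long pfn)
{
    (void)pfn;      /* the kernel would __free_page() here */
    pages_freed++;
}

/* The caller wanted pages up to want_end; the buddy allocator handed
 * back a merged block reaching got_end, so the tail is freed again. */
static void trim_excess(unsigned long want_end, unsigned long got_end)
{
    unsigned long pfn;

    for (pfn = want_end; pfn < got_end; pfn++)
        free_one(pfn);
}
```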


So we should move away from the alloc/free naming altogether for this
operation and think up new terms.  How about "claim" and "release"?
claim_contig_pages, claim_contig_range, release_contig_pages, etc?
Or we could use take/return.


Personally, I'm not convinced about changing the names of alloc_contig_range()
and free_contig_pages() but I see merit in changing alloc_contig_freed_pages()
to something else.

Since at the moment, it's used only by alloc_contig_range(), I'd lean
towards removing it from page-isolation.h, marking as static and renaming
to __alloc_contig_range().


Also, if we have no expectation that anything apart from CMA will use
these interfaces (?), the names could/should be prefixed with "cma_".


In Kamezawa's original patchset, he used those for a somewhat different
approach (IIRC, Kamezawa's patchset introduced a function that scanned memory
and tried to allocate contiguous memory where it could), so I can imagine that
someone will make use of those functions.  It may be used in any situation
where a range of pages is either free (ie. in buddy system) or movable and
one wants to allocate them for some reason.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of  o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz(o o)
ooo +--ooO--(_)--Ooo--


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-16 Thread Andrew Morton
On Sun, 16 Oct 2011 10:01:36 +0200 "Michal Nazarewicz" wrote:

> Still, as I think of it now, maybe alloc_contig_free_range() would be
> better?

Nope.  Of *course* the pages were free.  Otherwise we couldn't
(re)allocate them.  I still think the "free" part is redundant.

What could be improved is the "alloc" part.  This really isn't an
allocation operation.  The pages are being removed from buddy then
moved into the free arena of a different memory manager from where they
will _later_ be "allocated".

So we should move away from the alloc/free naming altogether for this
operation and think up new terms.  How about "claim" and "release"? 
claim_contig_pages, claim_contig_range, release_contig_pages, etc?
Or we could use take/return.

Also, if we have no expectation that anything apart from CMA will use
these interfaces (?), the names could/should be prefixed with "cma_".



Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-16 Thread Michal Nazarewicz

On Sat, 15 Oct 2011 01:29:33 +0200, Andrew Morton wrote:


On Thu, 06 Oct 2011 15:54:42 +0200
Marek Szyprowski  wrote:


From: KAMEZAWA Hiroyuki 

This commit introduces alloc_contig_freed_pages() function


The "freed" seems redundant to me.  Wouldn't "alloc_contig_pages" be a
better name?


The “freed” is there because the function operates on pages that are in
buddy system, ie. it is given a range of PFNs that are to be removed
from buddy system.

There's also an alloc_contig_range() function (added by the next patch)
which frees pages in a given range and then calls
alloc_contig_freed_pages() to allocate them.

IMO, if there was an alloc_contig_pages() function, it would have to
be one level up (ie. it would figure out where to allocate memory and
then call alloc_contig_range()).  (That's really what CMA is doing).
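A rough sketch of that layering (every body below is a trivial stand-in, not
code from the patchset; contig_pages() is the hypothetical top level that
does not exist yet):

```c
#include <assert.h>

/* Bottom level: take pages that are already free in the buddy allocator. */
static unsigned long grab_freed_pages(unsigned long start, unsigned long end)
{
	/* stand-in for alloc_contig_freed_pages() */
	return end;	/* one past the last pfn claimed */
}

/* Middle level: empty the range (migrate/reclaim), then grab it. */
static int contig_range(unsigned long start, unsigned long end)
{
	/* stand-in for alloc_contig_range() */
	return grab_freed_pages(start, end) >= end ? 0 : -1;
}

/* Hypothetical top level: figure out where to allocate, then claim it.
 * This placement-finding role is the part CMA itself currently plays. */
static long contig_pages(unsigned long nr_pages)
{
	unsigned long start = 0;	/* a real version would search */

	return contig_range(start, start + nr_pages) == 0 ? (long)start : -1;
}
```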

Still, as I think of it now, maybe alloc_contig_free_range() would be
better?


Re: [PATCH 2/9] mm: alloc_contig_freed_pages() added

2011-10-14 Thread Andrew Morton
On Thu, 06 Oct 2011 15:54:42 +0200
Marek Szyprowski  wrote:

> From: KAMEZAWA Hiroyuki 
> 
> This commit introduces alloc_contig_freed_pages() function

The "freed" seems redundant to me.  Wouldn't "alloc_contig_pages" be a
better name?

> which allocates (ie. removes from buddy system) free pages
> in range.  Caller has to guarantee that all pages in range
> are in buddy system.
> 
> Along with this function, a free_contig_pages() function is
> provided which frees all (or a subset of) pages allocated
> with alloc_contig_free_pages().
> 
> Michal Nazarewicz has modified the function to make it easier
> to allocate not MAX_ORDER_NR_PAGES aligned pages by making it
> return pfn of one-past-the-last allocated page.
> 
>
> ...
>
> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> +/*
> + * Both PFNs must be from the same zone!  If this function returns
> + * true, pfn_to_page(pfn1) + (pfn2 - pfn1) == pfn_to_page(pfn2).
> + */
> +static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
> +{
> + return pfn_to_section_nr(pfn1) == pfn_to_section_nr(pfn2);
> +}
> +
> +#else
> +
> +#define zone_pfn_same_memmap(pfn1, pfn2) (true)

Do this in C, please.  It's nicer and can prevent unused-var warnings.
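The C version being asked for might look something like this
(userspace-ified here with <stdbool.h>; the real patch would keep the
kernel's context):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Unlike the macro, a static inline type-checks its arguments and
 * "uses" them, so a caller whose variable exists only to be passed in
 * does not trip an unused-variable warning in this configuration.
 */
static inline bool zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
{
	return true;
}
```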

> +#endif
> +
>
> ...
>
> +unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
> +                                       gfp_t flag)
> +{
> + unsigned long pfn = start, count;
> + struct page *page;
> + struct zone *zone;
> + int order;
> +
> + VM_BUG_ON(!pfn_valid(start));
> + page = pfn_to_page(start);
> + zone = page_zone(page);
> +
> + spin_lock_irq(&zone->lock);
> +
> + for (;;) {
> + VM_BUG_ON(page_count(page) || !PageBuddy(page) ||
> +   page_zone(page) != zone);
> +
> + list_del(&page->lru);
> + order = page_order(page);
> + count = 1UL << order;
> + zone->free_area[order].nr_free--;
> + rmv_page_order(page);
> + __mod_zone_page_state(zone, NR_FREE_PAGES, -(long)count);

__mod_zone_page_state() generally shouldn't be used - it bypasses the
per-cpu magazines and can introduce high lock contentions.

That's hopefully not an issue on this callpath but it is still a red
flag.  I'd suggest at least the addition of a suitably apologetic code
comment here - we don't want people to naively copy this code.

Plus such a comment would let me know why this was done ;)

> + pfn += count;
> + if (pfn >= end)
> + break;
> + VM_BUG_ON(!pfn_valid(pfn));
> +
> + if (zone_pfn_same_memmap(pfn - count, pfn))
> + page += count;
> + else
> + page = pfn_to_page(pfn);
> + }
> +
> + spin_unlock_irq(&zone->lock);
> +
> + /* After this, pages in the range can be freed one by one */
> + count = pfn - start;
> + pfn = start;
> + for (page = pfn_to_page(pfn); count; --count) {
> + prep_new_page(page, 0, flag);
> + ++pfn;
> + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> + ++page;
> + else
> + page = pfn_to_page(pfn);
> + }
> +
> + return pfn;
> +}
> +
> +void free_contig_pages(unsigned long pfn, unsigned nr_pages)
> +{
> + struct page *page = pfn_to_page(pfn);
> +
> + while (nr_pages--) {
> + __free_page(page);
> + ++pfn;
> + if (likely(zone_pfn_same_memmap(pfn - 1, pfn)))
> + ++page;
> + else
> + page = pfn_to_page(pfn);
> + }
> +}

You're sure these functions don't need EXPORT_SYMBOL()?  Maybe the
design is that only DMA core calls into here (if so, that's good).
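For reference, the pfn/page walk shared by both functions can be modelled in
userspace like this (fake struct page, invented SECTION_NR_PAGES, separately
allocated per-section memmaps; purely illustrative).  It shows why the page
pointer can only be incremented while inside one section and must be
recomputed via pfn_to_page() across a boundary:

```c
#include <assert.h>
#include <stdlib.h>

#define SECTION_NR_PAGES 64UL	/* invented for the demo */

struct fake_page { unsigned long pfn; };

/* Each section's memmap is a separate allocation -> not contiguous,
 * as with SPARSEMEM without VMEMMAP. */
static struct fake_page *section_memmap[8];

static struct fake_page *pfn_to_fake_page(unsigned long pfn)
{
	unsigned long sec = pfn / SECTION_NR_PAGES;

	if (!section_memmap[sec]) {
		section_memmap[sec] =
			calloc(SECTION_NR_PAGES, sizeof(struct fake_page));
		for (unsigned long i = 0; i < SECTION_NR_PAGES; i++)
			section_memmap[sec][i].pfn =
				sec * SECTION_NR_PAGES + i;
	}
	return &section_memmap[sec][pfn % SECTION_NR_PAGES];
}

static int fake_zone_pfn_same_memmap(unsigned long pfn1, unsigned long pfn2)
{
	return pfn1 / SECTION_NR_PAGES == pfn2 / SECTION_NR_PAGES;
}

/* Sum the pfns over [pfn, pfn+nr) using the patch's walking pattern. */
static unsigned long walk_sum(unsigned long pfn, unsigned long nr)
{
	struct fake_page *page = pfn_to_fake_page(pfn);
	unsigned long sum = 0;

	while (nr--) {
		sum += page->pfn;
		++pfn;
		if (fake_zone_pfn_same_memmap(pfn - 1, pfn))
			++page;
		else
			page = pfn_to_fake_page(pfn);
	}
	return sum;
}
```

Walking pfns 60..69 crosses the 63/64 section boundary exactly once and
still visits each page's own memmap entry.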


>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  /*
>   * All pages in the range must be isolated before calling this.
> -- 
> 1.7.1.569.g6f426