On Thu, Mar 25, 2010 at 12:36:43AM +0200, Pauli Nieminen wrote:
> On AGP systems we routinely allocate and free uncached or wc memory;
> changing a page from cached (wb) to uc or wc is very expensive and involves
> a lot of flushing. To improve performance this allocator uses a pool
> of uc/wc pages.
> 
> Pools are protected with spinlocks to allow multiple threads to allocate pages
> simultaneously. Expensive operations are done outside of the spinlock to maximize
> concurrency.
> 
> Pools are linked lists of pages that were recently freed. The mm shrink callback
> allows the kernel to reclaim pages when they are needed for something else.
> 
> Based on Jerome Glisse's and Dave Airlie's pool allocator.
> 
> Signed-off-by: Jerome Glisse <jgli...@redhat.com>
> Signed-off-by: Dave Airlie <airl...@redhat.com>
> Signed-off-by: Pauli Nieminen <suok...@gmail.com>

I think using an array rather than a list would have made things simpler.
Anyway this patchset looks good; there are a couple of issues you need to fix
first, commented inline in the code below.
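
Just as a sketch of what I mean (ttm_pool_pop is a made-up name here, and the
array size simply reuses NUM_PAGES_TO_ALLOC from your patch), an array based
pool could keep the free pages as a small stack of pointers:

	struct ttm_page_pool {
		spinlock_t	lock;
		unsigned	npages;	/* number of valid entries in pages[] */
		struct page	*pages[NUM_PAGES_TO_ALLOC];
	};

	/* caller holds pool->lock */
	static struct page *ttm_pool_pop(struct ttm_page_pool *pool)
	{
		if (pool->npages == 0)
			return NULL;
		return pool->pages[--pool->npages];
	}

Grabbing or releasing a range is then just moving a block of pointers instead
of walking and cutting lru lists. Not a blocker, just a thought.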

Cheers,
Jerome

> ---
>  drivers/gpu/drm/ttm/Makefile         |    2 +-
>  drivers/gpu/drm/ttm/ttm_memory.c     |    7 +-
>  drivers/gpu/drm/ttm/ttm_page_alloc.c |  718 ++++++++++++++++++++++++++++++++++
>  drivers/gpu/drm/ttm/ttm_tt.c         |   44 +--
>  include/drm/ttm/ttm_page_alloc.h     |   64 +++
>  5 files changed, 810 insertions(+), 25 deletions(-)
>  create mode 100644 drivers/gpu/drm/ttm/ttm_page_alloc.c
>  create mode 100644 include/drm/ttm/ttm_page_alloc.h
> 
> diff --git a/drivers/gpu/drm/ttm/Makefile b/drivers/gpu/drm/ttm/Makefile
> index 1e138f5..4256e20 100644
> --- a/drivers/gpu/drm/ttm/Makefile
> +++ b/drivers/gpu/drm/ttm/Makefile
> @@ -4,6 +4,6 @@
>  ccflags-y := -Iinclude/drm
>  ttm-y := ttm_agp_backend.o ttm_memory.o ttm_tt.o ttm_bo.o \
>       ttm_bo_util.o ttm_bo_vm.o ttm_module.o ttm_global.o \
> -     ttm_object.o ttm_lock.o ttm_execbuf_util.o
> +     ttm_object.o ttm_lock.o ttm_execbuf_util.o ttm_page_alloc.o
>  
>  obj-$(CONFIG_DRM_TTM) += ttm.o
> diff --git a/drivers/gpu/drm/ttm/ttm_memory.c b/drivers/gpu/drm/ttm/ttm_memory.c
> index eb143e0..72f31aa 100644
> --- a/drivers/gpu/drm/ttm/ttm_memory.c
> +++ b/drivers/gpu/drm/ttm/ttm_memory.c
> @@ -27,6 +27,7 @@
>  
>  #include "ttm/ttm_memory.h"
>  #include "ttm/ttm_module.h"
> +#include "ttm/ttm_page_alloc.h"
>  #include <linux/spinlock.h>
>  #include <linux/sched.h>
>  #include <linux/wait.h>
> @@ -394,6 +395,7 @@ int ttm_mem_global_init(struct ttm_mem_global *glob)
>                      "Zone %7s: Available graphics memory: %llu kiB.\n",
>                      zone->name, (unsigned long long) zone->max_mem >> 10);
>       }
> +     ttm_page_alloc_init(glob->zone_kernel->max_mem/(2*PAGE_SIZE));
>       return 0;
>  out_no_zone:
>       ttm_mem_global_release(glob);
> @@ -406,6 +408,9 @@ void ttm_mem_global_release(struct ttm_mem_global *glob)
>       unsigned int i;
>       struct ttm_mem_zone *zone;
>  
> +     /* let the page allocator first stop the shrink work. */
> +     ttm_page_alloc_fini();
> +
>       flush_workqueue(glob->swap_queue);
>       destroy_workqueue(glob->swap_queue);
>       glob->swap_queue = NULL;
> @@ -413,7 +418,7 @@ void ttm_mem_global_release(struct ttm_mem_global *glob)
>               zone = glob->zones[i];
>               kobject_del(&zone->kobj);
>               kobject_put(&zone->kobj);
> -     }
> +                     }
>       kobject_del(&glob->kobj);
>       kobject_put(&glob->kobj);
>  }
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> new file mode 100644
> index 0000000..18be14f
> --- /dev/null
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -0,0 +1,718 @@
> +/*
> + * Copyright (c) Red Hat Inc.
> +
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sub license,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Authors: Dave Airlie <airl...@redhat.com>
> + *          Jerome Glisse <jgli...@redhat.com>
> + *          Pauli Nieminen <suok...@gmail.com>
> + */
> +
> +/* simple list based uncached page pool
> + * - Pool collects recently freed pages for reuse
> + * - Use page->lru to keep a free list
> + * - doesn't track currently in use pages
> + */
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +#include <linux/highmem.h>
> +#include <linux/mm_types.h>
> +#include <linux/mm.h>
> +
> +#include <asm/atomic.h>
> +#include <asm/agp.h>
> +
> +#include "ttm/ttm_bo_driver.h"
> +#include "ttm/ttm_page_alloc.h"
> +
> +
> +#define NUM_PAGES_TO_ALLOC           (PAGE_SIZE/sizeof(struct page *))
> +#define SMALL_ALLOCATION             16
> +#define FREE_ALL_PAGES                       (~0U)
> +/* times are in msecs */
> +#define PAGE_FREE_INTERVAL           1000
> +
> +/**
> + * struct ttm_page_pool - Pool to reuse recently allocated uc/wc pages.
> + *
> + * @lock: Protects the shared pool from concurrent access. Must be used with
> + * irqsave/irqrestore variants because the pool allocator may be called from
> + * delayed work.
> + * @fill_lock: Prevent concurrent calls to fill.
> + * @list: Pool of free uc/wc pages for fast reuse.
> + * @gfp_flags: Flags to pass for alloc_page.
> + * @npages: Number of pages in pool.
> + */
> +struct ttm_page_pool {
> +     spinlock_t              lock;
> +     bool                    fill_lock;
> +     struct list_head        list;
> +     int                     gfp_flags;
> +     unsigned                npages;
> +};
> +
> +struct ttm_pool_opts {
> +     unsigned        alloc_size;
> +     unsigned        max_size;
> +     unsigned        small;
> +};
> +
> +#define NUM_POOLS 4
> +
> +/**
> + * struct ttm_pool_manager - Holds memory pools for fast allocation
> + *
> + * Manager is a read only object for the pool code so it doesn't need locking.
> + *
> + * @free_interval: minimum number of jiffies between freeing pages from pool.
> + * @page_alloc_inited: reference counting for pool allocation.
> + * @work: Work that is used to shrink the pool. Work is only run when there
> + * are some pages to free.
> + * @small_allocation: Limit in number of pages for what counts as a small allocation.
> + *
> + * @pools: All pool objects in use.
> + **/
> +struct ttm_pool_manager {
> +     struct shrinker         mm_shrink;
> +     atomic_t                page_alloc_inited;
> +     struct ttm_pool_opts    options;
> +
> +     union {
> +             struct ttm_page_pool    pools[NUM_POOLS];
> +             struct {
> +                     struct ttm_page_pool    wc_pool;
> +                     struct ttm_page_pool    uc_pool;
> +                     struct ttm_page_pool    wc_pool_dma32;
> +                     struct ttm_page_pool    uc_pool_dma32;
> +             } ;
> +     };
> +};
> +
> +static struct ttm_pool_manager _manager = {
> +     .page_alloc_inited      = ATOMIC_INIT(0)
> +};
> +
> +#ifdef CONFIG_X86
> +/* TODO: add this to x86 like _uc, this version here is inefficient */
> +static int set_pages_array_wc(struct page **pages, int addrinarray)
> +{
> +     int i;
> +
> +     for (i = 0; i < addrinarray; i++)
> +             set_memory_wc((unsigned long)page_address(pages[i]), 1);
> +     return 0;
> +}
> +#else
> +static int set_pages_array_wb(struct page **pages, int addrinarray)
> +{
> +#ifdef TTM_HAS_AGP
> +     int i;
> +
> +     for (i = 0; i < addrinarray; i++)
> +             unmap_page_from_agp(pages[i]);
> +#endif
> +     return 0;
> +}
> +
> +static int set_pages_array_wc(struct page **pages, int addrinarray)
> +{
> +#ifdef TTM_HAS_AGP
> +     int i;
> +
> +     for (i = 0; i < addrinarray; i++)
> +             map_page_into_agp(pages[i]);
> +#endif
> +     return 0;
> +}
> +
> +static int set_pages_array_uc(struct page **pages, int addrinarray)
> +{
> +#ifdef TTM_HAS_AGP
> +     int i;
> +
> +     for (i = 0; i < addrinarray; i++)
> +             map_page_into_agp(pages[i]);
> +#endif
> +     return 0;
> +}
> +#endif
> +
> +/**
> + * Select the right pool for the requested caching state and ttm flags. */
> +static struct ttm_page_pool *ttm_get_pool(int flags,
> +             enum ttm_caching_state cstate)
> +{
> +     int pool_index;
> +
> +     if (cstate == tt_cached)
> +             return NULL;
> +
> +     if (cstate == tt_wc)
> +             pool_index = 0x0;
> +     else
> +             pool_index = 0x1;
> +
> +     if (flags & TTM_PAGE_FLAG_DMA32)
> +             pool_index |= 0x2;
> +
> +     return &_manager.pools[pool_index];
> +}
> +
> +/* set memory back to wb and free the pages. */
> +static void ttm_pages_put(struct page *pages[], unsigned npages)
> +{
> +     unsigned i;
> +     if (set_pages_array_wb(pages, npages))
> +             printk(KERN_ERR "[ttm] Failed to set %d pages to wb!\n",
> +                             npages);
> +     for (i = 0; i < npages; ++i)
> +             __free_page(pages[i]);
> +}
> +
> +static void ttm_pool_update_free_locked(struct ttm_page_pool *pool,
> +             unsigned freed_pages)
> +{
> +     pool->npages -= freed_pages;
> +}
> +
> +/**
> + * Free pages from pool.
> + *
> + * To prevent hogging the ttm_swap process we only free NUM_PAGES_TO_ALLOC
> + * number of pages in one go.
> + *
> + * @pool: to free the pages from
> + * @nr_free: number of pages to free; FREE_ALL_PAGES will free all pages in pool
> + **/
> +static int ttm_page_pool_free(struct ttm_page_pool *pool, unsigned nr_free)
> +{
> +     unsigned long irq_flags;
> +     struct page *p;
> +     struct page **pages_to_free;
> +     unsigned freed_pages, npages_to_free = nr_free;
> +     if (NUM_PAGES_TO_ALLOC < nr_free)
> +             npages_to_free = NUM_PAGES_TO_ALLOC;
> +
> +     pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
> +                     GFP_KERNEL);
> +     if (!pages_to_free) {
> +             printk(KERN_ERR "Failed to allocate memory for pool free operation.\n");
> +             return 0;
> +     }
> +
> +restart:
> +     spin_lock_irqsave(&pool->lock, irq_flags);
> +
> +     freed_pages = 0;
> +
> +     list_for_each_entry_reverse(p, &pool->list, lru) {
> +             if (freed_pages >= npages_to_free)
> +                     break;
> +
> +             pages_to_free[freed_pages++] = p;
> +             /* We can only remove NUM_PAGES_TO_ALLOC at a time. */
> +             if (freed_pages >= NUM_PAGES_TO_ALLOC) {
> +                     /* remove range of pages from the pool */
> +                     __list_del(p->lru.prev, &pool->list);
> +
> +                     ttm_pool_update_free_locked(pool, freed_pages);
> +                     /**
> +                      * Because changing page caching is costly
> +                      * we unlock the pool to prevent stalling.
> +                      */
> +                     spin_unlock_irqrestore(&pool->lock, irq_flags);
> +
> +                     ttm_pages_put(pages_to_free, freed_pages);
> +                     if (likely(nr_free != FREE_ALL_PAGES))
> +                             nr_free -= freed_pages;
> +
> +                     if (NUM_PAGES_TO_ALLOC >= nr_free)
> +                             npages_to_free = nr_free;
> +                     else
> +                             npages_to_free = NUM_PAGES_TO_ALLOC;
> +
> +                     /* free all so restart the processing */
> +                     if (nr_free)
> +                             goto restart;
> +
> +                     goto out;
> +
> +             }
> +     }
> +
> +
> +     /* remove range of pages from the pool */
> +     if (freed_pages) {
> +             __list_del(&p->lru, &pool->list);
> +
> +             ttm_pool_update_free_locked(pool, freed_pages);
> +             nr_free -= freed_pages;
> +     }
> +
> +     spin_unlock_irqrestore(&pool->lock, irq_flags);
> +
> +     if (freed_pages)
> +             ttm_pages_put(pages_to_free, freed_pages);
> +out:
> +     kfree(pages_to_free);
> +     return nr_free;
> +}
> +
> +/* Get a good estimate of how many pages are free in the pools */
> +static int ttm_pool_get_num_unused_pages(void)
> +{
> +     unsigned i;
> +     struct ttm_page_pool *pool;
> +     int r;
> +     for (i = 0; i < NUM_POOLS; ++i) {
> +             pool = &_manager.pools[i];
> +
> +             r += pool->npages;
> +     }
> +     return r;
> +}

This is wrong: you need to initialize r to 0 or you might return a random
number. Also the pool temporary seems useless; _manager.pools[i].npages
would be more straightforward.
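Something like this (untested) is what I have in mind:

	static int ttm_pool_get_num_unused_pages(void)
	{
		unsigned i;
		int r = 0;

		for (i = 0; i < NUM_POOLS; ++i)
			r += _manager.pools[i].npages;
		return r;
	}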

> +
> +/**
> + * Callback for mm to request that the pool reduce the number of pages held.
> + */
> +static int ttm_pool_mm_shrink(int shrink_pages, gfp_t gfp_mask)
> +{
> +     static atomic_t start_pool = ATOMIC_INIT(0);
> +     unsigned i;
> +     unsigned pool_offset = atomic_add_return(1, &start_pool);
> +     struct ttm_page_pool *pool;
> +
> +     pool_offset = pool_offset % NUM_POOLS;
> +     /* select start pool in round robin fashion */
> +     for (i = 0; i < NUM_POOLS; ++i) {
> +             unsigned nr_free = shrink_pages;
> +             if (shrink_pages == 0)
> +                     break;
> +             pool = &_manager.pools[(i + pool_offset)%NUM_POOLS];
> +             shrink_pages = ttm_page_pool_free(pool, nr_free);
> +     }
> +     /* return estimated number of unused pages in pool */
> +     return ttm_pool_get_num_unused_pages();
> +}
> +
> +static void ttm_pool_mm_shrink_init(struct ttm_pool_manager *manager)
> +{
> +     manager->mm_shrink.shrink = &ttm_pool_mm_shrink;
> +     manager->mm_shrink.seeks = 1;
> +     register_shrinker(&manager->mm_shrink);
> +}
> +
> +static void ttm_pool_mm_shrink_fini(struct ttm_pool_manager *manager)
> +{
> +     unregister_shrinker(&manager->mm_shrink);
> +}
> +
> +static int ttm_set_pages_caching(struct page **pages,
> +             enum ttm_caching_state cstate, unsigned cpages)
> +{
> +     int r = 0;
> +     /* Set page caching */
> +     switch (cstate) {
> +     case tt_uncached:
> +             r = set_pages_array_uc(pages, cpages);
> +             if (r)
> +                     printk(KERN_ERR "[ttm] Failed to set %d pages to uc!\n",
> +                                     cpages);
> +             break;
> +     case tt_wc:
> +             r = set_pages_array_wc(pages, cpages);
> +             if (r)
> +                     printk(KERN_ERR "[ttm] Failed to set %d pages to wc!\n",
> +                                     cpages);
> +             break;
> +     default:
> +             break;
> +     }
> +     return r;
> +}
> +
> +/**
> + * Free the pages that failed to change the caching state. If there are
> + * any pages that have changed their caching state already, put them back to
> + * the pool.
> + */
> +static void ttm_handle_caching_state_failure(struct list_head *pages,
> +             int ttm_flags, enum ttm_caching_state cstate,
> +             struct page **failed_pages, unsigned cpages)
> +{
> +     unsigned i;
> +     /* Failed pages have to be freed */
> +     for (i = 0; i < cpages; ++i) {
> +             list_del(&failed_pages[i]->lru);
> +             __free_page(failed_pages[i]);
> +     }
> +}
> +
> +/**
> + * Allocate new pages with correct caching.
> + *
> + * This function is reentrant if caller updates count depending on number of
> + * pages returned in pages array.
> + */
> +static int ttm_alloc_new_pages(struct list_head *pages, int gfp_flags,
> +             int ttm_flags, enum ttm_caching_state cstate, unsigned count)
> +{
> +     struct page **caching_array;
> +     struct page *p;
> +     int r = 0;
> +     unsigned i, cpages;
> +     unsigned max_cpages = min(count,
> +                     (unsigned)(PAGE_SIZE/sizeof(struct page *)));
> +
> +     /* allocate array for page caching change */
> +     caching_array = kmalloc(max_cpages*sizeof(struct page *), GFP_KERNEL);
> +
> +     if (!caching_array) {
> +             printk(KERN_ERR "[ttm] unable to allocate table for new pages.");
> +             return -ENOMEM;
> +     }
> +
> +     for (i = 0, cpages = 0; i < count; ++i) {
> +             p = alloc_page(gfp_flags);
> +
> +             if (!p) {
> +                     printk(KERN_ERR "[ttm] unable to get page %u\n", i);
> +
> +                     /* store already allocated pages in the pool after
> +                      * setting the caching state */
> +                     if (cpages) {
> +                             r = ttm_set_pages_caching(caching_array, cstate, cpages);
> +                             if (r)
> +                                     ttm_handle_caching_state_failure(pages,
> +                                             ttm_flags, cstate,
> +                                             caching_array, cpages);
> +                     }
> +                     r = -ENOMEM;
> +                     goto out;
> +             }
> +
> +#ifdef CONFIG_HIGHMEM
> +             /* gfp flags of highmem page should never be dma32 so we
> +              * should be fine in such case
> +              */
> +             if (!PageHighMem(p))
> +#endif
> +             {
> +                     caching_array[cpages++] = p;
> +                     if (cpages == max_cpages) {
> +
> +                             r = ttm_set_pages_caching(caching_array,
> +                                             cstate, cpages);
> +                             if (r) {
> +                                     ttm_handle_caching_state_failure(pages,
> +                                             ttm_flags, cstate,
> +                                             caching_array, cpages);
> +                                     goto out;
> +                             }
> +                             cpages = 0;
> +                     }
> +             }
> +
> +             list_add(&p->lru, pages);
> +     }
> +
> +     if (cpages) {
> +             r = ttm_set_pages_caching(caching_array, cstate, cpages);
> +             if (r)
> +                     ttm_handle_caching_state_failure(pages,
> +                                     ttm_flags, cstate,
> +                                     caching_array, cpages);
> +     }
> +out:
> +     kfree(caching_array);
> +
> +     return r;
> +}
> +
> +/**
> + * Fill the given pool if there aren't enough pages and the requested number of
> + * pages is small.
> + */
> +static void ttm_page_pool_fill_locked(struct ttm_page_pool *pool,
> +             int ttm_flags, enum ttm_caching_state cstate, unsigned count,
> +             unsigned long *irq_flags)
> +{
> +     struct page *p, *tmp;
> +     int r;
> +     unsigned cpages = 0;
> +     /**
> +      * Only allow one pool fill operation at a time.
> +      * If pool doesn't have enough pages for the allocation new pages are
> +      * allocated from outside of pool.
> +      */
> +     if (pool->fill_lock)
> +             return;
> +
> +     pool->fill_lock = true;
> +
> +     /* If the allocation request is small and there are not enough
> +      * pages in the pool we fill the pool first */
> +     if (count < _manager.options.small
> +             && count > pool->npages) {
> +             struct list_head new_pages;
> +             unsigned alloc_size = _manager.options.alloc_size;
> +
> +             /**
> +              * Can't change page caching if in irqsave context. We have to
> +              * drop the pool->lock.
> +              */
> +             spin_unlock_irqrestore(&pool->lock, *irq_flags);
> +
> +             INIT_LIST_HEAD(&new_pages);
> +             r = ttm_alloc_new_pages(&new_pages, pool->gfp_flags, ttm_flags,
> +                             cstate, alloc_size);
> +             spin_lock_irqsave(&pool->lock, *irq_flags);
> +
> +             if (!r) {
> +                     list_splice(&new_pages, &pool->list);
> +                     pool->npages += alloc_size;
> +             } else {
> +                     printk(KERN_ERR "[ttm] Failed to fill pool (%p).", pool);
> +                     /* If we have any pages left put them to the pool. */
> +                     list_for_each_entry_safe(p, tmp, &pool->list, lru) {
> +                             ++cpages;
> +                     }

This looks wrong; didn't you want to do:

	list_for_each_entry(p, &new_pages, lru) {
		++cpages;
	}

i.e. count the pages in new_pages rather than the pages already in the pool?

> +                     list_splice(&new_pages, &pool->list);
> +                     pool->npages += cpages;
> +             }
> +
> +     }
> +     pool->fill_lock = false;
> +}
> +
> +/**
> + * Cut count number of pages from the pool and put them on the return list.
> + *
> + * @return count of pages still to allocate to fill the request.
> + */
> +static unsigned ttm_page_pool_get_pages(struct ttm_page_pool *pool,
> +             struct list_head *pages, int ttm_flags,
> +             enum ttm_caching_state cstate, unsigned count)
> +{
> +     unsigned long irq_flags;
> +     struct list_head *p;
> +     unsigned i;
> +
> +     spin_lock_irqsave(&pool->lock, irq_flags);
> +     ttm_page_pool_fill_locked(pool, ttm_flags, cstate, count, &irq_flags);
> +
> +     if (count >= pool->npages) {
> +             /* take all pages from the pool */
> +             list_splice_init(&pool->list, pages);
> +             count -= pool->npages;
> +             pool->npages = 0;
> +             goto out;
> +     }
> +     /* find the last pages to include for requested number of pages */
> +     if (count <= pool->npages/2) {
> +             i = 0;
> +             list_for_each(p, &pool->list) {
> +                     if (++i == count)
> +                             break;
> +             }
> +     } else {
> +             i = pool->npages + 1;
> +             list_for_each_prev(p, &pool->list) {
> +                     if (--i == count)
> +                             break;
> +             }
> +     }

Why do you walk the list from the head in one case and from the tail in the
other? If it is just to take the shorter walk to the cut point, a short comment
saying so would help.

> +     /* Cut count number of pages from pool */
> +     list_cut_position(pages, &pool->list, p);
> +     pool->npages -= count;
> +     count = 0;
> +out:
> +     spin_unlock_irqrestore(&pool->lock, irq_flags);
> +     return count;
> +}
> +
> +/*
> + * On success pages list will hold count number of correctly
> + * cached pages.
> + */
> +int ttm_get_pages(struct list_head *pages, int flags,
> +             enum ttm_caching_state cstate, unsigned count)
> +{
> +     struct ttm_page_pool *pool = ttm_get_pool(flags, cstate);
> +     struct page *p = NULL;
> +     int gfp_flags = 0;
> +     int r;
> +
> +     /* set zero flag for page allocation if required */
> +     if (flags & TTM_PAGE_FLAG_ZERO_ALLOC)
> +             gfp_flags |= __GFP_ZERO;
> +
> +     /* No pool for cached pages */
> +     if (pool == NULL) {
> +             if (flags & TTM_PAGE_FLAG_DMA32)
> +                     gfp_flags |= GFP_DMA32;
> +             else
> +                     gfp_flags |= __GFP_HIGHMEM;
> +
> +             for (r = 0; r < count; ++r) {
> +                     p = alloc_page(gfp_flags);
> +                     if (!p) {
> +
> +                             printk(KERN_ERR "[ttm] unable to allocate page.");
> +                             return -ENOMEM;
> +                     }
> +
> +                     list_add(&p->lru, pages);
> +             }
> +             return 0;
> +     }
> +
> +
> +     /* combine zero flag to pool flags */
> +     gfp_flags |= pool->gfp_flags;
> +
> +     /* First we take pages from the pool */
> +     count = ttm_page_pool_get_pages(pool, pages, flags, cstate, count);
> +
> +     /* clear the pages coming from the pool if requested */
> +     if (flags & TTM_PAGE_FLAG_ZERO_ALLOC) {
> +             list_for_each_entry(p, pages, lru) {
> +                     clear_page(page_address(p));
> +             }
> +     }
> +
> +     /* If the pool didn't have enough pages allocate new ones. */
> +     if (count > 0) {
> +             /* ttm_alloc_new_pages doesn't reference pool so we can run
> +              * multiple requests in parallel.
> +              **/
> +             r = ttm_alloc_new_pages(pages, gfp_flags, flags, cstate, count);
> +             if (r) {
> +                     /* If there are any pages in the list put them back to
> +                      * the pool. */
> +                     printk(KERN_ERR "[ttm] Failed to allocate extra pages "
> +                                     "for large request.");
> +                     ttm_put_pages(pages, flags, cstate);
> +                     return r;
> +             }
> +     }
> +
> +
> +     return 0;
> +}
> +
> +/* Put all pages in pages list to correct pool to wait for reuse */
> +void ttm_put_pages(struct list_head *pages, int flags,
> +             enum ttm_caching_state cstate)
> +{
> +     unsigned long irq_flags;
> +     struct ttm_page_pool *pool = ttm_get_pool(flags, cstate);
> +     struct page *p, *tmp;
> +     unsigned page_count = 0;
> +
> +     if (pool == NULL) {
> +             /* No pool for this memory type so free the pages */
> +
> +             list_for_each_entry_safe(p, tmp, pages, lru) {
> +                     __free_page(p);
> +             }
> +             /* Make the pages list empty */
> +             INIT_LIST_HEAD(pages);
> +             return;
> +     }
> +
> +     list_for_each_entry_safe(p, tmp, pages, lru) {
> +
> +#ifdef CONFIG_HIGHMEM
> +             /* we don't have pool for highmem -> free them */
> +             if (PageHighMem(p)) {
> +                     list_del(&p->lru);
> +                     __free_page(p);
> +             } else
> +#endif
> +             {
> +                     ++page_count;
> +             }
> +
> +     }
> +
> +     spin_lock_irqsave(&pool->lock, irq_flags);
> +     list_splice_init(pages, &pool->list);
> +     pool->npages += page_count;
> +     /* Check that we don't go over the pool limit */
> +     page_count = 0;
> +     if (pool->npages > _manager.options.max_size) {
> +             page_count = pool->npages - _manager.options.max_size;
> +             /* free at least NUM_PAGES_TO_ALLOC number of pages
> +              * to reduce calls to set_memory_wb */
> +             if (page_count < NUM_PAGES_TO_ALLOC)
> +                     page_count = NUM_PAGES_TO_ALLOC;
> +     }
> +     spin_unlock_irqrestore(&pool->lock, irq_flags);
> +     if (page_count)
> +             ttm_page_pool_free(pool, page_count);
> +}
> +
> +static void ttm_page_pool_init_locked(struct ttm_page_pool *pool, int flags)
> +{
> +     spin_lock_init(&pool->lock);
> +     pool->fill_lock = false;
> +     INIT_LIST_HEAD(&pool->list);
> +     pool->npages = 0;
> +     pool->gfp_flags = flags;
> +}
> +
> +int ttm_page_alloc_init(unsigned max_pages)
> +{
> +     if (atomic_add_return(1, &_manager.page_alloc_inited) > 1)
> +             return 0;
> +
> +     printk(KERN_INFO "[ttm] Initializing pool allocator.\n");
> +
> +     ttm_page_pool_init_locked(&_manager.wc_pool, GFP_HIGHUSER);
> +
> +     ttm_page_pool_init_locked(&_manager.uc_pool, GFP_HIGHUSER);
> +
> +     ttm_page_pool_init_locked(&_manager.wc_pool_dma32, GFP_USER | GFP_DMA32);
> +
> +     ttm_page_pool_init_locked(&_manager.uc_pool_dma32, GFP_USER | GFP_DMA32);
> +
> +     _manager.options.max_size = max_pages;
> +     _manager.options.small = SMALL_ALLOCATION;
> +     _manager.options.alloc_size = NUM_PAGES_TO_ALLOC;
> +
> +     ttm_pool_mm_shrink_init(&_manager);
> +
> +     return 0;
> +}
> +
> +void ttm_page_alloc_fini()
> +{
> +     int i;
> +
> +     if (atomic_sub_return(1, &_manager.page_alloc_inited) > 0)
> +             return;
> +
> +     printk(KERN_INFO "[ttm] Finalizing pool allocator.\n");
> +     ttm_pool_mm_shrink_fini(&_manager);
> +
> +     for (i = 0; i < NUM_POOLS; ++i)
> +             ttm_page_pool_free(&_manager.pools[i], FREE_ALL_PAGES);
> +}
> diff --git a/drivers/gpu/drm/ttm/ttm_tt.c b/drivers/gpu/drm/ttm/ttm_tt.c
> index a759170..8a6fc01 100644
> --- a/drivers/gpu/drm/ttm/ttm_tt.c
> +++ b/drivers/gpu/drm/ttm/ttm_tt.c
> @@ -38,6 +38,7 @@
>  #include "ttm/ttm_module.h"
>  #include "ttm/ttm_bo_driver.h"
>  #include "ttm/ttm_placement.h"
> +#include "ttm/ttm_page_alloc.h"
>  
>  static int ttm_tt_swapin(struct ttm_tt *ttm);
>  
> @@ -72,21 +73,6 @@ static void ttm_tt_free_page_directory(struct ttm_tt *ttm)
>       ttm->pages = NULL;
>  }
>  
> -static struct page *ttm_tt_alloc_page(unsigned page_flags)
> -{
> -     gfp_t gfp_flags = GFP_USER;
> -
> -     if (page_flags & TTM_PAGE_FLAG_ZERO_ALLOC)
> -             gfp_flags |= __GFP_ZERO;
> -
> -     if (page_flags & TTM_PAGE_FLAG_DMA32)
> -             gfp_flags |= __GFP_DMA32;
> -     else
> -             gfp_flags |= __GFP_HIGHMEM;
> -
> -     return alloc_page(gfp_flags);
> -}
> -
>  static void ttm_tt_free_user_pages(struct ttm_tt *ttm)
>  {
>       int write;
> @@ -127,15 +113,21 @@ static void ttm_tt_free_user_pages(struct ttm_tt *ttm)
>  static struct page *__ttm_tt_get_page(struct ttm_tt *ttm, int index)
>  {
>       struct page *p;
> +     struct list_head h;
>       struct ttm_mem_global *mem_glob = ttm->glob->mem_glob;
>       int ret;
>  
>       while (NULL == (p = ttm->pages[index])) {
> -             p = ttm_tt_alloc_page(ttm->page_flags);
>  
> -             if (!p)
> +             INIT_LIST_HEAD(&h);
> +
> +             ret = ttm_get_pages(&h, ttm->page_flags, ttm->caching_state, 1);
> +
> +             if (ret != 0)
>                       return NULL;
>  
> +             p = list_first_entry(&h, struct page, lru);
> +
>               ret = ttm_mem_global_alloc_page(mem_glob, p, false, false);
>               if (unlikely(ret != 0))
>                       goto out_err;
> @@ -244,10 +236,10 @@ static int ttm_tt_set_caching(struct ttm_tt *ttm,
>       if (ttm->caching_state == c_state)
>               return 0;
>  
> -     if (c_state != tt_cached) {
> -             ret = ttm_tt_populate(ttm);
> -             if (unlikely(ret != 0))
> -                     return ret;
> +     if (ttm->state == tt_unpopulated) {
> +             /* Change caching but don't populate */
> +             ttm->caching_state = c_state;
> +             return 0;
>       }
>  
>       if (ttm->caching_state == tt_cached)
> @@ -298,13 +290,17 @@ EXPORT_SYMBOL(ttm_tt_set_placement_caching);
>  static void ttm_tt_free_alloced_pages(struct ttm_tt *ttm)
>  {
>       int i;
> +     unsigned count = 0;
> +     struct list_head h;
>       struct page *cur_page;
>       struct ttm_backend *be = ttm->be;
>  
> +     INIT_LIST_HEAD(&h);
> +
>       if (be)
>               be->func->clear(be);
> -     (void)ttm_tt_set_caching(ttm, tt_cached);
>       for (i = 0; i < ttm->num_pages; ++i) {
> +
>               cur_page = ttm->pages[i];
>               ttm->pages[i] = NULL;
>               if (cur_page) {
> @@ -314,9 +310,11 @@ static void ttm_tt_free_alloced_pages(struct ttm_tt *ttm)
>                                      "Leaking pages.\n");
>                       ttm_mem_global_free_page(ttm->glob->mem_glob,
>                                                cur_page);
> -                     __free_page(cur_page);
> +                     list_add(&cur_page->lru, &h);
> +                     count++;
>               }
>       }
> +     ttm_put_pages(&h, ttm->page_flags, ttm->caching_state);
>       ttm->state = tt_unpopulated;
>       ttm->first_himem_page = ttm->num_pages;
>       ttm->last_lomem_page = -1;
> diff --git a/include/drm/ttm/ttm_page_alloc.h b/include/drm/ttm/ttm_page_alloc.h
> new file mode 100644
> index 0000000..63cd94a
> --- /dev/null
> +++ b/include/drm/ttm/ttm_page_alloc.h
> @@ -0,0 +1,64 @@
> +/*
> + * Copyright (c) Red Hat Inc.
> +
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sub license,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the
> + * next paragraph) shall be included in all copies or substantial portions
> + * of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
> + * DEALINGS IN THE SOFTWARE.
> + *
> + * Authors: Dave Airlie <airl...@redhat.com>
> + *          Jerome Glisse <jgli...@redhat.com>
> + */
> +#ifndef TTM_PAGE_ALLOC
> +#define TTM_PAGE_ALLOC
> +
> +#include "ttm_bo_driver.h"
> +#include "ttm_memory.h"
> +
> +/**
> + * Get count number of pages from pool to pages list.
> + *
> + * @pages: head of empty linked list where pages are filled.
> + * @flags: ttm flags for page allocation.
> + * @cstate: ttm caching state for the page.
> + * @count: number of pages to allocate.
> + */
> +int ttm_get_pages(struct list_head *pages, int flags,
> +             enum ttm_caching_state cstate, unsigned count);
> +/**
> + * Put linked list of pages to pool.
> + *
> + * @pages: list of pages to free.
> + * @flags: ttm flags for page allocation.
> + * @cstate: ttm caching state.
> + */
> +void ttm_put_pages(struct list_head *pages, int flags,
> +             enum ttm_caching_state cstate);
> +/**
> + * Initialize pool allocator.
> + *
> + * Pool allocator is internally reference counted so it can be initialized
> + * multiple times, but ttm_page_alloc_fini has to be called the same number of
> + * times.
> + */
> +int ttm_page_alloc_init(unsigned max_pages);
> +/**
> + * Free pool allocator.
> + */
> +void ttm_page_alloc_fini(void);
> +
> +#endif
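
One more small thing, not a blocker: a short usage example in the kdoc here
would help driver writers. Roughly something like this (error handling trimmed,
flags and count picked only for illustration):

	struct list_head pages;
	int r;

	INIT_LIST_HEAD(&pages);
	r = ttm_get_pages(&pages, TTM_PAGE_FLAG_ZERO_ALLOC, tt_uncached, 16);
	if (r)
		return r;
	/* ... use the pages ... */
	ttm_put_pages(&pages, TTM_PAGE_FLAG_ZERO_ALLOC, tt_uncached);
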
> -- 
> 1.6.3.3
> 
