Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
Vaidyanathan Srinivasan wrote:
> YAMAMOTO Takashi wrote:
>>> +unsigned long mem_container_isolate_pages(unsigned long nr_to_scan,
>>> +					struct list_head *dst,
>>> +					unsigned long *scanned, int order,
>>> +					int mode, struct zone *z,
>>> +					struct mem_container *mem_cont,
>>> +					int active)
>>> +{
>>> +	unsigned long nr_taken = 0;
>>> +	struct page *page;
>>> +	unsigned long scan;
>>> +	LIST_HEAD(mp_list);
>>> +	struct list_head *src;
>>> +	struct meta_page *mp;
>>> +
>>> +	if (active)
>>> +		src = &mem_cont->active_list;
>>> +	else
>>> +		src = &mem_cont->inactive_list;
>>> +
>>> +	for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
>>> +		mp = list_entry(src->prev, struct meta_page, lru);
>>
>> what prevents another thread from freeing mp here?
>
> mem_cont->lru_lock protects the list and validity of mp. If we hold
> mem_cont->lru_lock for this entire loop, then we preserve the validity
> of mp. However, that will be holding up container charge and uncharge.
>
> This entire routine is called with zone->lru_lock held by the caller.
> So within a zone, this routine is serialized.
>
> However, page uncharge may race with isolate page. But will that lead
> to any corruption of the list? We may be holding the lock for too
> much time just to be on the safe side.
>
> Please allow us some time to verify whether this is indeed inadequate
> locking that will lead to corruption of the list.

I did a few runs and checked ref_cnt on meta_page, and there does seem
to be a race between isolate pages and uncharge. We will probably have
to increase the ref_cnt on the meta_page while we are isolating it. I am
trying to see if we can solve the problem by manipulating the ref_cnt
on the meta_page.

--Vaidy

> Thanks for pointing out this situation.
> --Vaidy
>
>>> +		spin_lock(&mem_cont->lru_lock);
>>> +		if (mp)
>>> +			page = mp->page;
>>> +		spin_unlock(&mem_cont->lru_lock);
>>> +		if (!mp)
>>> +			continue;
>>
>> YAMAMOTO Takashi
>> ___
>> Containers mailing list
>> [EMAIL PROTECTED]
>> https://lists.linux-foundation.org/mailman/listinfo/containers
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
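The ref_cnt approach Vaidy mentions can be sketched as a userspace analogue (a pthread mutex stands in for the container lru_lock; `meta_node`, `node_get`, `node_put`, and `demo` are illustrative names, not the actual -mm patch code). The isolating thread takes a reference while the lock is held, so a racing uncharge can only drop its own reference and the node stays valid until the final put:

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace analogue of a refcounted meta_page. */
struct meta_node {
	int ref_cnt;
	int page_id;
};

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* Take a reference; the caller must already hold lru_lock. */
static struct meta_node *node_get(struct meta_node *n)
{
	n->ref_cnt++;
	return n;
}

/* Drop a reference and free the node when the last one goes away. */
static void node_put(struct meta_node *n)
{
	int last;

	pthread_mutex_lock(&lru_lock);
	last = (--n->ref_cnt == 0);
	pthread_mutex_unlock(&lru_lock);
	if (last)
		free(n);
}

static int demo(void)
{
	struct meta_node *n = calloc(1, sizeof(*n));
	struct meta_node *pinned;
	int id;

	n->ref_cnt = 1;		/* reference held by the "charge" */
	n->page_id = 42;

	/* isolate path: pin the node before dropping the lock */
	pthread_mutex_lock(&lru_lock);
	pinned = node_get(n);
	pthread_mutex_unlock(&lru_lock);

	node_put(n);		/* a concurrent "uncharge" drops its ref */
	id = pinned->page_id;	/* still valid: we hold a reference */
	node_put(pinned);	/* last reference: node is freed here */
	return id;
}
```

The key property is that the free in `node_put()` can only happen once every holder, including the isolate path, has dropped its reference.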
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
YAMAMOTO Takashi wrote:
>> +unsigned long mem_container_isolate_pages(unsigned long nr_to_scan,
>> +					struct list_head *dst,
>> +					unsigned long *scanned, int order,
>> +					int mode, struct zone *z,
>> +					struct mem_container *mem_cont,
>> +					int active)
>> +{
>> +	unsigned long nr_taken = 0;
>> +	struct page *page;
>> +	unsigned long scan;
>> +	LIST_HEAD(mp_list);
>> +	struct list_head *src;
>> +	struct meta_page *mp;
>> +
>> +	if (active)
>> +		src = &mem_cont->active_list;
>> +	else
>> +		src = &mem_cont->inactive_list;
>> +
>> +	for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
>> +		mp = list_entry(src->prev, struct meta_page, lru);
>
> what prevents another thread from freeing mp here?

mem_cont->lru_lock protects the list and validity of mp. If we hold
mem_cont->lru_lock for this entire loop, then we preserve the validity
of mp. However, that will be holding up container charge and uncharge.

This entire routine is called with zone->lru_lock held by the caller.
So within a zone, this routine is serialized.

However, page uncharge may race with isolate page. But will that lead
to any corruption of the list? We may be holding the lock for too
much time just to be on the safe side.

Please allow us some time to verify whether this is indeed inadequate
locking that will lead to corruption of the list.

Thanks for pointing out this situation.

--Vaidy

>> +		spin_lock(&mem_cont->lru_lock);
>> +		if (mp)
>> +			page = mp->page;
>> +		spin_unlock(&mem_cont->lru_lock);
>> +		if (!mp)
>> +			continue;
>
> YAMAMOTO Takashi
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
> +unsigned long mem_container_isolate_pages(unsigned long nr_to_scan,
> +					struct list_head *dst,
> +					unsigned long *scanned, int order,
> +					int mode, struct zone *z,
> +					struct mem_container *mem_cont,
> +					int active)
> +{
> +	unsigned long nr_taken = 0;
> +	struct page *page;
> +	unsigned long scan;
> +	LIST_HEAD(mp_list);
> +	struct list_head *src;
> +	struct meta_page *mp;
> +
> +	if (active)
> +		src = &mem_cont->active_list;
> +	else
> +		src = &mem_cont->inactive_list;
> +
> +	for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
> +		mp = list_entry(src->prev, struct meta_page, lru);

what prevents another thread from freeing mp here?

> +		spin_lock(&mem_cont->lru_lock);
> +		if (mp)
> +			page = mp->page;
> +		spin_unlock(&mem_cont->lru_lock);
> +		if (!mp)
> +			continue;

YAMAMOTO Takashi
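The pattern the review is probing can be sketched in userspace (a pthread mutex replaces the spinlock; the minimal `list_head` helpers and the `isolate`/`demo` names are illustrative, not kernel code). The point is that the list tail is only dereferenced while the lock is held, and isolated entries are moved to a private list before the lock is dropped:

```c
#include <pthread.h>

/* Minimal doubly linked list, mirroring the kernel's list_head idiom. */
struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void list_add(struct list_head *e, struct list_head *h)
{
	e->next = h->next;
	e->prev = h;
	h->next->prev = e;
	h->next = e;
}

static pthread_mutex_t lru_lock = PTHREAD_MUTEX_INITIALIZER;

/* Move up to nr_to_scan entries from the tail of src to dst,
 * holding lru_lock across every dereference of the list. */
static unsigned long isolate(struct list_head *src, struct list_head *dst,
			     unsigned long nr_to_scan)
{
	unsigned long taken = 0;

	pthread_mutex_lock(&lru_lock);
	while (taken < nr_to_scan && !list_empty(src)) {
		struct list_head *e = src->prev;	/* read under lock */
		list_del(e);
		list_add(e, dst);
		taken++;
	}
	pthread_mutex_unlock(&lru_lock);
	return taken;
}

static unsigned long demo(void)
{
	struct list_head src, dst, nodes[5];
	unsigned long i;

	list_init(&src);
	list_init(&dst);
	for (i = 0; i < 5; i++)
		list_add(&nodes[i], &src);
	return isolate(&src, &dst, 3);	/* isolate 3 of the 5 entries */
}
```

Reading `src->prev` before taking the lock, as in the quoted hunk, leaves a window in which a concurrent uncharge can free the entry.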
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
On Mon, Jul 30, 2007 at 07:07:58PM +0530, Dhaval Giani wrote:
> Hi Balbir,
>
> > diff -puN mm/memcontrol.c~mem-control-lru-and-reclaim mm/memcontrol.c
> > --- linux-2.6.23-rc1-mm1/mm/memcontrol.c~mem-control-lru-and-reclaim	2007-07-28 01:12:50.0 +0530
> > +++ linux-2.6.23-rc1-mm1-balbir/mm/memcontrol.c	2007-07-28 01:12:50.0 +0530
> >
> > /*
> >  * The memory controller data structure. The memory controller controls both
> > @@ -51,6 +54,10 @@ struct mem_container {
> > 	 */
> > 	struct list_head active_list;
> > 	struct list_head inactive_list;
> > +	/*
> > +	 * spin_lock to protect the per container LRU
> > +	 */
> > +	spinlock_t lru_lock;
> > };
>
> The spinlock is not annotated by lockdep. The following patch should do
> it.
>
> Signed-off-by: Dhaval Giani <[EMAIL PROTECTED]>
> Signed-off-by: Gautham Shenoy R <[EMAIL PROTECTED]>

Gautham R Shenoy

>
> Index: linux-2.6.23-rc1/mm/memcontrol.c
> ===
> --- linux-2.6.23-rc1.orig/mm/memcontrol.c	2007-07-30 17:27:24.0 +0530
> +++ linux-2.6.23-rc1/mm/memcontrol.c	2007-07-30 18:43:40.0 +0530
> @@ -501,6 +501,9 @@
>
>  static struct mem_container init_mem_container;
>
> +/* lockdep should know about lru_lock */
> +static struct lock_class_key lru_lock_key;
> +
>  static struct container_subsys_state *
>  mem_container_create(struct container_subsys *ss, struct container *cont)
>  {
> @@ -519,6 +522,7 @@
>  	INIT_LIST_HEAD(&mem->active_list);
>  	INIT_LIST_HEAD(&mem->inactive_list);
>  	spin_lock_init(&mem->lru_lock);
> +	lockdep_set_class(&mem->lru_lock, &lru_lock_key);
>  	mem->control_type = MEM_CONTAINER_TYPE_ALL;
>  	return &mem->css;
>  }
> --
> regards,
> Dhaval
>
> I would like to change the world but they don't give me the source code!

--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
On Mon, 2007-07-30 at 19:07 +0530, Dhaval Giani wrote:
> Hi Balbir,
>
> > @@ -51,6 +54,10 @@ struct mem_container {
> > 	 */
> > 	struct list_head active_list;
> > 	struct list_head inactive_list;
> > +	/*
> > +	 * spin_lock to protect the per container LRU
> > +	 */
> > +	spinlock_t lru_lock;
> > };
>
> The spinlock is not annotated by lockdep. The following patch should do
> it.

One does not need explicit lockdep annotations unless there is a
non-obvious use of the locks. A typical example of that would be the
inode locks, which get placed differently in the various filesystems'
locking hierarchies and might hence seem to generate contradictory
locking rules - even though they are consistent within a particular
filesystem.

So unless there are 2 or more distinct locking hierarchies this one
lock partakes in, there is no need for this annotation. Was this patch
driven by a lockdep report?

> Signed-off-by: Dhaval Giani <[EMAIL PROTECTED]>
> Signed-off-by: Gautham Shenoy R <[EMAIL PROTECTED]>
>
> Index: linux-2.6.23-rc1/mm/memcontrol.c
> ===
> --- linux-2.6.23-rc1.orig/mm/memcontrol.c	2007-07-30 17:27:24.0 +0530
> +++ linux-2.6.23-rc1/mm/memcontrol.c	2007-07-30 18:43:40.0 +0530
> @@ -501,6 +501,9 @@
>
>  static struct mem_container init_mem_container;
>
> +/* lockdep should know about lru_lock */
> +static struct lock_class_key lru_lock_key;
> +
>  static struct container_subsys_state *
>  mem_container_create(struct container_subsys *ss, struct container *cont)
>  {
> @@ -519,6 +522,7 @@
>  	INIT_LIST_HEAD(&mem->active_list);
>  	INIT_LIST_HEAD(&mem->inactive_list);
>  	spin_lock_init(&mem->lru_lock);
> +	lockdep_set_class(&mem->lru_lock, &lru_lock_key);
>  	mem->control_type = MEM_CONTAINER_TYPE_ALL;
>  	return &mem->css;
>  }
Re: [-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
Hi Balbir,

> diff -puN mm/memcontrol.c~mem-control-lru-and-reclaim mm/memcontrol.c
> --- linux-2.6.23-rc1-mm1/mm/memcontrol.c~mem-control-lru-and-reclaim	2007-07-28 01:12:50.0 +0530
> +++ linux-2.6.23-rc1-mm1-balbir/mm/memcontrol.c	2007-07-28 01:12:50.0 +0530
>
> /*
>  * The memory controller data structure. The memory controller controls both
> @@ -51,6 +54,10 @@ struct mem_container {
> 	 */
> 	struct list_head active_list;
> 	struct list_head inactive_list;
> +	/*
> +	 * spin_lock to protect the per container LRU
> +	 */
> +	spinlock_t lru_lock;
> };

The spinlock is not annotated by lockdep. The following patch should do
it.

Signed-off-by: Dhaval Giani <[EMAIL PROTECTED]>
Signed-off-by: Gautham Shenoy R <[EMAIL PROTECTED]>

Index: linux-2.6.23-rc1/mm/memcontrol.c
===
--- linux-2.6.23-rc1.orig/mm/memcontrol.c	2007-07-30 17:27:24.0 +0530
+++ linux-2.6.23-rc1/mm/memcontrol.c	2007-07-30 18:43:40.0 +0530
@@ -501,6 +501,9 @@

 static struct mem_container init_mem_container;

+/* lockdep should know about lru_lock */
+static struct lock_class_key lru_lock_key;
+
 static struct container_subsys_state *
 mem_container_create(struct container_subsys *ss, struct container *cont)
 {
@@ -519,6 +522,7 @@
 	INIT_LIST_HEAD(&mem->active_list);
 	INIT_LIST_HEAD(&mem->inactive_list);
 	spin_lock_init(&mem->lru_lock);
+	lockdep_set_class(&mem->lru_lock, &lru_lock_key);
 	mem->control_type = MEM_CONTAINER_TYPE_ALL;
 	return &mem->css;
 }
--
regards,
Dhaval

I would like to change the world but they don't give me the source code!
[-mm PATCH 6/9] Memory controller add per container LRU and reclaim (v4)
Changelog since v3:
1. Added reclaim retry logic so that pages coming in from swap cache
   (due to reclaim) do not overwhelm the container and cause an OOM.

Changelog:
1. Fix probable NULL pointer dereference based on review comments by
   YAMAMOTO Takashi

Add the meta_page to the per container LRU. The reclaim algorithm has
been modified to make isolate_lru_pages() a pluggable component. The
scan_control data structure now accepts the container on behalf of which
reclaim is carried out. try_to_free_pages() has been extended to become
container aware.

Signed-off-by: Pavel Emelianov <[EMAIL PROTECTED]>
Signed-off-by: <[EMAIL PROTECTED]>
---

 include/linux/memcontrol.h  |   11 +++
 include/linux/res_counter.h |   23 +++
 include/linux/swap.h        |    3
 mm/memcontrol.c             |  139 +++-
 mm/swap.c                   |    2
 mm/vmscan.c                 |  126 +++
 6 files changed, 278 insertions(+), 26 deletions(-)

diff -puN include/linux/memcontrol.h~mem-control-lru-and-reclaim include/linux/memcontrol.h
--- linux-2.6.23-rc1-mm1/include/linux/memcontrol.h~mem-control-lru-and-reclaim	2007-07-28 01:12:50.0 +0530
+++ linux-2.6.23-rc1-mm1-balbir/include/linux/memcontrol.h	2007-07-28 01:12:50.0 +0530
@@ -31,6 +31,13 @@ extern void page_assign_meta_page(struct
 extern struct meta_page *page_get_meta_page(struct page *page);
 extern int mem_container_charge(struct page *page, struct mm_struct *mm);
 extern void mem_container_uncharge(struct meta_page *mp);
+extern void mem_container_move_lists(struct meta_page *mp, bool active);
+extern unsigned long mem_container_isolate_pages(unsigned long nr_to_scan,
+					struct list_head *dst,
+					unsigned long *scanned, int order,
+					int mode, struct zone *z,
+					struct mem_container *mem_cont,
+					int active);

 static inline void mem_container_uncharge_page(struct page *page)
 {
@@ -70,6 +77,10 @@ static inline void mem_container_uncharg
 {
 }

+static inline void mem_container_move_lists(struct meta_page *mp, bool active)
+{
+}
+
 #endif /* CONFIG_CONTAINER_MEM_CONT */

 #endif /* _LINUX_MEMCONTROL_H */
diff -puN include/linux/res_counter.h~mem-control-lru-and-reclaim include/linux/res_counter.h
--- linux-2.6.23-rc1-mm1/include/linux/res_counter.h~mem-control-lru-and-reclaim	2007-07-28 01:12:50.0 +0530
+++ linux-2.6.23-rc1-mm1-balbir/include/linux/res_counter.h	2007-07-28 01:12:50.0 +0530
@@ -99,4 +99,27 @@ int res_counter_charge(struct res_counte
 void res_counter_uncharge_locked(struct res_counter *counter, unsigned long val);
 void res_counter_uncharge(struct res_counter *counter, unsigned long val);

+static inline bool res_counter_limit_check_locked(struct res_counter *cnt)
+{
+	if (cnt->usage < cnt->limit)
+		return true;
+
+	return false;
+}
+
+/*
+ * Helper function to detect if the container is within its limit or
+ * not. It is currently called from container_rss_prepare()
+ */
+static inline bool res_counter_check_under_limit(struct res_counter *cnt)
+{
+	bool ret;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cnt->lock, flags);
+	ret = res_counter_limit_check_locked(cnt);
+	spin_unlock_irqrestore(&cnt->lock, flags);
+	return ret;
+}
+
 #endif
diff -puN include/linux/swap.h~mem-control-lru-and-reclaim include/linux/swap.h
--- linux-2.6.23-rc1-mm1/include/linux/swap.h~mem-control-lru-and-reclaim	2007-07-28 01:12:50.0 +0530
+++ linux-2.6.23-rc1-mm1-balbir/include/linux/swap.h	2007-07-28 01:12:50.0 +0530
@@ -6,6 +6,7 @@
 #include <linux/mmzone.h>
 #include <linux/list.h>
 #include <linux/sched.h>
+#include <linux/memcontrol.h>

 #include <asm/atomic.h>
 #include <asm/page.h>
@@ -191,6 +192,8 @@ extern void swap_setup(void);

 /* linux/mm/vmscan.c */
 extern unsigned long try_to_free_pages(struct zone **zones, int order,
					gfp_t gfp_mask);
+extern unsigned long try_to_free_mem_container_pages(struct mem_container *mem);
+extern int __isolate_lru_page(struct page *page, int mode);
 extern unsigned long shrink_all_memory(unsigned long nr_pages);
 extern int vm_swappiness;
 extern int remove_mapping(struct address_space *mapping, struct page *page);
diff -puN mm/memcontrol.c~mem-control-lru-and-reclaim mm/memcontrol.c
--- linux-2.6.23-rc1-mm1/mm/memcontrol.c~mem-control-lru-and-reclaim	2007-07-28 01:12:50.0 +0530
+++ linux-2.6.23-rc1-mm1-balbir/mm/memcontrol.c	2007-07-28 01:12:50.0 +0530
@@ -24,8 +24,11 @@
 #include <linux/page-flags.h>
 #include <linux/bit_spinlock.h>
 #include <linux/rcupdate.h>
+#include <linux/swap.h>
+#include <linux/spinlock.h>

 struct container_subsys mem_container_subsys;
+static const int MEM_CONTAINER_RECLAIM_RETRIES = 5;

 /*
  * The memory controller data structure. The memory controller controls both
@@ -51,6 +54,10 @@ struct mem_container
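The res_counter_check_under_limit() helper in the patch can be exercised as a userspace sketch (a pthread mutex replaces the irq-safe spinlock; field and function names mirror the patch, but this is an illustration, not the kernel code). The usage/limit comparison happens under the counter's lock so the check sees a consistent pair of values:

```c
#include <pthread.h>

/* Userspace analogue of the patch's res_counter. */
struct res_counter {
	unsigned long usage;
	unsigned long limit;
	pthread_mutex_t lock;
};

static int res_counter_limit_check_locked(const struct res_counter *cnt)
{
	return cnt->usage < cnt->limit;	/* caller holds cnt->lock */
}

static int res_counter_check_under_limit(struct res_counter *cnt)
{
	int ret;

	pthread_mutex_lock(&cnt->lock);
	ret = res_counter_limit_check_locked(cnt);
	pthread_mutex_unlock(&cnt->lock);
	return ret;
}

static int demo(void)
{
	struct res_counter cnt = {
		.usage = 10, .limit = 20,
		.lock = PTHREAD_MUTEX_INITIALIZER,
	};
	int under = res_counter_check_under_limit(&cnt);	/* 10 < 20 */

	cnt.usage = 20;		/* at the limit counts as not under */
	return under && !res_counter_check_under_limit(&cnt);
}
```

Note the strict `<`: a container whose usage equals its limit is treated as over, which is what makes the reclaim retry loop in the changelog terminate.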