Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-11-14 Thread Johannes Weiner
On Wed, Nov 14, 2012 at 02:59:30PM +0100, Michal Hocko wrote:
> On Tue 13-11-12 16:10:41, Johannes Weiner wrote:
> > Would it make sense to stick a wait_on_page_locked() in there just so
> > that we don't busy spin on a page under migration/reclaim?
> 
> Hmm, this would also mean that get_page_unless_zero would fail as well
> and so we would schedule in mem_cgroup_force_empty_list. It is true that
> there might be no other runnable task so we can busy loop so yes this
> would help. Care to cook the patch?

Eventually get_page_unless_zero() would fail but we could still spin
on a page while it's off the LRU and migration performs writeback on
it e.g.  cond_resched() does not necessarily schedule just because
there is another runnable task, I think, it's voluntary preemption
when the task needs rescheduling anyway, not yield.

Maybe not worth bothering...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-11-14 Thread Michal Hocko
On Tue 13-11-12 16:10:41, Johannes Weiner wrote:
> On Tue, Oct 30, 2012 at 11:35:59AM +0100, Michal Hocko wrote:
> > On Mon 29-10-12 15:00:22, Andrew Morton wrote:
> > > On Mon, 29 Oct 2012 17:58:45 +0400
> > > Glauber Costa  wrote:
> > > 
> > > > > + * move charges to its parent or the root cgroup if the group has no
> > > > > + * parent (aka use_hierarchy==0).
> > > > > + * Although this might fail (get_page_unless_zero, isolate_lru_page 
> > > > > or
> > > > > + * mem_cgroup_move_account fails) the failure is always temporary and
> > > > > + * it signals a race with a page removal/uncharge or migration. In 
> > > > > the
> > > > > + * first case the page is on the way out and it will vanish from the 
> > > > > LRU
> > > > > + * on the next attempt and the call should be retried later.
> > > > > + * Isolation from the LRU fails only if page has been isolated from
> > > > > + * the LRU since we looked at it and that usually means either global
> > > > > + * reclaim or migration going on. The page will either get back to 
> > > > > the
> > > > > + * LRU or vanish.
> > > > 
> > > > I just wonder for how long can it go in the worst case?
> > > 
> > > If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
> > 
> > You are right, if the rmdir (resp. echo > force_empty) at SCHED_FIFO
> > races with put_page (on a shared page) which gets preempted after
> > put_page_testzero and before __page_cache_release then we are screwed:
> > 
> > put_page(page)
> >   put_page_testzero
> >> LRU>
> > mem_cgroup_force_empty_list
> >   page = list_entry(list->prev, struct page, lru);
> >   mem_cgroup_move_parent(page)
> > get_page_unless_zero 
> >   cond_resched() 
> > 
> > The race window is really small but it is definitely possible. I am not
> > happy about this state and it should be probably mentioned in the
> > patch description but I do not see any way around (except for hacks like
> > sched_setscheduler for the current which is, ehm...) and still keep
> > do_not_fail contract here.
> > 
> > Can we consider this as a corner case (it is much easier to kill a
> > machine with SCHED_FIFO than this anyway) or the concern is really
> > strong and we should come with a solution before this can get merged?
> 
> Wouldn't the much bigger race window be reclaim having the page
> isolated and SCHED_FIFO preventing it from putback?

We wouldn't see the page on the LRU then, right?

> I also don't think this is a new class of problem, though.
> 
> Would it make sense to stick a wait_on_page_locked() in there just so
> that we don't busy spin on a page under migration/reclaim?

Hmm, this would also mean that get_page_unless_zero would fail as well
and so we would schedule in mem_cgroup_force_empty_list. It is true that
there might be no other runnable task so we can busy loop so yes this
would help. Care to cook the patch?

Thanks
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-11-14 Thread Michal Hocko
On Tue 13-11-12 16:10:41, Johannes Weiner wrote:
 On Tue, Oct 30, 2012 at 11:35:59AM +0100, Michal Hocko wrote:
  On Mon 29-10-12 15:00:22, Andrew Morton wrote:
   On Mon, 29 Oct 2012 17:58:45 +0400
   Glauber Costa glom...@parallels.com wrote:
   
 + * move charges to its parent or the root cgroup if the group has no
 + * parent (aka use_hierarchy==0).
 + * Although this might fail (get_page_unless_zero, isolate_lru_page 
 or
 + * mem_cgroup_move_account fails) the failure is always temporary and
 + * it signals a race with a page removal/uncharge or migration. In 
 the
 + * first case the page is on the way out and it will vanish from the 
 LRU
 + * on the next attempt and the call should be retried later.
 + * Isolation from the LRU fails only if page has been isolated from
 + * the LRU since we looked at it and that usually means either global
 + * reclaim or migration going on. The page will either get back to 
 the
 + * LRU or vanish.

I just wonder for how long can it go in the worst case?
   
   If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
  
  You are right, if the rmdir (resp. echo  force_empty) at SCHED_FIFO
  races with put_page (on a shared page) which gets preempted after
  put_page_testzero and before __page_cache_release then we are screwed:
  
  put_page(page)
put_page_testzero
preempted and page still on 
  LRU
  mem_cgroup_force_empty_list
page = list_entry(list-prev, struct page, lru);
mem_cgroup_move_parent(page)
  get_page_unless_zero fails
cond_resched() scheduled again
  
  The race window is really small but it is definitely possible. I am not
  happy about this state and it should be probably mentioned in the
  patch description but I do not see any way around (except for hacks like
  sched_setscheduler for the current which is, ehm...) and still keep
  do_not_fail contract here.
  
  Can we consider this as a corner case (it is much easier to kill a
  machine with SCHED_FIFO than this anyway) or the concern is really
  strong and we should come with a solution before this can get merged?
 
 Wouldn't the much bigger race window be reclaim having the page
 isolated and SCHED_FIFO preventing it from putback?

We wouldn't see the page on the LRU then, right?

 I also don't think this is a new class of problem, though.
 
 Would it make sense to stick a wait_on_page_locked() in there just so
 that we don't busy spin on a page under migration/reclaim?

Hmm, this would also mean that get_page_unless_zero would fail as well
and so we would schedule in mem_cgroup_force_empty_list. It is true that
there might be no other runnable task so we can busy loop so yes this
would help. Care to cook the patch?

Thanks
-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-11-14 Thread Johannes Weiner
On Wed, Nov 14, 2012 at 02:59:30PM +0100, Michal Hocko wrote:
 On Tue 13-11-12 16:10:41, Johannes Weiner wrote:
  Would it make sense to stick a wait_on_page_locked() in there just so
  that we don't busy spin on a page under migration/reclaim?
 
 Hmm, this would also mean that get_page_unless_zero would fail as well
 and so we would schedule in mem_cgroup_force_empty_list. It is true that
 there might be no other runnable task so we can busy loop so yes this
 would help. Care to cook the patch?

Eventually get_page_unless_zero() would fail but we could still spin
on a page while it's off the LRU and migration performs writeback on
it e.g.  cond_resched() does not necessarily schedule just because
there is another runnable task, I think, it's voluntary preemption
when the task needs rescheduling anyway, not yield.

Maybe not worth bothering...
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-11-13 Thread Johannes Weiner
On Tue, Oct 30, 2012 at 11:35:59AM +0100, Michal Hocko wrote:
> On Mon 29-10-12 15:00:22, Andrew Morton wrote:
> > On Mon, 29 Oct 2012 17:58:45 +0400
> > Glauber Costa  wrote:
> > 
> > > > + * move charges to its parent or the root cgroup if the group has no
> > > > + * parent (aka use_hierarchy==0).
> > > > + * Although this might fail (get_page_unless_zero, isolate_lru_page or
> > > > + * mem_cgroup_move_account fails) the failure is always temporary and
> > > > + * it signals a race with a page removal/uncharge or migration. In the
> > > > + * first case the page is on the way out and it will vanish from the 
> > > > LRU
> > > > + * on the next attempt and the call should be retried later.
> > > > + * Isolation from the LRU fails only if page has been isolated from
> > > > + * the LRU since we looked at it and that usually means either global
> > > > + * reclaim or migration going on. The page will either get back to the
> > > > + * LRU or vanish.
> > > 
> > > I just wonder for how long can it go in the worst case?
> > 
> > If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
> 
> You are right, if the rmdir (resp. echo > force_empty) at SCHED_FIFO
> races with put_page (on a shared page) which gets preempted after
> put_page_testzero and before __page_cache_release then we are screwed:
> 
>   put_page(page)
> put_page_testzero
>  LRU>
> mem_cgroup_force_empty_list
>   page = list_entry(list->prev, struct page, lru);
>   mem_cgroup_move_parent(page)
> get_page_unless_zero 
>   cond_resched() 
> 
> The race window is really small but it is definitely possible. I am not
> happy about this state and it should be probably mentioned in the
> patch description but I do not see any way around (except for hacks like
> sched_setscheduler for the current which is, ehm...) and still keep
> do_not_fail contract here.
> 
> Can we consider this as a corner case (it is much easier to kill a
> machine with SCHED_FIFO than this anyway) or the concern is really
> strong and we should come with a solution before this can get merged?

Wouldn't the much bigger race window be reclaim having the page
isolated and SCHED_FIFO preventing it from putback?

I also don't think this is a new class of problem, though.

Would it make sense to stick a wait_on_page_locked() in there just so
that we don't busy spin on a page under migration/reclaim?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-11-13 Thread Johannes Weiner
On Tue, Oct 30, 2012 at 11:35:59AM +0100, Michal Hocko wrote:
 On Mon 29-10-12 15:00:22, Andrew Morton wrote:
  On Mon, 29 Oct 2012 17:58:45 +0400
  Glauber Costa glom...@parallels.com wrote:
  
+ * move charges to its parent or the root cgroup if the group has no
+ * parent (aka use_hierarchy==0).
+ * Although this might fail (get_page_unless_zero, isolate_lru_page or
+ * mem_cgroup_move_account fails) the failure is always temporary and
+ * it signals a race with a page removal/uncharge or migration. In the
+ * first case the page is on the way out and it will vanish from the 
LRU
+ * on the next attempt and the call should be retried later.
+ * Isolation from the LRU fails only if page has been isolated from
+ * the LRU since we looked at it and that usually means either global
+ * reclaim or migration going on. The page will either get back to the
+ * LRU or vanish.
   
   I just wonder for how long can it go in the worst case?
  
  If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
 
 You are right, if the rmdir (resp. echo  force_empty) at SCHED_FIFO
 races with put_page (on a shared page) which gets preempted after
 put_page_testzero and before __page_cache_release then we are screwed:
 
   put_page(page)
 put_page_testzero
 preempted and page still on 
 LRU
 mem_cgroup_force_empty_list
   page = list_entry(list-prev, struct page, lru);
   mem_cgroup_move_parent(page)
 get_page_unless_zero fails
   cond_resched() scheduled again
 
 The race window is really small but it is definitely possible. I am not
 happy about this state and it should be probably mentioned in the
 patch description but I do not see any way around (except for hacks like
 sched_setscheduler for the current which is, ehm...) and still keep
 do_not_fail contract here.
 
 Can we consider this as a corner case (it is much easier to kill a
 machine with SCHED_FIFO than this anyway) or the concern is really
 strong and we should come with a solution before this can get merged?

Wouldn't the much bigger race window be reclaim having the page
isolated and SCHED_FIFO preventing it from putback?

I also don't think this is a new class of problem, though.

Would it make sense to stick a wait_on_page_locked() in there just so
that we don't busy spin on a page under migration/reclaim?
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-31 Thread Andrew Morton
On Tue, 30 Oct 2012 11:35:59 +0100
Michal Hocko  wrote:

> > If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
>
> ...
>
> Can we consider this as a corner case (it is much easier to kill a
> machine with SCHED_FIFO than this anyway) or the concern is really
> strong and we should come with a solution before this can get merged?

Sure.  SCHED_FIFO can be used to permanently block all kernel threads
which pretty quickly results in a very sick kernel.  My observation was
just a general moan about the SCHED_FIFO wontfix problem :)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-31 Thread Andrew Morton
On Tue, 30 Oct 2012 11:35:59 +0100
Michal Hocko mho...@suse.cz wrote:

  If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!

 ...

 Can we consider this as a corner case (it is much easier to kill a
 machine with SCHED_FIFO than this anyway) or the concern is really
 strong and we should come with a solution before this can get merged?

Sure.  SCHED_FIFO can be used to permanently block all kernel threads
which pretty quickly results in a very sick kernel.  My observation was
just a general moan about the SCHED_FIFO wontfix problem :)

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-30 Thread Michal Hocko
On Mon 29-10-12 15:00:22, Andrew Morton wrote:
> On Mon, 29 Oct 2012 17:58:45 +0400
> Glauber Costa  wrote:
> 
> > > + * move charges to its parent or the root cgroup if the group has no
> > > + * parent (aka use_hierarchy==0).
> > > + * Although this might fail (get_page_unless_zero, isolate_lru_page or
> > > + * mem_cgroup_move_account fails) the failure is always temporary and
> > > + * it signals a race with a page removal/uncharge or migration. In the
> > > + * first case the page is on the way out and it will vanish from the LRU
> > > + * on the next attempt and the call should be retried later.
> > > + * Isolation from the LRU fails only if page has been isolated from
> > > + * the LRU since we looked at it and that usually means either global
> > > + * reclaim or migration going on. The page will either get back to the
> > > + * LRU or vanish.
> > 
> > I just wonder for how long can it go in the worst case?
> 
> If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!

You are right, if the rmdir (resp. echo > force_empty) at SCHED_FIFO
races with put_page (on a shared page) which gets preempted after
put_page_testzero and before __page_cache_release then we are screwed:

put_page(page)
  put_page_testzero
  
mem_cgroup_force_empty_list
  page = list_entry(list->prev, struct page, lru);
  mem_cgroup_move_parent(page)
get_page_unless_zero 
  cond_resched() 

The race window is really small but it is definitely possible. I am not
happy about this state and it should be probably mentioned in the
patch description but I do not see any way around (except for hacks like
sched_setscheduler for the current which is, ehm...) and still keep
do_not_fail contract here.

Can we consider this as a corner case (it is much easier to kill a
machine with SCHED_FIFO than this anyway) or the concern is really
strong and we should come with a solution before this can get merged?

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-30 Thread Michal Hocko
On Mon 29-10-12 15:00:22, Andrew Morton wrote:
 On Mon, 29 Oct 2012 17:58:45 +0400
 Glauber Costa glom...@parallels.com wrote:
 
   + * move charges to its parent or the root cgroup if the group has no
   + * parent (aka use_hierarchy==0).
   + * Although this might fail (get_page_unless_zero, isolate_lru_page or
   + * mem_cgroup_move_account fails) the failure is always temporary and
   + * it signals a race with a page removal/uncharge or migration. In the
   + * first case the page is on the way out and it will vanish from the LRU
   + * on the next attempt and the call should be retried later.
   + * Isolation from the LRU fails only if page has been isolated from
   + * the LRU since we looked at it and that usually means either global
   + * reclaim or migration going on. The page will either get back to the
   + * LRU or vanish.
  
  I just wonder for how long can it go in the worst case?
 
 If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!

You are right, if the rmdir (resp. echo  force_empty) at SCHED_FIFO
races with put_page (on a shared page) which gets preempted after
put_page_testzero and before __page_cache_release then we are screwed:

put_page(page)
  put_page_testzero
  preempted and page still on 
LRU
mem_cgroup_force_empty_list
  page = list_entry(list-prev, struct page, lru);
  mem_cgroup_move_parent(page)
get_page_unless_zero fails
  cond_resched() scheduled again

The race window is really small but it is definitely possible. I am not
happy about this state and it should be probably mentioned in the
patch description but I do not see any way around (except for hacks like
sched_setscheduler for the current which is, ehm...) and still keep
do_not_fail contract here.

Can we consider this as a corner case (it is much easier to kill a
machine with SCHED_FIFO than this anyway) or the concern is really
strong and we should come with a solution before this can get merged?

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Andrew Morton
On Mon, 29 Oct 2012 17:58:45 +0400
Glauber Costa  wrote:

> > + * move charges to its parent or the root cgroup if the group has no
> > + * parent (aka use_hierarchy==0).
> > + * Although this might fail (get_page_unless_zero, isolate_lru_page or
> > + * mem_cgroup_move_account fails) the failure is always temporary and
> > + * it signals a race with a page removal/uncharge or migration. In the
> > + * first case the page is on the way out and it will vanish from the LRU
> > + * on the next attempt and the call should be retried later.
> > + * Isolation from the LRU fails only if page has been isolated from
> > + * the LRU since we looked at it and that usually means either global
> > + * reclaim or migration going on. The page will either get back to the
> > + * LRU or vanish.
> 
> I just wonder for how long can it go in the worst case?

If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Glauber Costa
>>> + * move charges to its parent or the root cgroup if the group has no
>>> + * parent (aka use_hierarchy==0).
>>> + * Although this might fail (get_page_unless_zero, isolate_lru_page or
>>> + * mem_cgroup_move_account fails) the failure is always temporary and
>>> + * it signals a race with a page removal/uncharge or migration. In the
>>> + * first case the page is on the way out and it will vanish from the LRU
>>> + * on the next attempt and the call should be retried later.
>>> + * Isolation from the LRU fails only if page has been isolated from
>>> + * the LRU since we looked at it and that usually means either global
>>> + * reclaim or migration going on. The page will either get back to the
>>> + * LRU or vanish.
>>
>> I just wonder for how long can it go in the worst case?
>  
> That's a good question and to be honest I have no idea. The point is
> that it will terminate eventually and that the group is on the way out
> so the time to complete the removal is not a big deal IMHO. We had
> basically similar situation previously when we would need to repeat
> rmdir loop on EBUSY. The only change is that we do not have to retry
> anymore.
> 
> So the key point is to check whether my assumption about temporarily is
> correct and that we cannot block the rest of the kernel/userspace to
> proceed even though we are waiting for finalization. I believe this is
> true but... (last famous words?)
> 
At least for me, it seems that this will hold.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Michal Hocko
On Mon 29-10-12 17:58:45, Glauber Costa wrote:
> 
> > 
> > Changes since v1
> > - use kerndoc
> > - be more specific about mem_cgroup_move_parent possible failures
> > 
> > Signed-off-by: Michal Hocko 
> > Reviewed-by: Tejun Heo 
> Reviewed-by: Glauber Costa 

Thanks!

> > + * move charges to its parent or the root cgroup if the group has no
> > + * parent (aka use_hierarchy==0).
> > + * Although this might fail (get_page_unless_zero, isolate_lru_page or
> > + * mem_cgroup_move_account fails) the failure is always temporary and
> > + * it signals a race with a page removal/uncharge or migration. In the
> > + * first case the page is on the way out and it will vanish from the LRU
> > + * on the next attempt and the call should be retried later.
> > + * Isolation from the LRU fails only if page has been isolated from
> > + * the LRU since we looked at it and that usually means either global
> > + * reclaim or migration going on. The page will either get back to the
> > + * LRU or vanish.
> 
> I just wonder for how long can it go in the worst case?
 
That's a good question and to be honest I have no idea. The point is
that it will terminate eventually and that the group is on the way out
so the time to complete the removal is not a big deal IMHO. We had
basically similar situation previously when we would need to repeat
rmdir loop on EBUSY. The only change is that we do not have to retry
anymore.

So the key point is to check whether my assumption about temporarily is
correct and that we cannot block the rest of the kernel/userspace to
proceed even though we are waiting for finalization. I believe this is
true but... (last famous words?)

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Glauber Costa

> 
> Changes since v1
> - use kerndoc
> - be more specific about mem_cgroup_move_parent possible failures
> 
> Signed-off-by: Michal Hocko 
> Reviewed-by: Tejun Heo 
Reviewed-by: Glauber Costa 

> + * move charges to its parent or the root cgroup if the group has no
> + * parent (aka use_hierarchy==0).
> + * Although this might fail (get_page_unless_zero, isolate_lru_page or
> + * mem_cgroup_move_account fails) the failure is always temporary and
> + * it signals a race with a page removal/uncharge or migration. In the
> + * first case the page is on the way out and it will vanish from the LRU
> + * on the next attempt and the call should be retried later.
> + * Isolation from the LRU fails only if page has been isolated from
> + * the LRU since we looked at it and that usually means either global
> + * reclaim or migration going on. The page will either get back to the
> + * LRU or vanish.

I just wonder for how long can it go in the worst case?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Andrew Morton
On Mon, 29 Oct 2012 17:58:45 +0400
Glauber Costa glom...@parallels.com wrote:

  + * move charges to its parent or the root cgroup if the group has no
  + * parent (aka use_hierarchy==0).
  + * Although this might fail (get_page_unless_zero, isolate_lru_page or
  + * mem_cgroup_move_account fails) the failure is always temporary and
  + * it signals a race with a page removal/uncharge or migration. In the
  + * first case the page is on the way out and it will vanish from the LRU
  + * on the next attempt and the call should be retried later.
  + * Isolation from the LRU fails only if page has been isolated from
  + * the LRU since we looked at it and that usually means either global
  + * reclaim or migration going on. The page will either get back to the
  + * LRU or vanish.
 
 I just wonder for how long can it go in the worst case?

If the kernel is uniprocessor and the caller is SCHED_FIFO: ad infinitum!
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Glauber Costa

 
 Changes since v1
 - use kerndoc
 - be more specific about mem_cgroup_move_parent possible failures
 
 Signed-off-by: Michal Hocko mho...@suse.cz
 Reviewed-by: Tejun Heo t...@kernel.org
Reviewed-by: Glauber Costa glom...@parallels.com

 + * move charges to its parent or the root cgroup if the group has no
 + * parent (aka use_hierarchy==0).
 + * Although this might fail (get_page_unless_zero, isolate_lru_page or
 + * mem_cgroup_move_account fails) the failure is always temporary and
 + * it signals a race with a page removal/uncharge or migration. In the
 + * first case the page is on the way out and it will vanish from the LRU
 + * on the next attempt and the call should be retried later.
 + * Isolation from the LRU fails only if page has been isolated from
 + * the LRU since we looked at it and that usually means either global
 + * reclaim or migration going on. The page will either get back to the
 + * LRU or vanish.

I just wonder for how long can it go in the worst case?

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Michal Hocko
On Mon 29-10-12 17:58:45, Glauber Costa wrote:
 
  
  Changes since v1
  - use kerndoc
  - be more specific about mem_cgroup_move_parent possible failures
  
  Signed-off-by: Michal Hocko mho...@suse.cz
  Reviewed-by: Tejun Heo t...@kernel.org
 Reviewed-by: Glauber Costa glom...@parallels.com

Thanks!

  + * move charges to its parent or the root cgroup if the group has no
  + * parent (aka use_hierarchy==0).
  + * Although this might fail (get_page_unless_zero, isolate_lru_page or
  + * mem_cgroup_move_account fails) the failure is always temporary and
  + * it signals a race with a page removal/uncharge or migration. In the
  + * first case the page is on the way out and it will vanish from the LRU
  + * on the next attempt and the call should be retried later.
  + * Isolation from the LRU fails only if page has been isolated from
  + * the LRU since we looked at it and that usually means either global
  + * reclaim or migration going on. The page will either get back to the
  + * LRU or vanish.
 
 I just wonder for how long can it go in the worst case?
 
That's a good question and to be honest I have no idea. The point is
that it will terminate eventually and that the group is on the way out
so the time to complete the removal is not a big deal IMHO. We had
basically similar situation previously when we would need to repeat
rmdir loop on EBUSY. The only change is that we do not have to retry
anymore.

So the key point is to check whether my assumption about temporarily is
correct and that we cannot block the rest of the kernel/userspace to
proceed even though we are waiting for finalization. I believe this is
true but... (last famous words?)

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 3/6] memcg: Simplify mem_cgroup_force_empty_list error handling

2012-10-29 Thread Glauber Costa
 + * move charges to its parent or the root cgroup if the group has no
 + * parent (aka use_hierarchy==0).
 + * Although this might fail (get_page_unless_zero, isolate_lru_page or
 + * mem_cgroup_move_account fails) the failure is always temporary and
 + * it signals a race with a page removal/uncharge or migration. In the
 + * first case the page is on the way out and it will vanish from the LRU
 + * on the next attempt and the call should be retried later.
 + * Isolation from the LRU fails only if page has been isolated from
 + * the LRU since we looked at it and that usually means either global
 + * reclaim or migration going on. The page will either get back to the
 + * LRU or vanish.

 I just wonder for how long can it go in the worst case?
  
 That's a good question and to be honest I have no idea. The point is
 that it will terminate eventually and that the group is on the way out
 so the time to complete the removal is not a big deal IMHO. We had
 basically similar situation previously when we would need to repeat
 rmdir loop on EBUSY. The only change is that we do not have to retry
 anymore.
 
 So the key point is to check whether my assumption about temporarily is
 correct and that we cannot block the rest of the kernel/userspace to
 proceed even though we are waiting for finalization. I believe this is
 true but... (last famous words?)
 
At least for me, it seems that this will hold.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/