Re: [PATCH v2 3/3] memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting

2013-10-29 Thread Johannes Weiner
On Sun, Oct 27, 2013 at 10:30:17AM -0700, Greg Thelen wrote:
> As of v3.11-9444-g3ea67d0 "memcg: add per cgroup writeback pages
> accounting" memcg counter errors are possible when moving charged
> memory to a different memcg.  Charge movement occurs when processing
> writes to memory.force_empty, moving tasks to a memcg with
> memory.move_charge_at_immigrate=1, or memcg deletion.  An example
> showing the error after memory.force_empty:
>   $ cd /sys/fs/cgroup/memory
>   $ mkdir x
>   $ rm /data/tmp/file
>   $ (echo $BASHPID >> x/tasks && exec mmap_writer /data/tmp/file 1M) &
>   [1] 13600
>   $ grep ^mapped x/memory.stat
>   mapped_file 1048576
>   $ echo 13600 > tasks
>   $ echo 1 > x/memory.force_empty
>   $ grep ^mapped x/memory.stat
>   mapped_file 4503599627370496
> 
> mapped_file should end with 0.
>   4503599627370496 == 0x10,0000,0000,0000 == 0x100,0000,0000 pages
>   1048576          == 0x10,0000           == 0x100 pages
> 
> This issue only affects the source memcg on 64 bit machines; the
> destination memcg counters are correct.  So the rmdir case is not too
> important because such counters soon disappear with the entire
> memcg.  But the memory.force_empty and
> memory.move_charge_at_immigrate=1 cases are larger problems as the
> bogus counters are visible for the (possibly long) remaining life of
> the source memcg.
> 
> The problem is due to memcg use of __this_cpu_add(.., -nr_pages),
> which is subtly wrong because it negates the unsigned int nr_pages
> (either 1 or 512 for THP) before adding it to a signed long percpu
> counter.  When nr_pages=1, -nr_pages=0xffffffff.  On 64 bit machines
> stat->count[idx] is signed 64 bit.  So memcg's attempt to simply
> decrement a count (e.g. from 1 to 0) boils down to:
>   long count = 1
>   unsigned int nr_pages = 1
>   count += -nr_pages  /* -nr_pages == 0xffff,ffff */
>   count is now 0x1,0000,0000 instead of 0
> 
> The fix is to subtract the unsigned page count rather than adding its
> negation.  This only works once "percpu: fix this_cpu_sub() subtrahend
> casting for unsigneds" is applied to fix this_cpu_sub().
> 
> Signed-off-by: Greg Thelen <gthe...@google.com>
> Acked-by: Tejun Heo <t...@kernel.org>

Huh, it looked so innocent...  At first I thought 2/3 would fix this
case as well but the cast happens only after the negation, so the sign
extension does not happen.  Alright, then.
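
The ordering point can be illustrated with a small userspace sketch. This is not kernel code: the function names are hypothetical, and int64_t stands in for the kernel's 64 bit signed long, assuming a 32 bit unsigned int.

```c
#include <stdint.h>

/*
 * Negate first, then widen: the negation is done in unsigned int
 * arithmetic (assumed 32 bit), so it wraps modulo 2^32 and the
 * later conversion to 64 bits has no sign to extend.
 */
int64_t negate_then_widen(unsigned int nr_pages)
{
	return (int64_t)(-nr_pages);
}

/*
 * Widen first, then negate: the value is already a signed 64 bit
 * integer when it is negated, so the result is truly negative.
 */
int64_t widen_then_negate(unsigned int nr_pages)
{
	return -(int64_t)nr_pages;
}
```

Under those assumptions, negate_then_widen(1) is 0xffffffff while widen_then_negate(1) is -1. A cast applied inside the macro cannot help when the caller has already negated the unsigned value.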

Acked-by: Johannes Weiner <han...@cmpxchg.org>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH v2 3/3] memcg: use __this_cpu_sub() to dec stats to avoid incorrect subtrahend casting

2013-10-27 Thread Greg Thelen
As of v3.11-9444-g3ea67d0 "memcg: add per cgroup writeback pages
accounting" memcg counter errors are possible when moving charged
memory to a different memcg.  Charge movement occurs when processing
writes to memory.force_empty, moving tasks to a memcg with
memory.move_charge_at_immigrate=1, or memcg deletion.  An example
showing the error after memory.force_empty:
  $ cd /sys/fs/cgroup/memory
  $ mkdir x
  $ rm /data/tmp/file
  $ (echo $BASHPID >> x/tasks && exec mmap_writer /data/tmp/file 1M) &
  [1] 13600
  $ grep ^mapped x/memory.stat
  mapped_file 1048576
  $ echo 13600 > tasks
  $ echo 1 > x/memory.force_empty
  $ grep ^mapped x/memory.stat
  mapped_file 4503599627370496

mapped_file should end with 0.
  4503599627370496 == 0x10,0000,0000,0000 == 0x100,0000,0000 pages
  1048576          == 0x10,0000           == 0x100 pages

This issue only affects the source memcg on 64 bit machines; the
destination memcg counters are correct.  So the rmdir case is not too
important because such counters soon disappear with the entire
memcg.  But the memory.force_empty and
memory.move_charge_at_immigrate=1 cases are larger problems as the
bogus counters are visible for the (possibly long) remaining life of
the source memcg.

The problem is due to memcg use of __this_cpu_add(.., -nr_pages),
which is subtly wrong because it negates the unsigned int nr_pages
(either 1 or 512 for THP) before adding it to a signed long percpu
counter.  When nr_pages=1, -nr_pages=0xffffffff.  On 64 bit machines
stat->count[idx] is signed 64 bit.  So memcg's attempt to simply
decrement a count (e.g. from 1 to 0) boils down to:
  long count = 1
  unsigned int nr_pages = 1
  count += -nr_pages  /* -nr_pages == 0xffff,ffff */
  count is now 0x1,0000,0000 instead of 0

The fix is to subtract the unsigned page count rather than adding its
negation.  This only works once "percpu: fix this_cpu_sub() subtrahend
casting for unsigneds" is applied to fix this_cpu_sub().
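
As a rough userspace illustration of the arithmetic above (not the kernel implementation), the two decrement styles can be modelled as plain functions. The function names are hypothetical, int64_t stands in for the 64 bit signed long counter, and unsigned int is assumed to be 32 bits wide.

```c
#include <stdint.h>

/*
 * Models the pre-patch pattern, adding the negated page count.
 * -nr_pages is evaluated as unsigned int, so it becomes a large
 * positive value that is zero-extended when added to the counter.
 */
int64_t dec_by_adding_negation(int64_t count, unsigned int nr_pages)
{
	count += -nr_pages;
	return count;
}

/*
 * Models the patched pattern, subtracting the page count itself.
 * nr_pages is widened while still positive, so the subtraction
 * behaves as expected.
 */
int64_t dec_by_subtracting(int64_t count, unsigned int nr_pages)
{
	count -= nr_pages;
	return count;
}
```

With these assumptions, dec_by_adding_negation(1, 1) returns 0x1,0000,0000 and dec_by_subtracting(1, 1) returns 0. The THP case behaves the same way: starting from 512 and removing 512 pages also leaves 0x1,0000,0000 in the first model.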

Signed-off-by: Greg Thelen <gthe...@google.com>
Acked-by: Tejun Heo <t...@kernel.org>
---
 mm/memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index aa8185c..b7ace0f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3773,7 +3773,7 @@ void mem_cgroup_move_account_page_stat(struct mem_cgroup *from,
 {
 	/* Update stat data for mem_cgroup */
 	preempt_disable();
-	__this_cpu_add(from->stat->count[idx], -nr_pages);
+	__this_cpu_sub(from->stat->count[idx], nr_pages);
 	__this_cpu_add(to->stat->count[idx], nr_pages);
 	preempt_enable();
 }
-- 
1.8.4.1
