Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-09 Thread Vladimir Davydov
On 12/09/2013 07:22 PM, Michal Hocko wrote:
> On Wed 04-12-13 15:56:51, Vladimir Davydov wrote:
>> On 12/04/2013 02:08 PM, Glauber Costa wrote:
> Could you do something clever with just one flag? Probably yes. But I
> doubt it would
> be that much cleaner, this is just the way that patching sites work.
 Thank you for spending your time to listen to me.

>>> Don't worry! I thank you for carrying this forward.
>>>
 Let me try to explain what is bothering me.

 We have two state bits for each memcg, 'active' and 'activated'. There
 are two scenarios where the bits can be modified:

 1) The kmem limit is set on a memcg for the first time -
 memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
 which sets the 'activated' bit on success, then update static branching,
 then set the 'active' bit. All three actions are done atomically with
 respect to other tasks setting the limit due to the set_limit_mutex.
 After both bits are set, they never get cleared for the memcg.

>>> So far so good. But again, note how you yourself describe it:
>>> the actions are done atomically *with respect to other tasks setting the
>>> limit*
>>>
>>> But there are also tasks that are running their course naturally and
>>> just allocating
>>> memory. For those, some call sites will be on, some will be off. We need to 
>>> make
>>> sure that *none* of them uses the patched site until *all* of them are 
>>> patched.
>>> This has nothing to do with updates, this is all about the readers.
>>>
 2) When a subgroup of a kmem-active cgroup is created -
 memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
 then increase static branching refcounter, then call
 memcg_update_cache_sizes() for the new memcg, which may clear the
 'activated' bit on failure. After successful execution, the state bits
 never get cleared for the new memcg.

 In scenario 2 there is no need to bother about the flag setting order,
 because we don't have any tasks in the cgroup yet - the tasks can be
 moved in only after css_online finishes when we have both of the bits
 set and the static branching enabled. Actually, we already do not bother
 about it, because we have both bits set before the cgroup is fully
 initialized (memcg_update_cache_sizes() is called).

>>> Yes, after the first cgroup is set none of that matters. But it is just 
>>> easier
>>> and less error prone just to follow the same path every time. As I have 
>>> said,
>>> if you can come up with a more clever way to deal with the problem above
>>> that doesn't involve the double flag - and you can prove it works - I
>>> am definitely
>>> fine with it. But this is subtle code, and in the past - Michal can
>>> attest this - we've
>>> changed this being sure it would work just to see it explode in our faces.
>>>
>>> So although I am willing to review every patch for correctness on that
>>> front (I never
>>> said I liked the 2-flags scheme...), unless you have a bug or real
>>> problem on it,
>>> I would advise against changing it if it's just to make it more readable.
>>>
>>> But again, don't take me too seriously on this. If you and Michal think you 
>>> can
>>> come up with something better, I'm all for it.
>> All right, I finally get you :-)
>>
>> Although I still don't think we need the second flag, I now understand
>> that it's better not to change code that works fine, especially as the
>> change would make it neither more readable nor more effective. Since
>> I may be mistaken about the flags' usage (which is far from unlikely), it's
>> better to leave it as is rather than risk spurious
>> hangs caused by this modification.
>>
>> Thanks for the detailed explanation!
> It would be really great if we could push some of that into the
> comments, please?
>
> Anyway, reading this thread again, I guess I finally got what you meant,
> Vladimir.
> You are basically saying that the two stage enabling can be done
> by static_key_slow_inc in the first step and memcg_kmem_set_active
> in the second step without an additional flag.
> Assuming that the writers cannot race (they cannot currently because
> they are linearized by set_limit_mutex and memcg_create_mutex) and
> readers (charging paths) are _always_ checking the static key before
> checking active flags?

Right. There is no point in checking the static key after checking the
active flags, because the benefit of using static branching would
disappear then. So IMHO the only thing we need to ensure is that the
static key refcounter is incremented *before* the active bit is set.
That assures all static branches have been patched if a charge path
succeeds, because a charge path cannot succeed if the active bit is not
set. That said, we won't skip a commit or uncharge after a charge due to
an unpatched static branch. That's why I think the 'active' bit is enough.

Currently we have two flags 'activated' and 'active', and their usage
looks strange to me. Throughout the code we only have the following checks:
test_bit('active', 

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-09 Thread Michal Hocko
On Wed 04-12-13 15:56:51, Vladimir Davydov wrote:
> On 12/04/2013 02:08 PM, Glauber Costa wrote:
> >>> Could you do something clever with just one flag? Probably yes. But I
> >>> doubt it would
> >>> be that much cleaner, this is just the way that patching sites work.
> >> Thank you for spending your time to listen to me.
> >>
> > Don't worry! I thank you for carrying this forward.
> >
> >> Let me try to explain what is bothering me.
> >>
> >> We have two state bits for each memcg, 'active' and 'activated'. There
> >> are two scenarios where the bits can be modified:
> >>
> >> 1) The kmem limit is set on a memcg for the first time -
> >> memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
> >> which sets the 'activated' bit on success, then update static branching,
> >> then set the 'active' bit. All three actions are done atomically with
> >> respect to other tasks setting the limit due to the set_limit_mutex.
> >> After both bits are set, they never get cleared for the memcg.
> >>
> > So far so good. But again, note how you yourself describe it:
> > the actions are done atomically *with respect to other tasks setting the
> > limit*
> >
> > But there are also tasks that are running their course naturally and
> > just allocating
> > memory. For those, some call sites will be on, some will be off. We need to 
> > make
> > sure that *none* of them uses the patched site until *all* of them are 
> > patched.
> > This has nothing to do with updates, this is all about the readers.
> >
> >> 2) When a subgroup of a kmem-active cgroup is created -
> >> memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
> >> then increase static branching refcounter, then call
> >> memcg_update_cache_sizes() for the new memcg, which may clear the
> >> 'activated' bit on failure. After successful execution, the state bits
> >> never get cleared for the new memcg.
> >>
> >> In scenario 2 there is no need to bother about the flag setting order,
> >> because we don't have any tasks in the cgroup yet - the tasks can be
> >> moved in only after css_online finishes when we have both of the bits
> >> set and the static branching enabled. Actually, we already do not bother
> >> about it, because we have both bits set before the cgroup is fully
> >> initialized (memcg_update_cache_sizes() is called).
> >>
> > Yes, after the first cgroup is set none of that matters. But it is just 
> > easier
> > and less error prone just to follow the same path every time. As I have 
> > said,
> > if you can come up with a more clever way to deal with the problem above
> > that doesn't involve the double flag - and you can prove it works - I
> > am definitely
> > fine with it. But this is subtle code, and in the past - Michal can
> > attest this - we've
> > changed this being sure it would work just to see it explode in our faces.
> >
> > So although I am willing to review every patch for correctness on that
> > front (I never
> > said I liked the 2-flags scheme...), unless you have a bug or real
> > problem on it,
> > I would advise against changing it if it's just to make it more readable.
> >
> > But again, don't take me too seriously on this. If you and Michal think you 
> > can
> > come up with something better, I'm all for it.
> 
> All right, I finally get you :-)
> 
> Although I still don't think we need the second flag, I now understand
> that it's better not to change code that works fine, especially as the
> change would make it neither more readable nor more effective. Since
> I may be mistaken about the flags' usage (which is far from unlikely), it's
> better to leave it as is rather than risk spurious
> hangs caused by this modification.
> 
> Thanks for the detailed explanation!

It would be really great if we could push some of that into the
comments, please?

Anyway, reading this thread again, I guess I finally got what you meant,
Vladimir.
You are basically saying that the two stage enabling can be done
by static_key_slow_inc in the first step and memcg_kmem_set_active
in the second step without an additional flag.
Assuming that the writers cannot race (they cannot currently because
they are linearized by set_limit_mutex and memcg_create_mutex) and
readers (charging paths) are _always_ checking the static key before
checking active flags?

I guess this should work. But it would require a deep audit to verify that
the above holds in all places. For example we do not bother to check
static key during offline/free paths. I guess it should be harmless as
is but who knows...

I would rather see more detailed description of the current state first.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-09 Thread Michal Hocko
On Wed 04-12-13 15:56:51, Vladimir Davydov wrote:
 On 12/04/2013 02:08 PM, Glauber Costa wrote:
  Could you do something clever with just one flag? Probably yes. But I
  doubt it would
  be that much cleaner, this is just the way that patching sites work.
  Thank you for spending your time to listen to me.
 
  Don't worry! I thank you for carrying this forward.
 
  Let me try to explain what is bothering me.
 
  We have two state bits for each memcg, 'active' and 'activated'. There
  are two scenarios where the bits can be modified:
 
  1) The kmem limit is set on a memcg for the first time -
  memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
  which sets the 'activated' bit on success, then update static branching,
  then set the 'active' bit. All three actions are done atomically in
  respect to other tasks setting the limit due to the set_limit_mutex.
  After both bits are set, they never get cleared for the memcg.
 
  So far so good. But again, note how you yourself describe it:
  the cations are done atomically  *in respect to other tasks setting the 
  limit*
 
  But there are also tasks that are running its courses naturally and
  just allocating
  memory. For those, some call sites will be on, some will be off. We need to 
  make
  sure that *none* of them uses the patched site until *all* of them are 
  patched.
  This has nothing to do with updates, this is all about the readers.
 
  2) When a subgroup of a kmem-active cgroup is created -
  memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
  then increase static branching refcounter, then call
  memcg_update_cache_sizes() for the new memcg, which may clear the
  'activated' bit on failure. After successful execution, the state bits
  never get cleared for the new memcg.
 
  In scenario 2 there is no need bothering about the flags setting order,
  because we don't have any tasks in the cgroup yet - the tasks can be
  moved in only after css_online finishes when we have both of the bits
  set and the static branching enabled. Actually, we already do not bother
  about it, because we have both bits set before the cgroup is fully
  initialized (memcg_update_cache_sizes() is called).
 
Yes, after the first cgroup is set none of that matters. But it is just
easier and less error prone to follow the same path every time. As I have
said,
  if you can come up with a more clever way to deal with the problem above
  that doesn't involve the double flag - and you can prove it works - I
  am definitely
  fine with it. But this is subtle code, and in the past - Michal can
  attest this - we've
changed it, sure it would work, only to see it explode in our faces.
 
  So although I am willing to review every patch for correctness on that
  front (I never
said I liked the 2-flags scheme...), unless you have a bug or a real
problem with it, I would advise against changing it if it's just to make
it more readable.
 
  But again, don't take me too seriously on this. If you and Michal think you 
  can
  come up with something better, I'm all for it.
 
 All right, I finally get you :-)
 
Although I still don't think we need the second flag, I now understand
that it's better not to change code that works fine, especially when the
change makes it neither more readable nor more efficient. Since I may
well be mistaken about the flags' usage (it's by far not unlikely), it's
better to leave things as they are than to risk spurious hangs caused by
this modification.
 
 Thanks for the detailed explanation!

It would be really great if we could push some of that into the
comments, please?

Anyway, reading this thread again, I guess I finally got what you meant,
Vladimir. You are basically saying that the two-stage enabling can be done
by static_key_slow_inc in the first step and memcg_kmem_set_active
in the second step without an additional flag.
Assuming that the writers cannot race (they cannot currently because
they are linearized by set_limit_mutex and memcg_create_mutex) and
readers (charging paths) are _always_ checking the static key before
checking active flags?

I guess this should work. But it would require a deep audit to verify that
the above holds in all places. For example we do not bother to check
static key during offline/free paths. I guess it should be harmless as
is but who knows...

I would rather see more detailed description of the current state first.

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-09 Thread Vladimir Davydov
On 12/09/2013 07:22 PM, Michal Hocko wrote:
 On Wed 04-12-13 15:56:51, Vladimir Davydov wrote:
 On 12/04/2013 02:08 PM, Glauber Costa wrote:
 Could you do something clever with just one flag? Probably yes. But I
 doubt it would
 be that much cleaner, this is just the way that patching sites work.
 Thank you for spending your time to listen to me.

 Don't worry! I thank you for carrying this forward.

 Let me try to explain what is bothering me.

 We have two state bits for each memcg, 'active' and 'activated'. There
 are two scenarios where the bits can be modified:

 1) The kmem limit is set on a memcg for the first time -
 memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
 which sets the 'activated' bit on success, then update static branching,
 then set the 'active' bit. All three actions are done atomically in
 respect to other tasks setting the limit due to the set_limit_mutex.
 After both bits are set, they never get cleared for the memcg.

 So far so good. But again, note how you yourself describe it:
the actions are done atomically *in respect to other tasks setting the limit*

But there are also tasks that are running their course naturally and just allocating
 memory. For those, some call sites will be on, some will be off. We need to 
 make
 sure that *none* of them uses the patched site until *all* of them are 
 patched.
 This has nothing to do with updates, this is all about the readers.

 2) When a subgroup of a kmem-active cgroup is created -
 memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
 then increase static branching refcounter, then call
 memcg_update_cache_sizes() for the new memcg, which may clear the
 'activated' bit on failure. After successful execution, the state bits
 never get cleared for the new memcg.

In scenario 2 there is no need to bother about the order in which the flags are set,
 because we don't have any tasks in the cgroup yet - the tasks can be
 moved in only after css_online finishes when we have both of the bits
 set and the static branching enabled. Actually, we already do not bother
 about it, because we have both bits set before the cgroup is fully
 initialized (memcg_update_cache_sizes() is called).

Yes, after the first cgroup is set none of that matters. But it is just
easier and less error prone to follow the same path every time. As I have
said,
 if you can come up with a more clever way to deal with the problem above
 that doesn't involve the double flag - and you can prove it works - I
 am definitely
 fine with it. But this is subtle code, and in the past - Michal can
 attest this - we've
changed it, sure it would work, only to see it explode in our faces.

 So although I am willing to review every patch for correctness on that
 front (I never
said I liked the 2-flags scheme...), unless you have a bug or a real
problem with it, I would advise against changing it if it's just to make
it more readable.

 But again, don't take me too seriously on this. If you and Michal think you 
 can
 come up with something better, I'm all for it.
 All right, I finally get you :-)

Although I still don't think we need the second flag, I now understand
that it's better not to change code that works fine, especially when the
change makes it neither more readable nor more efficient. Since I may
well be mistaken about the flags' usage (it's by far not unlikely), it's
better to leave things as they are than to risk spurious hangs caused by
this modification.

 Thanks for the detailed explanation!
 It would be really great if we could push some of that into the
 comments, please?

Anyway, reading this thread again, I guess I finally got what you meant,
Vladimir. You are basically saying that the two-stage enabling can be done
by static_key_slow_inc in the first step and memcg_kmem_set_active
in the second step without an additional flag.
 Assuming that the writers cannot race (they cannot currently because
 they are linearized by set_limit_mutex and memcg_create_mutex) and
 readers (charging paths) are _always_ checking the static key before
 checking active flags?

Right. There is no point in checking the static key after checking the
active flags, because the benefit of using static branching would
disappear then. So IMHO the only thing we need to ensure is that the
static key refcounter is incremented *before* the active bit is set.
That guarantees all static branches have been patched by the time a
charge path succeeds, because a charge path cannot succeed if the active
bit is not set. In other words, we won't skip a commit or uncharge after
a successful charge due to an unpatched static branch. That's why I
think the 'active' bit alone is enough.

Currently we have two flags, 'activated' and 'active', and their usage
looks strange to me. Throughout the code we only have the following checks:

test_bit('active', state_mask)
test_bit('active', state_mask) && test_bit('activated', state_mask)

Since the 'active' bit is always 

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-04 Thread Vladimir Davydov
On 12/04/2013 02:08 PM, Glauber Costa wrote:
>>> Could you do something clever with just one flag? Probably yes. But I
>>> doubt it would
>>> be that much cleaner, this is just the way that patching sites work.
>> Thank you for spending your time to listen to me.
>>
> Don't worry! I thank you for carrying this forward.
>
>> Let me try to explain what is bothering me.
>>
>> We have two state bits for each memcg, 'active' and 'activated'. There
>> are two scenarios where the bits can be modified:
>>
>> 1) The kmem limit is set on a memcg for the first time -
>> memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
>> which sets the 'activated' bit on success, then update static branching,
>> then set the 'active' bit. All three actions are done atomically in
>> respect to other tasks setting the limit due to the set_limit_mutex.
>> After both bits are set, they never get cleared for the memcg.
>>
> So far so good. But again, note how you yourself describe it:
> the actions are done atomically *in respect to other tasks setting the limit*
>
> But there are also tasks that are running their course naturally and just allocating
> memory. For those, some call sites will be on, some will be off. We need to 
> make
> sure that *none* of them uses the patched site until *all* of them are 
> patched.
> This has nothing to do with updates, this is all about the readers.
>
>> 2) When a subgroup of a kmem-active cgroup is created -
>> memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
>> then increase static branching refcounter, then call
>> memcg_update_cache_sizes() for the new memcg, which may clear the
>> 'activated' bit on failure. After successful execution, the state bits
>> never get cleared for the new memcg.
>>
>> In scenario 2 there is no need to bother about the order in which the flags are set,
>> because we don't have any tasks in the cgroup yet - the tasks can be
>> moved in only after css_online finishes when we have both of the bits
>> set and the static branching enabled. Actually, we already do not bother
>> about it, because we have both bits set before the cgroup is fully
>> initialized (memcg_update_cache_sizes() is called).
>>
> Yes, after the first cgroup is set none of that matters. But it is just easier
> and less error prone to follow the same path every time. As I have said,
> if you can come up with a more clever way to deal with the problem above
> that doesn't involve the double flag - and you can prove it works - I
> am definitely
> fine with it. But this is subtle code, and in the past - Michal can
> attest this - we've
> changed it, sure it would work, only to see it explode in our faces.
>
> So although I am willing to review every patch for correctness on that
> front (I never
> said I liked the 2-flags scheme...), unless you have a bug or a real
> problem with it, I would advise against changing it if it's just to make
> it more readable.
>
> But again, don't take me too seriously on this. If you and Michal think you 
> can
> come up with something better, I'm all for it.

All right, I finally get you :-)

Although I still don't think we need the second flag, I now understand
that it's better not to change code that works fine, especially when the
change makes it neither more readable nor more efficient. Since I may
well be mistaken about the flags' usage (it's by far not unlikely), it's
better to leave things as they are than to risk spurious hangs caused by
this modification.

Thanks for the detailed explanation!


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-04 Thread Glauber Costa
>>
>> Could you do something clever with just one flag? Probably yes. But I
>> doubt it would
>> be that much cleaner, this is just the way that patching sites work.
>
> Thank you for spending your time to listen to me.
>
Don't worry! I thank you for carrying this forward.

> Let me try to explain what is bothering me.
>
> We have two state bits for each memcg, 'active' and 'activated'. There
> are two scenarios where the bits can be modified:
>
> 1) The kmem limit is set on a memcg for the first time -
> memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
> which sets the 'activated' bit on success, then update static branching,
> then set the 'active' bit. All three actions are done atomically in
> respect to other tasks setting the limit due to the set_limit_mutex.
> After both bits are set, they never get cleared for the memcg.
>
So far so good. But again, note how you yourself describe it:
the actions are done atomically *in respect to other tasks setting the limit*

But there are also tasks that are running their course naturally and just allocating
memory. For those, some call sites will be on, some will be off. We need to make
sure that *none* of them uses the patched site until *all* of them are patched.
This has nothing to do with updates, this is all about the readers.

> 2) When a subgroup of a kmem-active cgroup is created -
> memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
> then increase static branching refcounter, then call
> memcg_update_cache_sizes() for the new memcg, which may clear the
> 'activated' bit on failure. After successful execution, the state bits
> never get cleared for the new memcg.
>
> In scenario 2 there is no need to bother about the order in which the flags are set,
> because we don't have any tasks in the cgroup yet - the tasks can be
> moved in only after css_online finishes when we have both of the bits
> set and the static branching enabled. Actually, we already do not bother
> about it, because we have both bits set before the cgroup is fully
> initialized (memcg_update_cache_sizes() is called).
>
Yes, after the first cgroup is set none of that matters. But it is just easier
and less error prone to follow the same path every time. As I have said,
if you can come up with a more clever way to deal with the problem above
that doesn't involve the double flag - and you can prove it works - I
am definitely
fine with it. But this is subtle code, and in the past - Michal can
attest this - we've
changed it, sure it would work, only to see it explode in our faces.

So although I am willing to review every patch for correctness on that
front (I never
said I liked the 2-flags scheme...), unless you have a bug or a real
problem with it, I would advise against changing it if it's just to make
it more readable.

But again, don't take me too seriously on this. If you and Michal think you can
come up with something better, I'm all for it.





Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov
On 12/04/2013 02:38 AM, Glauber Costa wrote:
>> In memcg_update_kmem_limit() we do the whole process of limit
>> initialization under a mutex so the situation we need protection from in
>> tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
>> never cleared and never checked alone, only along with the 'active'
>> flag, that's why I doubt we need it at all.
> The updates are protected by a mutex, but the readers are not, and should not.
> So we can still be patching the readers, and the double-flag was
> initially used so
> we can make sure that both flags are only set after the static branches are 
> in.
>
> Note that one of the flags is set inside memcg_update_cache_sizes(). After 
> that,
> we call static_key_slow_inc(). At this point, someone whose code is
> patched in could
> start accounting, but it shouldn't - because not all sites are patched in.
>
> So after the update is done, we set the other flag, and now everybody
> will start going
> through.
>
> Could you do something clever with just one flag? Probably yes. But I
> doubt it would
> be that much cleaner, this is just the way that patching sites work.

Thank you for spending your time to listen to me.

Let me try to explain what is bothering me.

We have two state bits for each memcg, 'active' and 'activated'. There
are two scenarios where the bits can be modified:

1) The kmem limit is set on a memcg for the first time -
memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
which sets the 'activated' bit on success, then update static branching,
then set the 'active' bit. All three actions are done atomically in
respect to other tasks setting the limit due to the set_limit_mutex.
After both bits are set, they never get cleared for the memcg.

2) When a subgroup of a kmem-active cgroup is created -
memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
then increase static branching refcounter, then call
memcg_update_cache_sizes() for the new memcg, which may clear the
'activated' bit on failure. After successful execution, the state bits
never get cleared for the new memcg.

In scenario 2 there is no need to bother about the order in which the flags are set,
because we don't have any tasks in the cgroup yet - the tasks can be
moved in only after css_online finishes when we have both of the bits
set and the static branching enabled. Actually, we already do not bother
about it, because we have both bits set before the cgroup is fully
initialized (memcg_update_cache_sizes() is called).

Let's look at scenario 1. There we have two bits - 'activated' and
'active' - the latter is always set after the former and after the
static branching is enabled. Moreover, neither of the bits is cleared once
it's set. That means checking that both bits are set - I mean
memcg_can_account_kmem() - is equivalent to checking that the 'active' bit
is set. Next, the 'activated' bit is never checked alone, only along
with the 'active' bit in memcg_can_account_kmem() - I do not count the
(!memcg->kmem_account_flags) check in memcg_update_kmem_limit(), because
it is done under the set_limit_mutex. What's the point of having it then?

Thanks and sorry for disturbing you.


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Glauber Costa
>>> Hi, Glauber

Hi.

>
> In memcg_update_kmem_limit() we do the whole process of limit
> initialization under a mutex so the situation we need protection from in
> tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
> never cleared and never checked alone, only along with the 'active'
> flag, that's why I doubt we need it at all.

The updates are protected by a mutex, but the readers are not, and should not.
So we can still be patching the readers, and the double-flag was
initially used so
we can make sure that both flags are only set after the static branches are in.

Note that one of the flags is set inside memcg_update_cache_sizes(). After that,
we call static_key_slow_inc(). At this point, someone whose code is
patched in could
start accounting, but it shouldn't - because not all sites are patched in.

So after the update is done, we set the other flag, and now everybody
will start going
through.

Could you do something clever with just one flag? Probably yes. But I
doubt it would
be that much cleaner, this is just the way that patching sites work.

-- 
E Mare, Libertas


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 11:56 AM, Glauber Costa wrote:
> On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov
>  wrote:
>> On 12/02/2013 10:26 PM, Glauber Costa wrote:
>>> On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko  wrote:
 [CCing Glauber - please do so in other posts for kmem related changes]

 On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
> use static branches when code not in use") in order to guarantee that
> static_key_slow_inc(_kmem_enabled_key) will be called only once
> for each memory cgroup when its kmem limit is set. The point is that at
> that time the memcg_update_kmem_limit() function's workflow looked like
> this:
>
>	bool must_inc_static_branch = false;
>
>	cgroup_lock();
>	mutex_lock(&set_limit_mutex);
>	if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
>		/* The kmem limit is set for the first time */
>		ret = res_counter_set_limit(&memcg->kmem, val);
>
>		memcg_kmem_set_activated(memcg);
>		must_inc_static_branch = true;
>	} else
>		ret = res_counter_set_limit(&memcg->kmem, val);
>	mutex_unlock(&set_limit_mutex);
>	cgroup_unlock();
>
>	if (must_inc_static_branch) {
>		/* We can't do this under cgroup_lock */
>		static_key_slow_inc(&memcg_kmem_enabled_key);
>		memcg_kmem_set_active(memcg);
>	}
>
> Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
> static_key_slow_inc() is called under the set_limit_mutex, but the
> leftover from the above-mentioned commit is still here. Let's remove it.
OK, so I have looked there again at 692e89abd154b ("memcg: increment
static branch right after limit set"), which went in after cgroup_mutex
had been removed. It came along with the following comment.
	/*
	 * setting the active bit after the inc will guarantee no one
	 * starts accounting before all call sites are patched
	 */

 This suggests that the flag is needed after all because we have
to be sure that _all_ the places have been patched. AFAIU
 memcg_kmem_newpage_charge might see the static key already patched so
 it would do a charge but memcg_kmem_commit_charge would still see it
 unpatched and so the charge won't be committed.

 Or am I missing something?
>>> You are correct. This flag is there due to the way we are using static
>>> branches.
>>> The patching of one call site is atomic, but the patching of all of
>>> them are not.
>>> Therefore we need to use a two-flag scheme to guarantee that in the first
>>> time
>>> we turn the static branches on, there will be a clear point after
>>> which we're going
>>> to start accounting.
>>
>> Hi, Glauber
>>
>> Sorry, but I don't understand why we need two flags. Isn't checking the flag
>> set after all call sites have been patched (I mean KMEM_ACCOUNTED_ACTIVE)
>> enough?
> Take a look at net/ipv4/tcp_memcontrol.c. There are comprehensive comments 
> there
> for a mechanism that basically achieves the same thing. The idea is
> that one flag is used
> at all times and means "it is enabled". The second flag is a one-time-only
> flag to indicate
> that the patching process is complete. With one flag it seems to work,
> but it is racy.

AFAIU, the point of using two flags in tcp_update_limit() is that we set
the limit and update static branching lockless so the 'activated' flag
is needed there in order to make sure only one process will call
static_key_slow_inc() in case there are concurrent processes setting the
limit. The comment there confirms my assumption:

 * The activated bit is used to guarantee that no two writers
 * will do the update in the same memcg. Without that, we can't
 * properly shutdown the static key.
 */
if (!test_and_set_bit(MEMCG_SOCK_ACTIVATED, &cg_proto->flags))
	static_key_slow_inc(&memcg_socket_limit_enabled);
set_bit(MEMCG_SOCK_ACTIVE, &cg_proto->flags);

In memcg_update_kmem_limit() we do the whole process of limit
initialization under a mutex so the situation we need protection from in
tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
never cleared and never checked alone, only along with the 'active'
flag, that's why I doubt we need it at all.


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov
On 12/03/2013 11:56 AM, Glauber Costa wrote:
 On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov
 vdavy...@parallels.com wrote:
 On 12/02/2013 10:26 PM, Glauber Costa wrote:
 On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:
 [CCing Glauber - please do so in other posts for kmem related changes]

 On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
 The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
 use static branches when code not in use) in order to guarantee that
 static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
 for each memory cgroup when its kmem limit is set. The point is that at
 that time the memcg_update_kmem_limit() function's workflow looked like
 this:

        bool must_inc_static_branch = false;

        cgroup_lock();
        mutex_lock(&set_limit_mutex);
        if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
                /* The kmem limit is set for the first time */
                ret = res_counter_set_limit(&memcg->kmem, val);

                memcg_kmem_set_activated(memcg);
                must_inc_static_branch = true;
        } else
                ret = res_counter_set_limit(&memcg->kmem, val);
        mutex_unlock(&set_limit_mutex);
        cgroup_unlock();

        if (must_inc_static_branch) {
                /* We can't do this under cgroup_lock */
                static_key_slow_inc(&memcg_kmem_enabled_key);
                memcg_kmem_set_active(memcg);
        }

 Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
 static_key_slow_inc() is called under the set_limit_mutex, but the
 leftover from the above-mentioned commit is still here. Let's remove it.
 OK, so I have looked there again and 692e89abd154b (memcg: increment
 static branch right after limit set) which went in after cgroup_mutex
 has been removed. It came along with the following comment.
 /*
  * setting the active bit after the inc will guarantee no one
  * starts accounting before all call sites are patched
  */

 This suggests that the flag is needed after all because we have
 to be sure that _all_ the places have to be patched. AFAIU
 memcg_kmem_newpage_charge might see the static key already patched so
 it would do a charge but memcg_kmem_commit_charge would still see it
 unpatched and so the charge won't be committed.

 Or am I missing something?
 You are correct. This flag is there due to the way we are using static
 branches.
 The patching of one call site is atomic, but the patching of all of
 them are not.
 Therefore we need to use a two-flag scheme to guarantee that in the first
 time
 we turn the static branches on, there will be a clear point after
 which we're going
 to start accounting.

 Hi, Glauber

 Sorry, but I don't understand why we need two flags. Isn't checking the flag
 set after all call sites have been patched (I mean KMEM_ACCOUNTED_ACTIVE)
 not enough?
 Take a look at net/ipv4/tcp_memcontrol.c. There are comprehensive comments 
 there
 for a mechanism that basically achieves the same thing. The idea is
 that one flag is used
 at all times and means it is enabled. The second flags is a one time
 only flag to indicate
 that the patching process is complete. With one flag it seems to work,
 but it is racy.

AFAIU, the point of using two flags in tcp_update_limit() is that we set
the limit and update static branching lockless so the 'activated' flag
is needed there in order to make sure only one process will call
static_key_slow_inc() in case there are concurrent processes setting the
limit. The comment there confirms my assumption:

 * The activated bit is used to guarantee that no two writers
 * will do the update in the same memcg. Without that, we can't
 * properly shutdown the static key.
 */
if (!test_and_set_bit(MEMCG_SOCK_ACTIVATED, &cg_proto->flags))
        static_key_slow_inc(&memcg_socket_limit_enabled);
set_bit(MEMCG_SOCK_ACTIVE, &cg_proto->flags);

In memcg_update_kmem_limit() we do the whole process of limit
initialization under a mutex so the situation we need protection from in
tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
never cleared and never checked alone, only along with the 'active'
flag, that's why I doubt we need it at all.


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Glauber Costa
 Hi, Glauber

Hi.


 In memcg_update_kmem_limit() we do the whole process of limit
 initialization under a mutex so the situation we need protection from in
 tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
 never cleared and never checked alone, only along with the 'active'
 flag, that's why I doubt we need it at all.

The updates are protected by a mutex, but the readers are not, and should
not be. So we can still be patching the readers, and the double flag was
initially used so we can make sure that both flags are only set after the
static branches are in.

Note that one of the flags is set inside memcg_update_cache_sizes(). After
that, we call static_key_slow_inc(). At this point, someone whose code is
patched in could start accounting, but it shouldn't - because not all
sites are patched in.

So after the update is done, we set the other flag, and now everybody
will start going through.

Could you do something clever with just one flag? Probably yes. But I
doubt it would be that much cleaner; this is just the way that patching
sites work.

-- 
E Mare, Libertas


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-03 Thread Vladimir Davydov
On 12/04/2013 02:38 AM, Glauber Costa wrote:
 In memcg_update_kmem_limit() we do the whole process of limit
 initialization under a mutex so the situation we need protection from in
 tcp_update_limit() is impossible. BTW once set, the 'activated' flag is
 never cleared and never checked alone, only along with the 'active'
 flag, that's why I doubt we need it at all.
 The updates are protected by a mutex, but the readers are not, and should not.
 So we can still be patching the readers, and the double-flag was
 initially used so
 we can make sure that both flags are only set after the static branches are 
 in.

 Note that one of the flags is set inside memcg_update_cache_sizes(). After 
 that,
 we call static_key_slow_inc(). At this point, someone whose code is
 patched in could
 start accounting, but it shouldn't - because not all sites are patched in.

 So after the update is done, we set the other flag, and now everybody
 will start going
 through.

 Could you do something clever with just one flag? Probably yes. But I
 doubt it would
 be that much cleaner, this is just the way that patching sites work.

Thank you for spending your time to listen to me.

Let me try to explain what is bothering me.

We have two state bits for each memcg, 'active' and 'activated'. There
are two scenarios where the bits can be modified:

1) The kmem limit is set on a memcg for the first time -
memcg_update_kmem_limit(). Here we call memcg_update_cache_sizes(),
which sets the 'activated' bit on success, then update static branching,
then set the 'active' bit. All three actions are done atomically with
respect to other tasks setting the limit, thanks to the set_limit_mutex.
After both bits are set, they never get cleared for the memcg.

2) When a subgroup of a kmem-active cgroup is created -
memcg_propagate_kmem(). Here we copy kmem_account_flags from the parent,
then increase static branching refcounter, then call
memcg_update_cache_sizes() for the new memcg, which may clear the
'activated' bit on failure. After successful execution, the state bits
never get cleared for the new memcg.

In scenario 2 there is no need to bother about the order in which the flags are set,
because we don't have any tasks in the cgroup yet - the tasks can be
moved in only after css_online finishes when we have both of the bits
set and the static branching enabled. Actually, we already do not bother
about it, because we have both bits set before the cgroup is fully
initialized (memcg_update_cache_sizes() is called).

Let's look at scenario 1. There we have two bits - 'activated' and
'active' - the latter is always set after the former and after the
static branching is enabled. Moreover, neither of the bits is cleared
once it's set. That said, checking whether both bits are set - I mean
memcg_can_account_kmem() - is equivalent to checking whether the 'active'
bit is set. Next, the 'activated' bit is never checked alone, only along
with the 'active' bit in memcg_can_account_kmem() - I do not count the
(!memcg->kmem_account_flags) check in memcg_update_kmem_limit(), because
it is done under the set_limit_mutex. What's the point of having it then?

Thanks and sorry for disturbing you.


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Glauber Costa
On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov
 wrote:
> On 12/02/2013 10:26 PM, Glauber Costa wrote:
>>
>> On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko  wrote:
>>>
>>> [CCing Glauber - please do so in other posts for kmem related changes]
>>>
>>> On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:

 The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
 use static branches when code not in use") in order to guarantee that
 static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
 for each memory cgroup when its kmem limit is set. The point is that at
 that time the memcg_update_kmem_limit() function's workflow looked like
 this:

        bool must_inc_static_branch = false;

        cgroup_lock();
        mutex_lock(&set_limit_mutex);
        if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
                /* The kmem limit is set for the first time */
                ret = res_counter_set_limit(&memcg->kmem, val);

                memcg_kmem_set_activated(memcg);
                must_inc_static_branch = true;
        } else
                ret = res_counter_set_limit(&memcg->kmem, val);
        mutex_unlock(&set_limit_mutex);
        cgroup_unlock();

        if (must_inc_static_branch) {
                /* We can't do this under cgroup_lock */
                static_key_slow_inc(&memcg_kmem_enabled_key);
                memcg_kmem_set_active(memcg);
        }

 Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
 static_key_slow_inc() is called under the set_limit_mutex, but the
 leftover from the above-mentioned commit is still here. Let's remove it.
>>>
>>> OK, so I have looked there again and 692e89abd154b (memcg: increment
>>> static branch right after limit set) which went in after cgroup_mutex
>>> has been removed. It came along with the following comment.
>>>  /*
>>>   * setting the active bit after the inc will guarantee no one
>>>   * starts accounting before all call sites are patched
>>>   */
>>>
>>> This suggests that the flag is needed after all because we have
>>> to be sure that _all_ the places have to be patched. AFAIU
>>> memcg_kmem_newpage_charge might see the static key already patched so
>>> it would do a charge but memcg_kmem_commit_charge would still see it
>>> unpatched and so the charge won't be committed.
>>>
>>> Or am I missing something?
>>
>> You are correct. This flag is there due to the way we are using static
>> branches.
>> The patching of one call site is atomic, but the patching of all of
>> them are not.
>> Therefore we need to use a two-flag scheme to guarantee that in the first
>> time
>> we turn the static branches on, there will be a clear point after
>> which we're going
>> to start accounting.
>
>
> Hi, Glauber
>
> Sorry, but I don't understand why we need two flags. Isn't checking the flag
> set after all call sites have been patched (I mean KMEM_ACCOUNTED_ACTIVE)
> not enough?

Take a look at net/ipv4/tcp_memcontrol.c. There are comprehensive
comments there for a mechanism that basically achieves the same thing.
The idea is that one flag is used at all times and means "it is
enabled". The second flag is a one-time-only flag to indicate that the
patching process is complete. With one flag it seems to work, but it is
racy.

-- 
E Mare, Libertas


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov

On 12/02/2013 10:15 PM, Michal Hocko wrote:

[CCing Glauber - please do so in other posts for kmem related changes]

On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:

The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
use static branches when code not in use") in order to guarantee that
static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
for each memory cgroup when its kmem limit is set. The point is that at
that time the memcg_update_kmem_limit() function's workflow looked like
this:

        bool must_inc_static_branch = false;

        cgroup_lock();
        mutex_lock(&set_limit_mutex);
        if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
                /* The kmem limit is set for the first time */
                ret = res_counter_set_limit(&memcg->kmem, val);

                memcg_kmem_set_activated(memcg);
                must_inc_static_branch = true;
        } else
                ret = res_counter_set_limit(&memcg->kmem, val);
        mutex_unlock(&set_limit_mutex);
        cgroup_unlock();

        if (must_inc_static_branch) {
                /* We can't do this under cgroup_lock */
                static_key_slow_inc(&memcg_kmem_enabled_key);
                memcg_kmem_set_active(memcg);
        }

Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
static_key_slow_inc() is called under the set_limit_mutex, but the
leftover from the above-mentioned commit is still here. Let's remove it.

OK, so I have looked there again and 692e89abd154b (memcg: increment
static branch right after limit set) which went in after cgroup_mutex
has been removed. It came along with the following comment.
 /*
  * setting the active bit after the inc will guarantee no one
  * starts accounting before all call sites are patched
  */

This suggests that the flag is needed after all because we have
to be sure that _all_ the places have to be patched. AFAIU
memcg_kmem_newpage_charge might see the static key already patched so
it would do a charge but memcg_kmem_commit_charge would still see it
unpatched and so the charge won't be committed.


Hmm, but we have the KMEM_ACCOUNTED_ACTIVE flag, which is set *after* 
static_key_slow_inc() in memcg_update_kmem_limit(), and 
__memcg_kmem_newpage_charge() checks it before charging 
(memcg_can_account_kmem()). So the situation you described is impossible?




Or am I missing something?
  

Signed-off-by: Vladimir Davydov 
Cc: Johannes Weiner 
Cc: Michal Hocko 
Cc: Balbir Singh 
Cc: KAMEZAWA Hiroyuki 
---
  mm/memcontrol.c |   26 +-
  1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1a0ae6..f78661f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -344,14 +344,9 @@ static size_t memcg_size(void)
  /* internal only representation about the status of kmem accounting. */
  enum {
KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
-   KMEM_ACCOUNTED_ACTIVATED, /* static key enabled. */
KMEM_ACCOUNTED_DEAD, /* dead memcg with pending kmem charges */
  };
  
-/* We account when limit is on, but only after call sites are patched */

-#define KMEM_ACCOUNTED_MASK \
-   ((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
-
  #ifdef CONFIG_MEMCG_KMEM
  static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
  {
@@ -363,16 +358,6 @@ static bool memcg_kmem_is_active(struct mem_cgroup *memcg)
	return test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
  }
  
-static void memcg_kmem_set_activated(struct mem_cgroup *memcg)

-{
-   set_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
-}
-
-static void memcg_kmem_clear_activated(struct mem_cgroup *memcg)
-{
-   clear_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
-}
-
  static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
  {
/*
@@ -2956,7 +2941,7 @@ static DEFINE_MUTEX(set_limit_mutex);
  static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
  {
return !mem_cgroup_disabled() && !mem_cgroup_is_root(memcg) &&
-   (memcg->kmem_account_flags & KMEM_ACCOUNTED_MASK);
+   memcg_kmem_is_active(memcg);
  }
  
  /*

@@ -3091,19 +3076,10 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
if (num < 0)
return num;
-   /*
-* After this point, kmem_accounted (that we test atomically in
-* the beginning of this conditional), is no longer 0. This
-* guarantees only one process will set the following boolean
-* to true. We don't need test_and_set because we're protected
-* by the set_limit_mutex anyway.
-*/
-   memcg_kmem_set_activated(memcg);
  
  	ret = memcg_update_all_caches(num+1);

if (ret) {
ida_simple_remove(&kmem_limited_groups, num);
-   

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov

On 12/02/2013 10:26 PM, Glauber Costa wrote:

On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko  wrote:

[CCing Glauber - please do so in other posts for kmem related changes]

On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:

The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
use static branches when code not in use") in order to guarantee that
static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
for each memory cgroup when its kmem limit is set. The point is that at
that time the memcg_update_kmem_limit() function's workflow looked like
this:

   bool must_inc_static_branch = false;

   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }

Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
static_key_slow_inc() is called under the set_limit_mutex, but the
leftover from the above-mentioned commit is still here. Let's remove it.

OK, so I have looked there again and 692e89abd154b (memcg: increment
static branch right after limit set) which went in after cgroup_mutex
has been removed. It came along with the following comment.
 /*
  * setting the active bit after the inc will guarantee no one
  * starts accounting before all call sites are patched
  */

This suggests that the flag is needed after all because we have
to be sure that _all_ the places have to be patched. AFAIU
memcg_kmem_newpage_charge might see the static key already patched so
it would do a charge but memcg_kmem_commit_charge would still see it
unpatched and so the charge won't be committed.

Or am I missing something?

You are correct. This flag is there due to the way we are using static branches.
The patching of one call site is atomic, but the patching of all of
them are not.
Therefore we need to use a two-flag scheme to guarantee that in the first time
we turn the static branches on, there will be a clear point after
which we're going
to start accounting.


Hi, Glauber

Sorry, but I don't understand why we need two flags. Isn't checking the 
flag set after all call sites have been patched (I mean 
KMEM_ACCOUNTED_ACTIVE) not enough?



Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Glauber Costa
On Mon, Dec 2, 2013 at 10:51 PM, Michal Hocko  wrote:
> On Mon 02-12-13 22:26:48, Glauber Costa wrote:
>> On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko  wrote:
>> > [CCing Glauber - please do so in other posts for kmem related changes]
>> >
>> > On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
>> >> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
>> >> use static branches when code not in use") in order to guarantee that
>> >> static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
>> >> for each memory cgroup when its kmem limit is set. The point is that at
>> >> that time the memcg_update_kmem_limit() function's workflow looked like
>> >> this:
>> >>
>> >>   bool must_inc_static_branch = false;
>> >>
>> >>   cgroup_lock();
>> >>   mutex_lock(&set_limit_mutex);
>> >>   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
>> >>           /* The kmem limit is set for the first time */
>> >>           ret = res_counter_set_limit(&memcg->kmem, val);
>> >>
>> >>           memcg_kmem_set_activated(memcg);
>> >>           must_inc_static_branch = true;
>> >>   } else
>> >>           ret = res_counter_set_limit(&memcg->kmem, val);
>> >>   mutex_unlock(&set_limit_mutex);
>> >>   cgroup_unlock();
>> >>
>> >>   if (must_inc_static_branch) {
>> >>           /* We can't do this under cgroup_lock */
>> >>           static_key_slow_inc(&memcg_kmem_enabled_key);
>> >>           memcg_kmem_set_active(memcg);
>> >>   }
>> >>
>> >> Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
>> >> static_key_slow_inc() is called under the set_limit_mutex, but the
>> >> leftover from the above-mentioned commit is still here. Let's remove it.
>> >
>> > OK, so I have looked there again and 692e89abd154b (memcg: increment
>> > static branch right after limit set) which went in after cgroup_mutex
>> > has been removed. It came along with the following comment.
>> > /*
>> >  * setting the active bit after the inc will guarantee no one
>> >  * starts accounting before all call sites are patched
>> >  */
>> >
>> > This suggests that the flag is needed after all because we have
>> > to be sure that _all_ the places have to be patched. AFAIU
>> > memcg_kmem_newpage_charge might see the static key already patched so
>> > it would do a charge but memcg_kmem_commit_charge would still see it
>> > unpatched and so the charge won't be committed.
>> >
>> > Or am I missing something?
>>
>> You are correct. This flag is there due to the way we are using static 
>> branches.
>> The patching of one call site is atomic, but the patching of all of
>> them are not.
>> Therefore we need to use a two-flag scheme to guarantee that in the first 
>> time
>> we turn the static branches on, there will be a clear point after
>> which we're going
>> to start accounting.
>
> So http://lkml.org/lkml/2013/11/27/314 is correct then, right?

It looks correct.


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Michal Hocko
On Mon 02-12-13 22:26:48, Glauber Costa wrote:
> On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko  wrote:
> > [CCing Glauber - please do so in other posts for kmem related changes]
> >
> > On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
> >> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
> >> use static branches when code not in use") in order to guarantee that
> >> static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
> >> for each memory cgroup when its kmem limit is set. The point is that at
> >> that time the memcg_update_kmem_limit() function's workflow looked like
> >> this:
> >>
> >>   bool must_inc_static_branch = false;
> >>
> >>   cgroup_lock();
> >>   mutex_lock(&set_limit_mutex);
> >>   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
> >>           /* The kmem limit is set for the first time */
> >>           ret = res_counter_set_limit(&memcg->kmem, val);
> >>
> >>           memcg_kmem_set_activated(memcg);
> >>           must_inc_static_branch = true;
> >>   } else
> >>           ret = res_counter_set_limit(&memcg->kmem, val);
> >>   mutex_unlock(&set_limit_mutex);
> >>   cgroup_unlock();
> >>
> >>   if (must_inc_static_branch) {
> >>           /* We can't do this under cgroup_lock */
> >>           static_key_slow_inc(&memcg_kmem_enabled_key);
> >>           memcg_kmem_set_active(memcg);
> >>   }
> >>
> >> Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
> >> static_key_slow_inc() is called under the set_limit_mutex, but the
> >> leftover from the above-mentioned commit is still here. Let's remove it.
> >
> > OK, so I have looked there again and 692e89abd154b (memcg: increment
> > static branch right after limit set) which went in after cgroup_mutex
> > has been removed. It came along with the following comment.
> > /*
> >  * setting the active bit after the inc will guarantee no one
> >  * starts accounting before all call sites are patched
> >  */
> >
> > This suggests that the flag is needed after all because we have
> > to be sure that _all_ the places have to be patched. AFAIU
> > memcg_kmem_newpage_charge might see the static key already patched so
> > it would do a charge but memcg_kmem_commit_charge would still see it
> > unpatched and so the charge won't be committed.
> >
> > Or am I missing something?
> 
> You are correct. This flag is there due to the way we are using static 
> branches.
> The patching of one call site is atomic, but the patching of all of
> them are not.
> Therefore we need to use a two-flag scheme to guarantee that in the first time
> we turn the static branches on, there will be a clear point after
> which we're going
> to start accounting.

So http://lkml.org/lkml/2013/11/27/314 is correct then, right?
-- 
Michal Hocko
SUSE Labs


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Glauber Costa
On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko  wrote:
> [CCing Glauber - please do so in other posts for kmem related changes]
>
> On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
>> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
>> use static branches when code not in use") in order to guarantee that
>> static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
>> for each memory cgroup when its kmem limit is set. The point is that at
>> that time the memcg_update_kmem_limit() function's workflow looked like
>> this:
>>
>>   bool must_inc_static_branch = false;
>>
>>   cgroup_lock();
>>   mutex_lock(&set_limit_mutex);
>>   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
>>           /* The kmem limit is set for the first time */
>>           ret = res_counter_set_limit(&memcg->kmem, val);
>>
>>           memcg_kmem_set_activated(memcg);
>>           must_inc_static_branch = true;
>>   } else
>>           ret = res_counter_set_limit(&memcg->kmem, val);
>>   mutex_unlock(&set_limit_mutex);
>>   cgroup_unlock();
>>
>>   if (must_inc_static_branch) {
>>           /* We can't do this under cgroup_lock */
>>           static_key_slow_inc(&memcg_kmem_enabled_key);
>>           memcg_kmem_set_active(memcg);
>>   }
>>
>> Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
>> static_key_slow_inc() is called under the set_limit_mutex, but the
>> leftover from the above-mentioned commit is still here. Let's remove it.
>
> OK, so I have looked there again and 692e89abd154b (memcg: increment
> static branch right after limit set) which went in after cgroup_mutex
> has been removed. It came along with the following comment.
> /*
>  * setting the active bit after the inc will guarantee no one
>  * starts accounting before all call sites are patched
>  */
>
> This suggests that the flag is needed after all because we have
> to be sure that _all_ the places have to be patched. AFAIU
> memcg_kmem_newpage_charge might see the static key already patched so
> it would do a charge but memcg_kmem_commit_charge would still see it
> unpatched and so the charge won't be committed.
>
> Or am I missing something?

You are correct. This flag is there due to the way we are using static
branches. The patching of one call site is atomic, but the patching of
all of them is not. Therefore we need to use a two-flag scheme to
guarantee that the first time we turn the static branches on, there will
be a clear point after which we're going to start accounting.





-- 
E Mare, Libertas


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Michal Hocko
[CCing Glauber - please do so in other posts for kmem related changes]

On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
> The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b ("memcg:
> use static branches when code not in use") in order to guarantee that
> static_key_slow_inc(&memcg_kmem_enabled_key) will be called only once
> for each memory cgroup when its kmem limit is set. The point is that at
> that time the memcg_update_kmem_limit() function's workflow looked like
> this:
> 
>   bool must_inc_static_branch = false;
> 
>   cgroup_lock();
>   mutex_lock(&set_limit_mutex);
>   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
>           /* The kmem limit is set for the first time */
>           ret = res_counter_set_limit(&memcg->kmem, val);
> 
>           memcg_kmem_set_activated(memcg);
>           must_inc_static_branch = true;
>   } else
>           ret = res_counter_set_limit(&memcg->kmem, val);
>   mutex_unlock(&set_limit_mutex);
>   cgroup_unlock();
> 
>   if (must_inc_static_branch) {
>           /* We can't do this under cgroup_lock */
>           static_key_slow_inc(&memcg_kmem_enabled_key);
>           memcg_kmem_set_active(memcg);
>   }
> 
> Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
> static_key_slow_inc() is called under the set_limit_mutex, but the
> leftover from the above-mentioned commit is still here. Let's remove it.

OK, so I have looked there again and 692e89abd154b (memcg: increment
static branch right after limit set) which went in after cgroup_mutex
has been removed. It came along with the following comment.
/*
 * setting the active bit after the inc will guarantee no one
 * starts accounting before all call sites are patched
 */

This suggests that the flag is needed after all because we have
to be sure that _all_ the places have to be patched. AFAIU
memcg_kmem_newpage_charge might see the static key already patched so
it would do a charge but memcg_kmem_commit_charge would still see it
unpatched and so the charge won't be committed.

Or am I missing something?
 
> Signed-off-by: Vladimir Davydov 
> Cc: Johannes Weiner 
> Cc: Michal Hocko 
> Cc: Balbir Singh 
> Cc: KAMEZAWA Hiroyuki 
> ---
>  mm/memcontrol.c |   26 +-
>  1 file changed, 1 insertion(+), 25 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index f1a0ae6..f78661f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -344,14 +344,9 @@ static size_t memcg_size(void)
>  /* internal only representation about the status of kmem accounting. */
>  enum {
>   KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
> - KMEM_ACCOUNTED_ACTIVATED, /* static key enabled. */
>   KMEM_ACCOUNTED_DEAD, /* dead memcg with pending kmem charges */
>  };
>  
> -/* We account when limit is on, but only after call sites are patched */
> -#define KMEM_ACCOUNTED_MASK \
> - ((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
> -
>  #ifdef CONFIG_MEMCG_KMEM
>  static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
>  {
> @@ -363,16 +358,6 @@ static bool memcg_kmem_is_active(struct mem_cgroup *memcg)
>   return test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
>  }
>  
> -static void memcg_kmem_set_activated(struct mem_cgroup *memcg)
> -{
> - set_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
> -}
> -
> -static void memcg_kmem_clear_activated(struct mem_cgroup *memcg)
> -{
> - clear_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
> -}
> -
>  static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
>  {
>   /*
> @@ -2956,7 +2941,7 @@ static DEFINE_MUTEX(set_limit_mutex);
>  static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
>  {
>   return !mem_cgroup_disabled() && !mem_cgroup_is_root(memcg) &&
> - (memcg->kmem_account_flags & KMEM_ACCOUNTED_MASK);
> + memcg_kmem_is_active(memcg);
>  }
>  
>  /*
> @@ -3091,19 +3076,10 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
>   0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
>   if (num < 0)
>   return num;
> - /*
> -  * After this point, kmem_accounted (that we test atomically in
> -  * the beginning of this conditional), is no longer 0. This
> -  * guarantees only one process will set the following boolean
> -  * to true. We don't need test_and_set because we're protected
> -  * by the set_limit_mutex anyway.
> -  */
> - memcg_kmem_set_activated(memcg);
>  
>   ret = memcg_update_all_caches(num+1);
>   if (ret) {
>   ida_simple_remove(&kmem_limited_groups, num);
> - memcg_kmem_clear_activated(memcg);
>   return ret;
>   }
>  
> -- 
> 1.7.10.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe cgroups" in
> the body of a message to 

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Michal Hocko
[CCing Glauber - please do so in other posts for kmem related changes]

On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
 The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
 use static branches when code not in use) in order to guarantee that
 static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
 for each memory cgroup when its kmem limit is set. The point is that at
 that time the memcg_update_kmem_limit() function's workflow looked like
 this:
 
   bool must_inc_static_branch = false;
 
   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }
 
 Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
 static_key_slow_inc() is called under the set_limit_mutex, but the
 leftover from the above-mentioned commit is still here. Let's remove it.

OK, so I have looked there again, and at 692e89abd154b (memcg: increment
static branch right after limit set), which went in after the cgroup_mutex
removal. It came along with the following comment.
/*
 * setting the active bit after the inc will guarantee no one
 * starts accounting before all call sites are patched
 */

This suggests that the flag is needed after all, because we have
to be sure that _all_ the places have been patched. AFAIU
memcg_kmem_newpage_charge might see the static key already patched, so
it would do a charge, but memcg_kmem_commit_charge would still see it
unpatched, and so the charge wouldn't be committed.

Or am I missing something?
 
 Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
 Cc: Johannes Weiner <han...@cmpxchg.org>
 Cc: Michal Hocko <mho...@suse.cz>
 Cc: Balbir Singh <bsinghar...@gmail.com>
 Cc: KAMEZAWA Hiroyuki <kamezawa.hir...@jp.fujitsu.com>
 ---
  mm/memcontrol.c |   26 +-
  1 file changed, 1 insertion(+), 25 deletions(-)
 
 diff --git a/mm/memcontrol.c b/mm/memcontrol.c
 index f1a0ae6..f78661f 100644
 --- a/mm/memcontrol.c
 +++ b/mm/memcontrol.c
 @@ -344,14 +344,9 @@ static size_t memcg_size(void)
  /* internal only representation about the status of kmem accounting. */
  enum {
   KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
 - KMEM_ACCOUNTED_ACTIVATED, /* static key enabled. */
   KMEM_ACCOUNTED_DEAD, /* dead memcg with pending kmem charges */
  };
  
 -/* We account when limit is on, but only after call sites are patched */
 -#define KMEM_ACCOUNTED_MASK \
 - ((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
 -
  #ifdef CONFIG_MEMCG_KMEM
  static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
  {
 @@ -363,16 +358,6 @@ static bool memcg_kmem_is_active(struct mem_cgroup *memcg)
   return test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
  }
  
 -static void memcg_kmem_set_activated(struct mem_cgroup *memcg)
 -{
 - set_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
 -}
 -
 -static void memcg_kmem_clear_activated(struct mem_cgroup *memcg)
 -{
 - clear_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
 -}
 -
  static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
  {
   /*
 @@ -2956,7 +2941,7 @@ static DEFINE_MUTEX(set_limit_mutex);
  static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
  {
   return !mem_cgroup_disabled() && !mem_cgroup_is_root(memcg) &&
  - (memcg->kmem_account_flags & KMEM_ACCOUNTED_MASK);
 + memcg_kmem_is_active(memcg);
  }
  
  /*
 @@ -3091,19 +3076,10 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
   0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
   if (num < 0)
   return num;
 - /*
 -  * After this point, kmem_accounted (that we test atomically in
 -  * the beginning of this conditional), is no longer 0. This
 -  * guarantees only one process will set the following boolean
 -  * to true. We don't need test_and_set because we're protected
 -  * by the set_limit_mutex anyway.
 -  */
 - memcg_kmem_set_activated(memcg);
  
   ret = memcg_update_all_caches(num+1);
   if (ret) {
   ida_simple_remove(&kmem_limited_groups, num);
 - memcg_kmem_clear_activated(memcg);
   return ret;
   }
  
 -- 
 1.7.10.4
 

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Glauber Costa
On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:
 [CCing Glauber - please do so in other posts for kmem related changes]

 On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
 The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
 use static branches when code not in use) in order to guarantee that
 static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
 for each memory cgroup when its kmem limit is set. The point is that at
 that time the memcg_update_kmem_limit() function's workflow looked like
 this:

   bool must_inc_static_branch = false;

   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }

 Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
 static_key_slow_inc() is called under the set_limit_mutex, but the
 leftover from the above-mentioned commit is still here. Let's remove it.

 OK, so I have looked there again and 692e89abd154b (memcg: increment
 static branch right after limit set) which went in after cgroup_mutex
 has been removed. It came along with the following comment.
 /*
  * setting the active bit after the inc will guarantee no one
  * starts accounting before all call sites are patched
  */

 This suggests that the flag is needed after all because we have
 to be sure that _all_ the places have to be patched. AFAIU
 memcg_kmem_newpage_charge might see the static key already patched so
 it would do a charge but memcg_kmem_commit_charge would still see it
 unpatched and so the charge won't be committed.

 Or am I missing something?

You are correct. This flag is there due to the way we are using static
branches. The patching of one call site is atomic, but the patching of all
of them is not. Therefore we need a two-flag scheme to guarantee that, the
first time we turn the static branches on, there is a clear point after
which we start accounting.





-- 
E Mare, Libertas
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Michal Hocko
On Mon 02-12-13 22:26:48, Glauber Costa wrote:
 On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:
  [CCing Glauber - please do so in other posts for kmem related changes]
 
  On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
  The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
  use static branches when code not in use) in order to guarantee that
  static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
  for each memory cgroup when its kmem limit is set. The point is that at
  that time the memcg_update_kmem_limit() function's workflow looked like
  this:
 
bool must_inc_static_branch = false;
 
   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }
 
  Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
  static_key_slow_inc() is called under the set_limit_mutex, but the
  leftover from the above-mentioned commit is still here. Let's remove it.
 
  OK, so I have looked there again and 692e89abd154b (memcg: increment
  static branch right after limit set) which went in after cgroup_mutex
  has been removed. It came along with the following comment.
  /*
   * setting the active bit after the inc will guarantee no one
   * starts accounting before all call sites are patched
   */
 
  This suggests that the flag is needed after all because we have
  to be sure that _all_ the places have to be patched. AFAIU
  memcg_kmem_newpage_charge might see the static key already patched so
  it would do a charge but memcg_kmem_commit_charge would still see it
  unpatched and so the charge won't be committed.
 
  Or am I missing something?
 
 You are correct. This flag is there due to the way we are using static 
 branches.
 The patching of one call site is atomic, but the patching of all of
 them are not.
 Therefore we need to use a two-flag scheme to guarantee that in the first time
 we turn the static branches on, there will be a clear point after
 which we're going
 to start accounting.

So http://lkml.org/lkml/2013/11/27/314 is correct then, right?
-- 
Michal Hocko
SUSE Labs


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Glauber Costa
On Mon, Dec 2, 2013 at 10:51 PM, Michal Hocko mho...@suse.cz wrote:
 On Mon 02-12-13 22:26:48, Glauber Costa wrote:
 On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:
  [CCing Glauber - please do so in other posts for kmem related changes]
 
  On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:
  The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
  use static branches when code not in use) in order to guarantee that
  static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
  for each memory cgroup when its kmem limit is set. The point is that at
  that time the memcg_update_kmem_limit() function's workflow looked like
  this:
 
bool must_inc_static_branch = false;
 
   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }
 
  Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
  static_key_slow_inc() is called under the set_limit_mutex, but the
  leftover from the above-mentioned commit is still here. Let's remove it.
 
  OK, so I have looked there again and 692e89abd154b (memcg: increment
  static branch right after limit set) which went in after cgroup_mutex
  has been removed. It came along with the following comment.
  /*
   * setting the active bit after the inc will guarantee no one
   * starts accounting before all call sites are patched
   */
 
  This suggests that the flag is needed after all because we have
  to be sure that _all_ the places have to be patched. AFAIU
  memcg_kmem_newpage_charge might see the static key already patched so
  it would do a charge but memcg_kmem_commit_charge would still see it
  unpatched and so the charge won't be committed.
 
  Or am I missing something?

 You are correct. This flag is there due to the way we are using static 
 branches.
 The patching of one call site is atomic, but the patching of all of
 them are not.
 Therefore we need to use a two-flag scheme to guarantee that in the first 
 time
 we turn the static branches on, there will be a clear point after
 which we're going
 to start accounting.

 So http://lkml.org/lkml/2013/11/27/314 is correct then, right?

It looks correct.


Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov

On 12/02/2013 10:26 PM, Glauber Costa wrote:

On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:

[CCing Glauber - please do so in other posts for kmem related changes]

On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:

The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
use static branches when code not in use) in order to guarantee that
static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
for each memory cgroup when its kmem limit is set. The point is that at
that time the memcg_update_kmem_limit() function's workflow looked like
this:

   bool must_inc_static_branch = false;

   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }

Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
static_key_slow_inc() is called under the set_limit_mutex, but the
leftover from the above-mentioned commit is still here. Let's remove it.

OK, so I have looked there again and 692e89abd154b (memcg: increment
static branch right after limit set) which went in after cgroup_mutex
has been removed. It came along with the following comment.
 /*
  * setting the active bit after the inc will guarantee no one
  * starts accounting before all call sites are patched
  */

This suggests that the flag is needed after all because we have
to be sure that _all_ the places have to be patched. AFAIU
memcg_kmem_newpage_charge might see the static key already patched so
it would do a charge but memcg_kmem_commit_charge would still see it
unpatched and so the charge won't be committed.

Or am I missing something?

You are correct. This flag is there due to the way we are using static branches.
The patching of one call site is atomic, but the patching of all of
them are not.
Therefore we need to use a two-flag scheme to guarantee that in the first time
we turn the static branches on, there will be a clear point after
which we're going
to start accounting.


Hi, Glauber

Sorry, but I don't understand why we need two flags. Isn't checking the
flag that is set after all call sites have been patched (I mean
KMEM_ACCOUNTED_ACTIVE) enough by itself?



Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Vladimir Davydov

On 12/02/2013 10:15 PM, Michal Hocko wrote:

[CCing Glauber - please do so in other posts for kmem related changes]

On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:

The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
use static branches when code not in use) in order to guarantee that
static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
for each memory cgroup when its kmem limit is set. The point is that at
that time the memcg_update_kmem_limit() function's workflow looked like
this:

bool must_inc_static_branch = false;

   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }

Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
static_key_slow_inc() is called under the set_limit_mutex, but the
leftover from the above-mentioned commit is still here. Let's remove it.

OK, so I have looked there again and 692e89abd154b (memcg: increment
static branch right after limit set) which went in after cgroup_mutex
has been removed. It came along with the following comment.
 /*
  * setting the active bit after the inc will guarantee no one
  * starts accounting before all call sites are patched
  */

This suggests that the flag is needed after all because we have
to be sure that _all_ the places have to be patched. AFAIU
memcg_kmem_newpage_charge might see the static key already patched so
it would do a charge but memcg_kmem_commit_charge would still see it
unpatched and so the charge won't be committed.


Hmm, but we have the KMEM_ACCOUNTED_ACTIVE flag, which is set *after* 
static_key_slow_inc() in memcg_update_kmem_limit(), and 
__memcg_kmem_newpage_charge() checks it before charging 
(memcg_can_account_kmem()). So the situation you described is impossible?




Or am I missing something?
  

Signed-off-by: Vladimir Davydov <vdavy...@parallels.com>
Cc: Johannes Weiner <han...@cmpxchg.org>
Cc: Michal Hocko <mho...@suse.cz>
Cc: Balbir Singh <bsinghar...@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hir...@jp.fujitsu.com>
---
  mm/memcontrol.c |   26 +-
  1 file changed, 1 insertion(+), 25 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f1a0ae6..f78661f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -344,14 +344,9 @@ static size_t memcg_size(void)
  /* internal only representation about the status of kmem accounting. */
  enum {
KMEM_ACCOUNTED_ACTIVE = 0, /* accounted by this cgroup itself */
-   KMEM_ACCOUNTED_ACTIVATED, /* static key enabled. */
KMEM_ACCOUNTED_DEAD, /* dead memcg with pending kmem charges */
  };
  
-/* We account when limit is on, but only after call sites are patched */
-#define KMEM_ACCOUNTED_MASK \
-   ((1 << KMEM_ACCOUNTED_ACTIVE) | (1 << KMEM_ACCOUNTED_ACTIVATED))
-
  #ifdef CONFIG_MEMCG_KMEM
  static inline void memcg_kmem_set_active(struct mem_cgroup *memcg)
  {
@@ -363,16 +358,6 @@ static bool memcg_kmem_is_active(struct mem_cgroup *memcg)
	return test_bit(KMEM_ACCOUNTED_ACTIVE, &memcg->kmem_account_flags);
  }
  
-static void memcg_kmem_set_activated(struct mem_cgroup *memcg)
-{
-   set_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
-}
-
-static void memcg_kmem_clear_activated(struct mem_cgroup *memcg)
-{
-   clear_bit(KMEM_ACCOUNTED_ACTIVATED, &memcg->kmem_account_flags);
-}
-
  static void memcg_kmem_mark_dead(struct mem_cgroup *memcg)
  {
/*
@@ -2956,7 +2941,7 @@ static DEFINE_MUTEX(set_limit_mutex);
  static inline bool memcg_can_account_kmem(struct mem_cgroup *memcg)
  {
	return !mem_cgroup_disabled() && !mem_cgroup_is_root(memcg) &&
-   (memcg->kmem_account_flags & KMEM_ACCOUNTED_MASK);
+   memcg_kmem_is_active(memcg);
  }
  
  /*

@@ -3091,19 +3076,10 @@ int memcg_update_cache_sizes(struct mem_cgroup *memcg)
0, MEMCG_CACHES_MAX_SIZE, GFP_KERNEL);
	if (num < 0)
return num;
-   /*
-* After this point, kmem_accounted (that we test atomically in
-* the beginning of this conditional), is no longer 0. This
-* guarantees only one process will set the following boolean
-* to true. We don't need test_and_set because we're protected
-* by the set_limit_mutex anyway.
-*/
-   memcg_kmem_set_activated(memcg);
  
  	

Re: [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-02 Thread Glauber Costa
On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov
vdavy...@parallels.com wrote:
 On 12/02/2013 10:26 PM, Glauber Costa wrote:

 On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote:

 [CCing Glauber - please do so in other posts for kmem related changes]

 On Mon 02-12-13 17:08:13, Vladimir Davydov wrote:

 The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg:
 use static branches when code not in use) in order to guarantee that
 static_key_slow_inc(memcg_kmem_enabled_key) will be called only once
 for each memory cgroup when its kmem limit is set. The point is that at
 that time the memcg_update_kmem_limit() function's workflow looked like
 this:

bool must_inc_static_branch = false;

   cgroup_lock();
   mutex_lock(&set_limit_mutex);
   if (!memcg->kmem_account_flags && val != RESOURCE_MAX) {
           /* The kmem limit is set for the first time */
           ret = res_counter_set_limit(&memcg->kmem, val);

           memcg_kmem_set_activated(memcg);
           must_inc_static_branch = true;
   } else
           ret = res_counter_set_limit(&memcg->kmem, val);
   mutex_unlock(&set_limit_mutex);
   cgroup_unlock();

   if (must_inc_static_branch) {
           /* We can't do this under cgroup_lock */
           static_key_slow_inc(&memcg_kmem_enabled_key);
           memcg_kmem_set_active(memcg);
   }

 Today, we don't use cgroup_lock in memcg_update_kmem_limit(), and
 static_key_slow_inc() is called under the set_limit_mutex, but the
 leftover from the above-mentioned commit is still here. Let's remove it.

 OK, so I have looked there again and 692e89abd154b (memcg: increment
 static branch right after limit set) which went in after cgroup_mutex
 has been removed. It came along with the following comment.
  /*
   * setting the active bit after the inc will guarantee no one
   * starts accounting before all call sites are patched
   */

 This suggests that the flag is needed after all because we have
 to be sure that _all_ the places have to be patched. AFAIU
 memcg_kmem_newpage_charge might see the static key already patched so
 it would do a charge but memcg_kmem_commit_charge would still see it
 unpatched and so the charge won't be committed.

 Or am I missing something?

 You are correct. This flag is there due to the way we are using static
 branches.
 The patching of one call site is atomic, but the patching of all of
 them are not.
 Therefore we need to use a two-flag scheme to guarantee that in the first
 time
 we turn the static branches on, there will be a clear point after
 which we're going
 to start accounting.


 Hi, Glauber

 Sorry, but I don't understand why we need two flags. Isn't checking the flag
 set after all call sites have been patched (I mean KMEM_ACCOUNTED_ACTIVE)
 not enough?

Take a look at net/ipv4/tcp_memcontrol.c. There are comprehensive comments
there for a mechanism that basically achieves the same thing. The idea is
that one flag is used at all times and means accounting is enabled. The
second flag is a one-time-only flag indicating that the patching process is
complete. With one flag it seems to work, but it is racy.

-- 
E Mare, Libertas