Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Wed, Jun 11, 2014 at 04:11:17PM +0200, Michal Hocko wrote:
> > I still think it'd be less useful than "high", but as there seem to be
> > use cases which can be served with that and especially as a part of a
> > consistent control scheme, I have no objection.
> >
> > "low" definitely requires a notification mechanism tho.
>
> Would vmpressure notification be sufficient? That one is in place for
> any memcg which is reclaimed.

Yeah, as long as it can reliably notify userland that the soft
guarantee has been breached, it'd be great as it means we'd have a
single mechanism to monitor both "low" and "high" while "min" and
"max" are oom based, which BTW needs more work but that's a separate
piece of work.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Wed 11-06-14 08:31:09, Tejun Heo wrote:
> Hello, Michal.
>
> On Wed, Jun 11, 2014 at 09:57:29AM +0200, Michal Hocko wrote:
> > Is this the kind of symmetry Tejun is asking for and that would make
> > him change his Nack position? I am still not sure it satisfies his soft
>
> Yes, pretty much. What primarily bothered me was the soft/hard
> guarantees being chosen by a toggle switch while the soft/hard limits
> can be configured separately and combined.

The last consensus at LSF was that there would be a knob which will
distinguish hard/best effort behavior. The weaker semantic has strong
usecases IMHO so I wanted to start with it and add a knob for the hard
guarantee later when explicitly asked for.

Going with min, low, high and hard makes more sense to me of course.

> > guarantee objections from other email.
>
> I was wondering about the usefulness of "low" itself in isolation and

I think it has more usecases than "min" from simply practical POV. OOM
means a potential service down time and that is a no go. Optimistic
isolation on the other hand adds the advantages of isolation most of
the time while not falling completely flat on an exception (be it
misconfiguration or a corner case like mentioned during the discussion).

That doesn't mean "min" is not useful. It definitely is, the category of
usecases will be more specific though.

> I still think it'd be less useful than "high", but as there seem to be
> use cases which can be served with that and especially as a part of a
> consistent control scheme, I have no objection.
>
> "low" definitely requires a notification mechanism tho.

Would vmpressure notification be sufficient? That one is in place for
any memcg which is reclaimed.

Or are you thinking about something more like oom_control?
-- Michal Hocko SUSE Labs
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
Hello, Michal.

On Wed, Jun 11, 2014 at 09:57:29AM +0200, Michal Hocko wrote:
> Is this the kind of symmetry Tejun is asking for and that would make
> him change his Nack position? I am still not sure it satisfies his soft

Yes, pretty much. What primarily bothered me was the soft/hard
guarantees being chosen by a toggle switch while the soft/hard limits
can be configured separately and combined.

> guarantee objections from other email.

I was wondering about the usefulness of "low" itself in isolation and
I still think it'd be less useful than "high", but as there seem to be
use cases which can be served with that and especially as a part of a
consistent control scheme, I have no objection.

"low" definitely requires a notification mechanism tho.

Thanks.

-- 
tejun
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Tue 10-06-14 12:57:56, Johannes Weiner wrote: > On Mon, Jun 09, 2014 at 03:52:51PM -0700, Greg Thelen wrote: > > > > On Fri, Jun 06 2014, Michal Hocko wrote: > > > > > Some users (e.g. Google) would like to have stronger semantic than low > > > limit offers currently. The fallback mode is not desirable and they > > > prefer hitting OOM killer rather than ignoring low limit for protected > > > groups. There are other possible usecases which can benefit from hard > > > guarantees. I can imagine workloads where setting low_limit to the same > > > value as hard_limit to prevent from any reclaim at all makes a lot of > > > sense because reclaim is much more disrupting than restart of the load. > > > > > > This patch adds a new per memcg memory.reclaim_strategy knob which > > > tells what to do in a situation when memory reclaim cannot do any > > > progress because all groups in the reclaimed hierarchy are within their > > > low_limit. There are two options available: > > > - low_limit_best_effort - the current mode when reclaim falls > > > back to the even reclaim of all groups in the reclaimed > > > hierarchy > > > - low_limit_guarantee - groups within low_limit are never > > > reclaimed and OOM killer is triggered instead. OOM message > > > will mention the fact that the OOM was triggered due to > > > low_limit reclaim protection. > > > > To (a) be consistent with existing hard and soft limits APIs and (b) > > allow use of both best effort and guarantee memory limits, I wonder if > > it's best to offer three per memcg limits, rather than two limits (hard, > > low_limit) and a related reclaim_strategy knob. The three limits I'm > > thinking about are: > > > > 1) hard_limit (aka the existing limit_in_bytes cgroupfs file). No > >change needed here. This is an upper bound on a memcg hierarchy's > >memory consumption (assuming use_hierarchy=1). > > This creates internal pressure. Outside reclaim is not affected by > it, but internal charges can not exceed this limit. 
This is set to > hard limit the maximum memory consumption of a group (max). > > > 2) best_effort_limit (aka desired working set). This allow an > >application or administrator to provide a hint to the kernel about > >desired working set size. Before oom'ing the kernel is allowed to > >reclaim below this limit. I think the current soft_limit_in_bytes > >claims to provide this. If we prefer to deprecate > >soft_limit_in_bytes, then a new desired_working_set_in_bytes (or a > >hopefully better named) API seems reasonable. > > This controls how external pressure applies to the group. > > But it's conceivable that we'd like to have the equivalent of such a > soft limit for *internal* pressure. Set below the hard limit, this > internal soft limit would have charges trigger direct reclaim in the > memcg but allow them to continue to the hard limit. This would create > a situation wherein the allocating tasks are not killed, but throttled > under reclaim, which gives the administrator a window to detect the > situation with vmpressure and possibly intervene. Because as it > stands, once the current hard limit is hit things can go down pretty > fast and the window for reacting to vmpressure readings is often too > small. This would offer a more gradual deterioration. It would be > set to the upper end of the working set size range (high). > > I think for many users such an internal soft limit would actually be > preferred over the current hard limit, as they'd rather have some > reclaim throttling than an OOM kill when the group reaches its upper > bound. Yes, this sounds useful. We have already discussed that and the primary question is whether the high limit reclaim should be direct or background. There are some cons and pros for both. Direct one is much easier to implement but it is questionable whether it is too heavy. Background is much more tricky to implement on the other hand. 
The obvious advantage would be closer convergence to the global behavior while we still get the notification that something bad is going on. I assume that a dedicated workqueue would be doable but we would definitely need an evaluation of what happens with zillions of high_limit reclaimers. > The current hard limit would be reserved for more advanced or paid > cases, where the admin would rather see a memcg get OOM killed than > exceed a certain size. So the hard_limit will not change, right? Still reclaim and fallback to OOM if nothing can be reclaimed as we do currently. > Then, as you proposed, we'd have the soft limit for external pressure, > where the kernel only reclaims groups within that limit in order to > avoid OOM kills. It would be set to the estimated lower end of the > working set size range (low). OK, that is how the current low_limit is implemented. > > 3) low_limit_guarantee which is a lower bound of memory usage. A memcg > >would prefer to be oom killed rather t
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Tue, Jun 10 2014, Johannes Weiner wrote: > On Mon, Jun 09, 2014 at 03:52:51PM -0700, Greg Thelen wrote: >> >> On Fri, Jun 06 2014, Michal Hocko wrote: >> >> > Some users (e.g. Google) would like to have stronger semantic than low >> > limit offers currently. The fallback mode is not desirable and they >> > prefer hitting OOM killer rather than ignoring low limit for protected >> > groups. There are other possible usecases which can benefit from hard >> > guarantees. I can imagine workloads where setting low_limit to the same >> > value as hard_limit to prevent from any reclaim at all makes a lot of >> > sense because reclaim is much more disrupting than restart of the load. >> > >> > This patch adds a new per memcg memory.reclaim_strategy knob which >> > tells what to do in a situation when memory reclaim cannot do any >> > progress because all groups in the reclaimed hierarchy are within their >> > low_limit. There are two options available: >> >- low_limit_best_effort - the current mode when reclaim falls >> > back to the even reclaim of all groups in the reclaimed >> > hierarchy >> >- low_limit_guarantee - groups within low_limit are never >> > reclaimed and OOM killer is triggered instead. OOM message >> > will mention the fact that the OOM was triggered due to >> > low_limit reclaim protection. >> >> To (a) be consistent with existing hard and soft limits APIs and (b) >> allow use of both best effort and guarantee memory limits, I wonder if >> it's best to offer three per memcg limits, rather than two limits (hard, >> low_limit) and a related reclaim_strategy knob. The three limits I'm >> thinking about are: >> >> 1) hard_limit (aka the existing limit_in_bytes cgroupfs file). No >>change needed here. This is an upper bound on a memcg hierarchy's >>memory consumption (assuming use_hierarchy=1). > > This creates internal pressure. Outside reclaim is not affected by > it, but internal charges can not exceed this limit. 
This is set to > hard limit the maximum memory consumption of a group (max). > >> 2) best_effort_limit (aka desired working set). This allow an >>application or administrator to provide a hint to the kernel about >>desired working set size. Before oom'ing the kernel is allowed to >>reclaim below this limit. I think the current soft_limit_in_bytes >>claims to provide this. If we prefer to deprecate >>soft_limit_in_bytes, then a new desired_working_set_in_bytes (or a >>hopefully better named) API seems reasonable. > > This controls how external pressure applies to the group. > > But it's conceivable that we'd like to have the equivalent of such a > soft limit for *internal* pressure. Set below the hard limit, this > internal soft limit would have charges trigger direct reclaim in the > memcg but allow them to continue to the hard limit. This would create > a situation wherein the allocating tasks are not killed, but throttled > under reclaim, which gives the administrator a window to detect the > situation with vmpressure and possibly intervene. Because as it > stands, once the current hard limit is hit things can go down pretty > fast and the window for reacting to vmpressure readings is often too > small. This would offer a more gradual deterioration. It would be > set to the upper end of the working set size range (high). > > I think for many users such an internal soft limit would actually be > preferred over the current hard limit, as they'd rather have some > reclaim throttling than an OOM kill when the group reaches its upper > bound. The current hard limit would be reserved for more advanced or > paid cases, where the admin would rather see a memcg get OOM killed > than exceed a certain size. > > Then, as you proposed, we'd have the soft limit for external pressure, > where the kernel only reclaims groups within that limit in order to > avoid OOM kills. It would be set to the estimated lower end of the > working set size range (low). 
> >> 3) low_limit_guarantee which is a lower bound of memory usage. A memcg >>would prefer to be oom killed rather than operate below this >>threshold. Default value is zero to preserve compatibility with >>existing apps. > > And this would be the external pressure hard limit, which would be set > to the absolute minimum requirement of the group (min). > > Either because it would be hopelessly thrashing without it, or because > this guaranteed memory is actually paid for. Again, I would expect > many users to not even set this minimum guarantee but solely use the > external soft limit (low) instead. > >> Logically hard_limit >= best_effort_limit >= low_limit_guarantee. > > max >= high >= low >= min > > I think we should be able to express all desired usecases with these > four limits, including the advanced configurations, while making it > easy for many users to set up groups without being a) dead certain > about their memory consumption or b) prep
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Mon, Jun 09, 2014 at 03:52:51PM -0700, Greg Thelen wrote: > > On Fri, Jun 06 2014, Michal Hocko wrote: > > > Some users (e.g. Google) would like to have stronger semantic than low > > limit offers currently. The fallback mode is not desirable and they > > prefer hitting OOM killer rather than ignoring low limit for protected > > groups. There are other possible usecases which can benefit from hard > > guarantees. I can imagine workloads where setting low_limit to the same > > value as hard_limit to prevent from any reclaim at all makes a lot of > > sense because reclaim is much more disrupting than restart of the load. > > > > This patch adds a new per memcg memory.reclaim_strategy knob which > > tells what to do in a situation when memory reclaim cannot do any > > progress because all groups in the reclaimed hierarchy are within their > > low_limit. There are two options available: > > - low_limit_best_effort - the current mode when reclaim falls > > back to the even reclaim of all groups in the reclaimed > > hierarchy > > - low_limit_guarantee - groups within low_limit are never > > reclaimed and OOM killer is triggered instead. OOM message > > will mention the fact that the OOM was triggered due to > > low_limit reclaim protection. > > To (a) be consistent with existing hard and soft limits APIs and (b) > allow use of both best effort and guarantee memory limits, I wonder if > it's best to offer three per memcg limits, rather than two limits (hard, > low_limit) and a related reclaim_strategy knob. The three limits I'm > thinking about are: > > 1) hard_limit (aka the existing limit_in_bytes cgroupfs file). No >change needed here. This is an upper bound on a memcg hierarchy's >memory consumption (assuming use_hierarchy=1). This creates internal pressure. Outside reclaim is not affected by it, but internal charges can not exceed this limit. This is set to hard limit the maximum memory consumption of a group (max). 
> 2) best_effort_limit (aka desired working set). This allow an >application or administrator to provide a hint to the kernel about >desired working set size. Before oom'ing the kernel is allowed to >reclaim below this limit. I think the current soft_limit_in_bytes >claims to provide this. If we prefer to deprecate >soft_limit_in_bytes, then a new desired_working_set_in_bytes (or a >hopefully better named) API seems reasonable. This controls how external pressure applies to the group. But it's conceivable that we'd like to have the equivalent of such a soft limit for *internal* pressure. Set below the hard limit, this internal soft limit would have charges trigger direct reclaim in the memcg but allow them to continue to the hard limit. This would create a situation wherein the allocating tasks are not killed, but throttled under reclaim, which gives the administrator a window to detect the situation with vmpressure and possibly intervene. Because as it stands, once the current hard limit is hit things can go down pretty fast and the window for reacting to vmpressure readings is often too small. This would offer a more gradual deterioration. It would be set to the upper end of the working set size range (high). I think for many users such an internal soft limit would actually be preferred over the current hard limit, as they'd rather have some reclaim throttling than an OOM kill when the group reaches its upper bound. The current hard limit would be reserved for more advanced or paid cases, where the admin would rather see a memcg get OOM killed than exceed a certain size. Then, as you proposed, we'd have the soft limit for external pressure, where the kernel only reclaims groups within that limit in order to avoid OOM kills. It would be set to the estimated lower end of the working set size range (low). > 3) low_limit_guarantee which is a lower bound of memory usage. A memcg >would prefer to be oom killed rather than operate below this >threshold. 
Default value is zero to preserve compatibility with >existing apps. And this would be the external pressure hard limit, which would be set to the absolute minimum requirement of the group (min). Either because it would be hopelessly thrashing without it, or because this guaranteed memory is actually paid for. Again, I would expect many users to not even set this minimum guarantee but solely use the external soft limit (low) instead. > Logically hard_limit >= best_effort_limit >= low_limit_guarantee. max >= high >= low >= min I think we should be able to express all desired usecases with these four limits, including the advanced configurations, while making it easy for many users to set up groups without being a) dead certain about their memory consumption or b) prepared for frequent OOM kills, while still allowing them to properly utilize their machines. What do you think?
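[Editor's illustration] The four thresholds proposed in the message above (min, low, high, max) and their ordering can be modelled in a few lines of user-space C. This is only a sketch: the struct, the zone names, and the exact boundary handling (">" versus ">=") are assumptions made for illustration and do not come from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the four proposed per-memcg thresholds, in bytes. */
struct memcg_limits {
    uint64_t min;  /* hard guarantee: external reclaim never goes below this */
    uint64_t low;  /* soft guarantee: reclaimed below this only to avoid OOM */
    uint64_t high; /* soft limit: charges above this are throttled by reclaim */
    uint64_t max;  /* hard limit: internal charges can not exceed this */
};

/* True when the ordering max >= high >= low >= min holds. */
static bool memcg_limits_valid(const struct memcg_limits *l)
{
    return l->max >= l->high && l->high >= l->low && l->low >= l->min;
}

/* Illustrative zones a group's usage can fall into (names are assumptions). */
enum memcg_zone {
    ZONE_GUARANTEED, /* usage <= min */
    ZONE_PROTECTED,  /* min < usage <= low */
    ZONE_NORMAL,     /* low < usage <= high */
    ZONE_THROTTLED,  /* high < usage < max */
    ZONE_AT_MAX      /* usage >= max */
};

/* Classify a usage value against a valid set of limits. */
static enum memcg_zone memcg_classify(const struct memcg_limits *l, uint64_t usage)
{
    if (usage >= l->max)
        return ZONE_AT_MAX;
    if (usage > l->high)
        return ZONE_THROTTLED;
    if (usage > l->low)
        return ZONE_NORMAL;
    if (usage > l->min)
        return ZONE_PROTECTED;
    return ZONE_GUARANTEED;
}
```

In this model only ZONE_GUARANTEED and ZONE_AT_MAX would ever lead to an OOM kill, while the two soft thresholds in between only change how reclaim pressure is applied.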
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Fri, Jun 06 2014, Michal Hocko wrote:

> Some users (e.g. Google) would like to have stronger semantic than low
> limit offers currently. The fallback mode is not desirable and they
> prefer hitting OOM killer rather than ignoring low limit for protected
> groups. There are other possible usecases which can benefit from hard
> guarantees. I can imagine workloads where setting low_limit to the same
> value as hard_limit to prevent from any reclaim at all makes a lot of
> sense because reclaim is much more disrupting than restart of the load.
>
> This patch adds a new per memcg memory.reclaim_strategy knob which
> tells what to do in a situation when memory reclaim cannot do any
> progress because all groups in the reclaimed hierarchy are within their
> low_limit. There are two options available:
>	- low_limit_best_effort - the current mode when reclaim falls
>	  back to the even reclaim of all groups in the reclaimed
>	  hierarchy
>	- low_limit_guarantee - groups within low_limit are never
>	  reclaimed and OOM killer is triggered instead. OOM message
>	  will mention the fact that the OOM was triggered due to
>	  low_limit reclaim protection.

To (a) be consistent with existing hard and soft limits APIs and (b)
allow use of both best effort and guarantee memory limits, I wonder if
it's best to offer three per memcg limits, rather than two limits (hard,
low_limit) and a related reclaim_strategy knob. The three limits I'm
thinking about are:

1) hard_limit (aka the existing limit_in_bytes cgroupfs file). No
   change needed here. This is an upper bound on a memcg hierarchy's
   memory consumption (assuming use_hierarchy=1).

2) best_effort_limit (aka desired working set). This allows an
   application or administrator to provide a hint to the kernel about
   desired working set size. Before oom'ing the kernel is allowed to
   reclaim below this limit. I think the current soft_limit_in_bytes
   claims to provide this. If we prefer to deprecate
   soft_limit_in_bytes, then a new desired_working_set_in_bytes (or a
   hopefully better named) API seems reasonable.

3) low_limit_guarantee which is a lower bound of memory usage. A memcg
   would prefer to be oom killed rather than operate below this
   threshold. Default value is zero to preserve compatibility with
   existing apps.

Logically hard_limit >= best_effort_limit >= low_limit_guarantee.
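[Editor's illustration] The two modes described in the quoted patch text differ only in what happens once every group in the reclaimed hierarchy is within its low_limit. A toy user-space model of that decision, assuming "within" means usage <= low_limit, might look as follows. The struct, enum and function names are hypothetical and this is not the kernel's reclaim code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one memcg in the reclaimed hierarchy, in bytes. */
struct group {
    uint64_t usage;
    uint64_t low_limit;
};

enum reclaim_outcome {
    RECLAIM_UNPROTECTED, /* some group is above its low_limit: reclaim there */
    RECLAIM_EVERYONE,    /* best effort: fall back to reclaiming all groups */
    INVOKE_OOM           /* guarantee: protected groups stay untouched */
};

/*
 * Decide what reclaim does for a hierarchy of n groups.
 * hard_guarantee selects low_limit_guarantee over low_limit_best_effort.
 */
static enum reclaim_outcome pick_outcome(const struct group *g, size_t n,
                                         bool hard_guarantee)
{
    for (size_t i = 0; i < n; i++) {
        if (g[i].usage > g[i].low_limit)
            return RECLAIM_UNPROTECTED;
    }
    /* Every group is within its low_limit, so the strategy decides. */
    return hard_guarantee ? INVOKE_OOM : RECLAIM_EVERYONE;
}
```

With separate limits instead of a single knob, as suggested in the reply above, the same hierarchy could mix both behaviors per group rather than selecting one mode for all of them.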
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
Hello,

On Mon, Jun 09, 2014 at 10:30:42AM +0200, Michal Hocko wrote:
> On Fri 06-06-14 11:29:14, Tejun Heo wrote:
> > Why is this necessary?
>
> It allows user/admin to set the default behavior.

By recompiling the kernel for something which can be trivially
configured post-boot without any difference? The only thing it'll
achieve is confusing the hell out of people why different kernels show
different behaviors without any userland differences while taxing the
already constrained kernel configuration process more for no gain
whatsoever.

> How do you propose to tell the default then? Only at the runtime?
> I really do not insist on the kconfig. I find it useful for a)
> documentation purpose b) easy way to configure the default.

Please don't ever add Kconfig options like this. This is utterly
unnecessary and idiotic. You don't add completely redundant Kconfig
option for documentation purposes.

> > * Are you sure soft and hard guarantees aren't useful when used in
> >   combination? If so, why would that be the case?
>
> This was a call from Google to have per-memcg setup AFAIR. Using
> different reclaim protection on the global case vs. limit reclaim makes
> a lot of sense to me. If this is a major obstacle then I am OK to drop
> it and only have a global setting for now.

Isn't it obvious that what needs to be investigated is why we're
trying to add an interface which is completely different for
guarantees as compared to limits? Why wouldn't they have a symmetric
interface in the reverse direction as soft/hard limits? If not, where
does the asymmetry come from? These are the *first* questions which
should come to anyone's mind when [s]he is trying to add configs for a
different type of thresholds and something which must be explicitly
laid out as rationales for the design choices.

> > * We have pressure monitoring interface which can be used for soft
> >   limit pressure monitoring.
>
> Which one is that? I only know about oom_control triggered by the hard
> limit pressure.

Weren't you guys planning to use vmpressure notification to find out
about softlimit breach conditions?

> > How should breaching soft guarantee be
> > factored into that? There doesn't seem to be any way of notifying
> > that at the moment? Wouldn't we want that to be integrated into the
> > same mechanism?
>
> Yes, there is. We have a counter in memory.stat file which tells how
> many times the limit has been breached.

How does the userland find out? By polling the file every frigging
second? Note that there actually is an actual asymmetry here which
makes breaching soft guarantee a much more significant event than
breaching soft limit - the former is violation of the configured
objective, the latter is not. You *need* a way to notify the event.

> > What scares me the most is that you don't even seem to have noticed
> > the asymmetry and are proposing userland-facing interface without
> > actually thinking things through. This is exactly how we've been
> > getting into trouble.
>
> This has been discussed up and down for the last _two_ years. I have
> considered other options how to provide a very _useful_ feature users
> are calling for. There is even general consensus among developers that

AFAIR, there hasn't been much discussion about the details of the
interface and the proposed one is almost laughable. How is this
acceptable as a userland visible API that we need to maintain for the
future? It's broken on delivery.

> the feature is desirable and that the two modes (soft/hard) memory
> protection are needed. Yet I would _really_ like to hear any
> suggestion to get unstuck. It is far from useful to come and Nack this
> _again_ without providing any alternative suggestions.

I've pointed out two major points where the proposed interface is
evidently deficient and told you why they're so and it's not like the
said deficiencies are anything subtle. If you can't figure out what
to do next from there on, I don't think I can help you.

Thanks.

-- 
tejun
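[Editor's illustration] For readers unfamiliar with the vmpressure interface mentioned above: in cgroup v1, userland registers an eventfd by writing "<event_fd> <fd of memory.pressure_level> <level>" to cgroup.event_control, then blocks on the eventfd instead of polling a counter. The sketch below shows that registration. The helper names and the struct are hypothetical, the cgroup directory is whatever path the caller passes in, and nothing here implies that vmpressure reports low-limit breaches specifically, which is the open question in this thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* File descriptors the caller keeps open while watching (hypothetical type). */
struct pressure_watch {
    int event_fd;    /* becomes readable when a pressure event fires */
    int pressure_fd; /* open handle on memory.pressure_level */
};

/* Build the "<event_fd> <pressure_fd> <level>" line; -1 if it does not fit. */
static int format_pressure_registration(char *buf, size_t len, int event_fd,
                                        int pressure_fd, const char *level)
{
    int n = snprintf(buf, len, "%d %d %s", event_fd, pressure_fd, level);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Register for "low", "medium" or "critical" events; 0 on success, -1 on error. */
static int watch_pressure(const char *cgroup_dir, const char *level,
                          struct pressure_watch *w)
{
    char path[4096];
    char line[64];
    int control_fd = -1;
    int n;

    w->pressure_fd = -1;
    w->event_fd = eventfd(0, 0);
    if (w->event_fd < 0)
        return -1;

    n = snprintf(path, sizeof(path), "%s/memory.pressure_level", cgroup_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        goto fail;
    w->pressure_fd = open(path, O_RDONLY);
    if (w->pressure_fd < 0)
        goto fail;

    n = snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        goto fail;
    control_fd = open(path, O_WRONLY);
    if (control_fd < 0)
        goto fail;

    if (format_pressure_registration(line, sizeof(line), w->event_fd,
                                     w->pressure_fd, level) < 0)
        goto fail;
    if (write(control_fd, line, strlen(line)) < 0)
        goto fail;

    close(control_fd);
    return 0;

fail:
    if (control_fd >= 0)
        close(control_fd);
    if (w->pressure_fd >= 0)
        close(w->pressure_fd);
    close(w->event_fd);
    w->event_fd = w->pressure_fd = -1;
    return -1;
}
```

After a successful call, a monitor would block in read() on event_fd with an 8-byte buffer and wake up once per notification, which is the kind of event delivery the reply above asks for in place of polling memory.stat.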
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
On Fri 06-06-14 11:29:14, Tejun Heo wrote:
> Hello, Michal.
>
> On Fri, Jun 06, 2014 at 04:46:50PM +0200, Michal Hocko wrote:
> > +choice
> > +	prompt "Memory Resource Controller reclaim protection"
> > +	depends on MEMCG
> > +	help
>
> Why is this necessary?

It allows user/admin to set the default behavior.

> - This doesn't affect boot.
>
> - memcg requires runtime config *anyway*.
>
> - The config is inherited from the parent, so the default flipping
>   isn't exactly difficult.
>
> Please drop the kconfig option.

How do you propose to tell the default then? Only at the runtime?
I really do not insist on the kconfig. I find it useful for a)
documentation purpose b) easy way to configure the default.

> > +static int mem_cgroup_write_reclaim_strategy(struct cgroup_subsys_state *css, struct cftype *cft,
> > +				char *buffer)
> > +{
> > +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> > +	int ret = 0;
> > +
> > +	if (!strncmp(buffer, "low_limit_guarantee",
> > +				sizeof("low_limit_guarantee"))) {
> > +		memcg->hard_low_limit = true;
> > +	} else if (!strncmp(buffer, "low_limit_best_effort",
> > +				sizeof("low_limit_best_effort"))) {
> > +		memcg->hard_low_limit = false;
> > +	} else
> > +		ret = -EINVAL;
> > +
> > +	return ret;
> > +}
>
> So, ummm, this raises a big red flag for me. You're now implementing
> two behaviors in a mostly symmetric manner to soft/hard limits but
> choosing a completely different scheme in how they're configured
> without any rationale.

So what is your suggestion then? Using a global setting? Using a
separate knob? Something completely different?

> * Are you sure soft and hard guarantees aren't useful when used in
>   combination? If so, why would that be the case?

This was a call from Google to have per-memcg setup AFAIR. Using
different reclaim protection on the global case vs. limit reclaim makes
a lot of sense to me. If this is a major obstacle then I am OK to drop
it and only have a global setting for now.

> * We have pressure monitoring interface which can be used for soft
>   limit pressure monitoring.

Which one is that? I only know about oom_control triggered by the hard
limit pressure.

>   How should breaching soft guarantee be
>   factored into that? There doesn't seem to be any way of notifying
>   that at the moment? Wouldn't we want that to be integrated into the
>   same mechanism?

Yes, there is. We have a counter in memory.stat file which tells how
many times the limit has been breached.

> What scares me the most is that you don't even seem to have noticed
> the asymmetry and are proposing userland-facing interface without
> actually thinking things through. This is exactly how we've been
> getting into trouble.

This has been discussed up and down for the last _two_ years. I have
considered other options how to provide a very _useful_ feature users
are calling for. There is even general consensus among developers that
the feature is desirable and that the two modes (soft/hard) memory
protection are needed. Yet I would _really_ like to hear any
suggestion to get unstuck. It is far from useful to come and Nack this
_again_ without providing any alternative suggestions.

> For now, for everything.
>
> Nacked-by: Tejun Heo
>
> Thanks.
>
> --
> tejun

-- 
Michal Hocko
SUSE Labs
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
A bit of addition.

Let's *please* think through how memcg should be configured and
different knobs / limits interact with each other and come up with a
consistent scheme before adding more shits on top. This "oh I know
this use case and maybe that behavior is necessary too, let's add N
different and incompatible ways to mix and match them" doesn't fly.
Aren't we supposed to at least have learned that already?

-- 
tejun
Re: [PATCH 2/2] memcg: Allow hard guarantee mode for low limit reclaim
Hello, Michal.

On Fri, Jun 06, 2014 at 04:46:50PM +0200, Michal Hocko wrote:
> +choice
> +	prompt "Memory Resource Controller reclaim protection"
> +	depends on MEMCG
> +	help

Why is this necessary?

- This doesn't affect boot.

- memcg requires runtime config *anyway*.

- The config is inherited from the parent, so the default flipping
  isn't exactly difficult.

Please drop the kconfig option.

> +static int mem_cgroup_write_reclaim_strategy(struct cgroup_subsys_state *css, struct cftype *cft,
> +				char *buffer)
> +{
> +	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
> +	int ret = 0;
> +
> +	if (!strncmp(buffer, "low_limit_guarantee",
> +				sizeof("low_limit_guarantee"))) {
> +		memcg->hard_low_limit = true;
> +	} else if (!strncmp(buffer, "low_limit_best_effort",
> +				sizeof("low_limit_best_effort"))) {
> +		memcg->hard_low_limit = false;
> +	} else
> +		ret = -EINVAL;
> +
> +	return ret;
> +}

So, ummm, this raises a big red flag for me. You're now implementing
two behaviors in a mostly symmetric manner to soft/hard limits but
choosing a completely different scheme in how they're configured
without any rationale.

* Are you sure soft and hard guarantees aren't useful when used in
  combination? If so, why would that be the case?

* We have pressure monitoring interface which can be used for soft
  limit pressure monitoring. How should breaching soft guarantee be
  factored into that? There doesn't seem to be any way of notifying
  that at the moment? Wouldn't we want that to be integrated into the
  same mechanism?

What scares me the most is that you don't even seem to have noticed
the asymmetry and are proposing userland-facing interface without
actually thinking things through. This is exactly how we've been
getting into trouble.

For now, for everything.

 Nacked-by: Tejun Heo

Thanks.

-- 
tejun
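[Editor's illustration] The write handler quoted in the message above maps two keywords onto a boolean. A user-space analogue of that parser is sketched below, mainly to make the accepted inputs concrete. Unlike the quoted kernel code it explicitly tolerates one trailing newline, as produced by a plain `echo`; whether the kernel buffer still contains that newline at this point depends on the cgroup core and is not established in the thread. The function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* True if input equals keyword, optionally followed by a single '\n'. */
static bool keyword_eq(const char *input, const char *keyword)
{
    size_t len = strlen(keyword);

    if (strncmp(input, keyword, len) != 0)
        return false;
    input += len;
    if (*input == '\n')
        input++;
    return *input == '\0';
}

/*
 * Parse a reclaim strategy string into the hard_low_limit flag.
 * Returns 0 on success and -EINVAL for anything else, mirroring the
 * return convention of the quoted patch.
 */
static int parse_reclaim_strategy(const char *input, bool *hard_low_limit)
{
    if (keyword_eq(input, "low_limit_guarantee")) {
        *hard_low_limit = true;
        return 0;
    }
    if (keyword_eq(input, "low_limit_best_effort")) {
        *hard_low_limit = false;
        return 0;
    }
    return -EINVAL;
}
```

Inside the kernel, sysfs_streq() offers a comparable newline-tolerant comparison. The review comments above are about the shape of the interface rather than the parsing, so this sketch does not address them.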