Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
On 30-06-17, 06:53, Dominik Brodowski wrote:
> On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote:
> > On 29-06-17, 20:01, Dominik Brodowski wrote:
> > > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote:
> > > > The cpufreq core and governors aren't supposed to set a limit on how
> > > > fast we want to try changing the frequency. This is currently done for
> > > > the legacy governors with help of min_sampling_rate.
> > > >
> > > > At worst, we may end up setting the sampling rate to a value lower than
> > > > the rate at which frequency can be changed and then one of the CPUs in
> > > > the policy will be only changing frequency for ever.
> > >
> > > Is it safe to issue requests to change the CPU frequency so frequently,
> >
> > Well, I assumed so. I am not sure the hardware would break though.
> > Overheating ?
> >
> > > even
> > > on historic hardware such as speedstep-{ich,smi,centrino}? In the past,

speedstep-smi is the only one which sets transition_latency to
CPUFREQ_ETERNAL and the others are putting some meaningful values. So
yes, they should be doing DVFS dynamically.

> > > these checks more or less disallowed the running of dynamic frequency
> > > scaling at least on speedstep-smi[*],
> >
> > We must by doing dynamic freq scaling even without this patch. I don't
> > see why you say the above then.
> >
> > All we do here is that we get rid of the limit on how soon we can
> > change the freq again.
>
> Well, as I understand it, first generation "speedstep" was designed more or
> less to switch frequencies only when AC power was lost or restored.
>
> The Linux implementation merely said: "no on-the-fly changes", but switch
> frequencies whenever a user explicitly requested such a change (presumably
> only every once in an unspecified while).
>
> This same reasoning may be present in other drivers using CPUFREQ_ETERNAL.

Thanks for the explanation here and I am convinced that this series has
at least done one thing wrong. And that is removal of
max_transition_latency from governors and allowing ondemand to run on
such platforms (which may end up breaking them).

So I will actually modify that patch and set max_transition_latency to
CPUFREQ_ETERNAL for ondemand/conservative instead of 10ms. Also we
should do the same for schedutil as well, so that will also use the
max_transition_latency field.

But I hope, this patch will still be fine. Right ?

> I am not *sure* either, I am just worried of the consequences of doing
> things out-of-spec...

Thanks for your inputs Dominik.

-- 
viresh
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
On Fri, Jun 30, 2017 at 09:04:25AM +0530, Viresh Kumar wrote: > On 29-06-17, 20:01, Dominik Brodowski wrote: > > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote: > > > The cpufreq core and governors aren't supposed to set a limit on how > > > fast we want to try changing the frequency. This is currently done for > > > the legacy governors with help of min_sampling_rate. > > > > > > At worst, we may end up setting the sampling rate to a value lower than > > > the rate at which frequency can be changed and then one of the CPUs in > > > the policy will be only changing frequency for ever. > > > > Is it safe to issue requests to change the CPU frequency so frequently, > > Well, I assumed so. I am not sure the hardware would break though. > Overheating ? > > > even > > on historic hardware such as speedstep-{ich,smi,centrino}? In the past, > > these checks more or less disallowed the running of dynamic frequency > > scaling at least on speedstep-smi[*], > > We must by doing dynamic freq scaling even without this patch. I don't > see why you say the above then. > > All we do here is that we get rid of the limit on how soon we can > change the freq again. Well, as I understand it, first generation "speedstep" was designed more or less to switch frequencies only when AC power was lost or restored. The Linux implementation merely said: "no on-the-fly changes", but switch frequencies whenever a user explicitly requested such a change (presumably only every once in an unspecified while). This same reasoning may be present in other drivers using CPUFREQ_ETERNAL. > > but maybe on a few other platforms as > > well. That's why I am curious on whether this may break systems potentially > > on a hardware level if the hardware was not designed to do dynamic frequency > > scaling (and not just frequency switches on battery/AC). > > Honestly I am not sure if any hardware can break or not, just because > of this commit. 
I am not *sure* either, I am just worried of the consequences of doing things out-of-spec... Best Dominik
Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
On 29-06-17, 20:01, Dominik Brodowski wrote: > On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote: > > The cpufreq core and governors aren't supposed to set a limit on how > > fast we want to try changing the frequency. This is currently done for > > the legacy governors with help of min_sampling_rate. > > > > At worst, we may end up setting the sampling rate to a value lower than > > the rate at which frequency can be changed and then one of the CPUs in > > the policy will be only changing frequency for ever. > > Is it safe to issue requests to change the CPU frequency so frequently, Well, I assumed so. I am not sure the hardware would break though. Overheating ? > even > on historic hardware such as speedstep-{ich,smi,centrino}? In the past, > these checks more or less disallowed the running of dynamic frequency > scaling at least on speedstep-smi[*], We must by doing dynamic freq scaling even without this patch. I don't see why you say the above then. All we do here is that we get rid of the limit on how soon we can change the freq again. > but maybe on a few other platforms as > well. That's why I am curious on whether this may break systems potentially > on a hardware level if the hardware was not designed to do dynamic frequency > scaling (and not just frequency switches on battery/AC). Honestly I am not sure if any hardware can break or not, just because of this commit. -- viresh -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [v3 1/6] mm, oom: use oom_victims counter to synchronize oom victim selection
Roman Gushchin wrote:
> On Fri, Jun 23, 2017 at 06:52:20AM +0900, Tetsuo Handa wrote:
> > Tetsuo Handa wrote:
> > Oops, I misinterpreted. This is where a multithreaded OOM victim with or without
> > the OOM reaper can get stuck forever. Think about a process with two threads is
> > selected by the OOM killer and only one of these two threads can get TIF_MEMDIE.
> >
> > Thread-1        Thread-2        The OOM killer  The OOM reaper
> >
> >                 Calls down_write(&current->mm->mmap_sem).
> > Enters __alloc_pages_slowpath().
> >                 Enters __alloc_pages_slowpath().
> > Takes oom_lock.
> > Calls out_of_memory().
> >                                 Selects Thread-1 as an OOM victim.
> > Gets SIGKILL.   Gets SIGKILL.
> > Gets TIF_MEMDIE.
> > Releases oom_lock.
> > Leaves __alloc_pages_slowpath() because Thread-1 has TIF_MEMDIE.
> >                                                 Takes oom_lock.
> >                                                 Will do nothing because down_read_trylock() fails.
> >                                                 Releases oom_lock.
> >                                                 Gives up and sets MMF_OOM_SKIP after one second.
> >                 Takes oom_lock.
> >                 Calls out_of_memory().
> >                 Will not check MMF_OOM_SKIP because Thread-1 still has TIF_MEMDIE. // <= get stuck waiting for Thread-1.
> >                 Releases oom_lock.
> >                 Will not leave __alloc_pages_slowpath() because Thread-2 does not have TIF_MEMDIE.
> >                 Will not call up_write(&current->mm->mmap_sem).
> > Reaches do_exit().
> > Calls down_read(&current->mm->mmap_sem) in exit_mm() in do_exit(). // <= get stuck waiting for Thread-2.
> > Will not call up_read(&current->mm->mmap_sem) in exit_mm() in do_exit().
> > Will not clear TIF_MEMDIE in exit_oom_victim() in exit_mm() in do_exit().
>
> That's interesting... Does it mean, that we have to give an access to the reserves
> to all threads to guarantee the forward progress?

Yes, for we don't have __GFP_KILLABLE flag.

>
> What do you think about Michal's approach? He posted a link in the thread.

Please read that thread.
Re: [PATCH] Documentation: fix wrong example command
On Thu, 29 Jun 2017 18:36:35 +0200 Matteo Croce wrote: > Signed-off-by: Matteo Croce > --- > Documentation/networking/ipvlan.txt | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/Documentation/networking/ipvlan.txt > b/Documentation/networking/ipvlan.txt > index 24196ce..1fe42a8 100644 > --- a/Documentation/networking/ipvlan.txt > +++ b/Documentation/networking/ipvlan.txt > @@ -22,9 +22,9 @@ The driver can be built into the kernel (CONFIG_IPVLAN=y) > or as a module > There are no module parameters for this driver and it can be configured > using IProute2/ip utility. > > - ip link add link type ipvlan mode { l2 | l3 | > l3s } > + ip link add link name type ipvlan mode { l2 | > l3 | l3s } > > - e.g. ip link add link ipvl0 eth0 type ipvlan mode l2 > + e.g. ip link add link eth0 name ipvl0 type ipvlan mode l2 > Patches to the networking documentation go through the networking tree, so this one should be resent with a copy to the netdev list. I'd also recommend putting in a real changelog. Thanks, jon -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [v3 1/6] mm, oom: use oom_victims counter to synchronize oom victim selection
On Fri, Jun 23, 2017 at 06:52:20AM +0900, Tetsuo Handa wrote: > Tetsuo Handa wrote: > > Roman Gushchin wrote: > > > On Thu, Jun 22, 2017 at 09:40:28AM +0900, Tetsuo Handa wrote: > > > > Roman Gushchin wrote: > > > > > --- a/mm/oom_kill.c > > > > > +++ b/mm/oom_kill.c > > > > > @@ -992,6 +992,13 @@ bool out_of_memory(struct oom_control *oc) > > > > > if (oom_killer_disabled) > > > > > return false; > > > > > > > > > > + /* > > > > > + * If there are oom victims in flight, we don't need to select > > > > > + * a new victim. > > > > > + */ > > > > > + if (atomic_read(&oom_victims) > 0) > > > > > + return true; > > > > > + > > > > > if (!is_memcg_oom(oc)) { > > > > > blocking_notifier_call_chain(&oom_notify_list, 0, > > > > > &freed); > > > > > if (freed > 0) > > > > > > > > The OOM reaper is not available for CONFIG_MMU=n kernels, and timeout > > > > based > > > > giveup is not permitted, but a multithreaded process might be selected > > > > as > > > > an OOM victim. Not setting TIF_MEMDIE to all threads sharing an OOM > > > > victim's > > > > mm increases possibility of preventing some OOM victim thread from > > > > terminating > > > > (e.g. one of them cannot leave __alloc_pages_slowpath() with mmap_sem > > > > held for > > > > write due to waiting for the TIF_MEMDIE thread to call > > > > exit_oom_victim() when > > > > the TIF_MEMDIE thread is waiting for the thread with mmap_sem held for > > > > write). > > > > > > I agree, that CONFIG_MMU=n is a special case, and the proposed approach > > > can't > > > be used directly. But can you, please, why do you find the first chunk > > > wrong? > > > > Since you are checking oom_victims before checking > > task_will_free_mem(current), > > only one thread can get TIF_MEMDIE. This is where a multithreaded OOM > > victim without > > the OOM reaper can get stuck forever. > > Oops, I misinterpreted. This is where a multithreaded OOM victim with or > without > the OOM reaper can get stuck forever. 
> Think about a process with two threads is
> selected by the OOM killer and only one of these two threads can get TIF_MEMDIE.
>
> Thread-1        Thread-2        The OOM killer  The OOM reaper
>
>                 Calls down_write(&current->mm->mmap_sem).
> Enters __alloc_pages_slowpath().
>                 Enters __alloc_pages_slowpath().
> Takes oom_lock.
> Calls out_of_memory().
>                                 Selects Thread-1 as an OOM victim.
> Gets SIGKILL.   Gets SIGKILL.
> Gets TIF_MEMDIE.
> Releases oom_lock.
> Leaves __alloc_pages_slowpath() because Thread-1 has TIF_MEMDIE.
>                                                 Takes oom_lock.
>                                                 Will do nothing because down_read_trylock() fails.
>                                                 Releases oom_lock.
>                                                 Gives up and sets MMF_OOM_SKIP after one second.
>                 Takes oom_lock.
>                 Calls out_of_memory().
>                 Will not check MMF_OOM_SKIP because Thread-1 still has TIF_MEMDIE. // <= get stuck waiting for Thread-1.
>                 Releases oom_lock.
>                 Will not leave __alloc_pages_slowpath() because Thread-2 does not have TIF_MEMDIE.
>                 Will not call up_write(&current->mm->mmap_sem).
> Reaches do_exit().
> Calls down_read(&current->mm->mmap_sem) in exit_mm() in do_exit(). // <= get stuck waiting for Thread-2.
> Will not call up_read(&current->mm->mmap_sem) in exit_mm() in do_exit().
> Will not clear TIF_MEMDIE in exit_oom_victim() in exit_mm() in do_exit().

That's interesting... Does it mean, that we have to give an access to the reserves
to all threads to guarantee the forward progress?

What do you think about Michal's approach? He posted a link in the thread.

Thank you!

Roman
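The circular wait in the timeline quoted above can be sketched as a tiny wait-for graph in userspace C. This is only an illustration, not kernel code: the enum, the waits_for[] array and has_wait_cycle() are hypothetical names, and the model assumes each task waits on at most one other task. Thread-1 waits for Thread-2 to release mmap_sem, while Thread-2 waits for Thread-1 to clear TIF_MEMDIE.

```c
#include <stdbool.h>

/* Hypothetical task indices for the two threads in the scenario. */
enum { THREAD_1, THREAD_2, NR_TASKS };

/*
 * waits_for[a] == b means "task a cannot make progress until task b acts";
 * a negative value means task a is not blocked on anyone.
 * Returns true if following the chain from 'start' leads back to 'start'.
 */
static bool has_wait_cycle(const int waits_for[], int nr, int start)
{
    int cur = start;

    /* A chain over nr tasks needs at most nr hops to close a cycle. */
    for (int steps = 0; steps < nr; steps++) {
        cur = waits_for[cur];
        if (cur < 0)
            return false;   /* chain ends at a runnable task */
        if (cur == start)
            return true;    /* came back to where we started */
    }
    return false;
}
```

With waits_for[THREAD_1] = THREAD_2 (mmap_sem) and waits_for[THREAD_2] = THREAD_1 (TIF_MEMDIE), the function reports a cycle. If Thread-2 could also leave __alloc_pages_slowpath(), its entry would be negative and the cycle would disappear, which is the point behind the question about giving all threads access to the reserves.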
Re: [v3 5/6] mm, oom: don't mark all oom victims tasks with TIF_MEMDIE
On Thu, Jun 29, 2017 at 10:53:57AM +0200, Michal Hocko wrote: > On Wed 21-06-17 22:19:15, Roman Gushchin wrote: > > We want to limit the number of tasks which are having an access > > to the memory reserves. To ensure the progress it's enough > > to have one such process at the time. > > > > If we need to kill the whole cgroup, let's give an access to the > > memory reserves only to the first process in the list, which is > > (usually) the biggest process. > > This will give us good chances that all other processes will be able > > to quit without an access to the memory reserves. > > I don't like this to be honest. Is there any reason to go the reduced > memory reserves access to oom victims I was suggesting earlier [1]? > > [1] > http://lkml.kernel.org/r/http://lkml.kernel.org/r/1472723464-22866-2-git-send-email-mho...@kernel.org I've nothing against your approach. What's the state of this patchset? Do you plan to bring it upstream? Roman -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
On Thu, Jun 29, 2017 at 04:29:06PM +0530, Viresh Kumar wrote: > The cpufreq core and governors aren't supposed to set a limit on how > fast we want to try changing the frequency. This is currently done for > the legacy governors with help of min_sampling_rate. > > At worst, we may end up setting the sampling rate to a value lower than > the rate at which frequency can be changed and then one of the CPUs in > the policy will be only changing frequency for ever. Is it safe to issue requests to change the CPU frequency so frequently, even on historic hardware such as speedstep-{ich,smi,centrino}? In the past, these checks more or less disallowed the running of dynamic frequency scaling at least on speedstep-smi[*], but maybe on a few other platforms as well. That's why I am curious on whether this may break systems potentially on a hardware level if the hardware was not designed to do dynamic frequency scaling (and not just frequency switches on battery/AC). Best, Dominik -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Documentation: fix wrong example command
Signed-off-by: Matteo Croce
---
 Documentation/networking/ipvlan.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ipvlan.txt b/Documentation/networking/ipvlan.txt
index 24196ce..1fe42a8 100644
--- a/Documentation/networking/ipvlan.txt
+++ b/Documentation/networking/ipvlan.txt
@@ -22,9 +22,9 @@ The driver can be built into the kernel (CONFIG_IPVLAN=y) or as a module
 There are no module parameters for this driver and it can be configured
 using IProute2/ip utility.
 
-    ip link add link type ipvlan mode { l2 | l3 | l3s }
+    ip link add link name type ipvlan mode { l2 | l3 | l3s }
 
-    e.g. ip link add link ipvl0 eth0 type ipvlan mode l2
+    e.g. ip link add link eth0 name ipvl0 type ipvlan mode l2
 
 4. Operating modes:
-- 
2.9.4
Re: [PATCH v8 00/20] ILP32 for ARM64
Hi Yury,

On Mon, Jun 19, 2017 at 06:49:43PM +0300, Yury Norov wrote:
> This series enables aarch64 with ilp32 mode.

Thanks for putting this series together, I do appreciate the effort.
There are still some review comments coming in but I'm happy with how
the ABI looks now.

I did some LTP testing (AArch64/LP64, AArch64/ILP32, AArch32) and
benchmarking and didn't see any regressions (apart from an LTP bug with
sync_file_range2). James Morse is working on reproducing similar testing
in ARM. Szabolcs reported some glibc test-suite regressions on the
libc-alpha list which I assume will be followed up. VDSO in C is another
issue I'd like sorted but this is not strictly specific to ILP32 and can
be done as a follow up. Note that I didn't run any big-endian tests,
though this is something that needs doing.

Now, having agreed on the ABI and implementation very close to being
ready doesn't necessarily make the code suitable for upstream. With my
maintainer hat on, I'm trying to see where ILP32 will be in 2-5-10
years, whether anyone still cares about it in this time frame. The
difference from a driver or SoC support is that ABIs are very hard to
revert, though are as (or even more) likely to bit-rot when not in use
or regularly tested (we have the big-endian experience here).

There are two main aspects to make the code upstream-worthy:

1. Actual/real users (current, future). I don't mean just a few distros
   showing that it can be done but actual/planned real deployments

2. Long term testing/maintenance plan. This is not about kernel code
   maintenance but a healthy ILP32 ecosystem:

   a) readily available toolchains (x86-hosted and AArch64-hosted)

   b) filesystems (can be large distros like openSUSE or more
      embedded-oriented like Yocto or OpenEmbedded)

   c) suitable continuous regression testing (kernel + userland)

   d) commitment from all parties involved (including ARM Ltd) to treat
      the ILP32 ABI as a (nearly) first class citizen

It is pretty clear from private discussions that there are potential
users but at the moment I can't tell if those would turn into real
deployments of production systems. As for (2), the long term plans are
not convincing (or I haven't spotted them yet), so I'd like to see the
interested parties putting a plan together (something along the lines of
kernelci.org + LTP, glibc buildbot).

What I'd like to propose is that Will and I (as arm64 maintainers, maybe
with the help of others including this series' authors) take over the
series and push it to a staging branch under the arm64 kernel on
git.kernel.org. This is aimed as a commitment to keep the ABI *stable*
and will be rebased with every kernel release (starting with 4.13). The
decision to merge upstream will be revisited every 6 months, assessing
the progress on the points I mentioned above, with a time limit of 2
years when, if still not upstream, we will stop maintaining such branch.

I am aware that the above proposal has an impact on the glibc patches
since they will not merge a new ABI upstream until officially supported
by the kernel. I cc'ed some of the glibc developers and they will follow
up on the libc-alpha list.

> As supporting work, it introduces ARCH_32BIT_OFF_T configuration
> option that is enabled for existing 32-bit architectures but disabled
> for new arches (so 64-bit off_t userspace type is used by new userspace).
> Also it deprecates getrlimit and setrlimit syscalls prior to prlimit64.
[...]
> Patches 1, 2, 3 and 8 are general, and may be applied separately.

These 4 patches should be merged independently, I don't see a point in
carrying them with the ILP32 series. Arnd, are you ok to push them
upstream?

BTW, patch 3 seems to never make it to the linux-arm-kernel list, I
guess too many on cc.

-- 
Catalin
[PATCH 3/6] cpufreq: governor: Drop min_sampling_rate
The cpufreq core and governors aren't supposed to set a limit on how fast we want to try changing the frequency. This is currently done for the legacy governors with help of min_sampling_rate. At worst, we may end up setting the sampling rate to a value lower than the rate at which frequency can be changed and then one of the CPUs in the policy will be only changing frequency for ever. But that is something for the user to decide and there is no need to have special handling for such cases in the core. Leave it for the user to figure out. Signed-off-by: Viresh Kumar --- Documentation/admin-guide/pm/cpufreq.rst | 8 drivers/cpufreq/cpufreq_conservative.c | 6 -- drivers/cpufreq/cpufreq_governor.c | 10 ++ drivers/cpufreq/cpufreq_governor.h | 1 - drivers/cpufreq/cpufreq_ondemand.c | 12 include/linux/cpufreq.h | 2 -- 6 files changed, 2 insertions(+), 37 deletions(-) diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst index 09aa2e949787..6adbe1ed58b9 100644 --- a/Documentation/admin-guide/pm/cpufreq.rst +++ b/Documentation/admin-guide/pm/cpufreq.rst @@ -471,14 +471,6 @@ it is allowed to use (the ``scaling_max_freq`` policy limit). # echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate - -``min_sampling_rate`` - The minimum value of ``sampling_rate``. - - Equal to 1 (10 ms) if :c:macro:`CONFIG_NO_HZ_COMMON` and - :c:data:`tick_nohz_active` are both set or to 20 times the value of - :c:data:`jiffies` in microseconds otherwise. - ``up_threshold`` If the estimated CPU load is above this value (in percent), the governor will set the frequency to the maximum value allowed for the policy. 
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c index 88220ff3e1c2..f20f20a77d4d 100644 --- a/drivers/cpufreq/cpufreq_conservative.c +++ b/drivers/cpufreq/cpufreq_conservative.c @@ -246,7 +246,6 @@ gov_show_one_common(sampling_rate); gov_show_one_common(sampling_down_factor); gov_show_one_common(up_threshold); gov_show_one_common(ignore_nice_load); -gov_show_one_common(min_sampling_rate); gov_show_one(cs, down_threshold); gov_show_one(cs, freq_step); @@ -254,12 +253,10 @@ gov_attr_rw(sampling_rate); gov_attr_rw(sampling_down_factor); gov_attr_rw(up_threshold); gov_attr_rw(ignore_nice_load); -gov_attr_ro(min_sampling_rate); gov_attr_rw(down_threshold); gov_attr_rw(freq_step); static struct attribute *cs_attributes[] = { - &min_sampling_rate.attr, &sampling_rate.attr, &sampling_down_factor.attr, &up_threshold.attr, @@ -297,10 +294,7 @@ static int cs_init(struct dbs_data *dbs_data) dbs_data->up_threshold = DEF_FREQUENCY_UP_THRESHOLD; dbs_data->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR; dbs_data->ignore_nice_load = 0; - dbs_data->tuners = tuners; - dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO * - jiffies_to_usecs(10); return 0; } diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c index 47e24b5384b3..858081f9c3d7 100644 --- a/drivers/cpufreq/cpufreq_governor.c +++ b/drivers/cpufreq/cpufreq_governor.c @@ -47,14 +47,11 @@ ssize_t store_sampling_rate(struct gov_attr_set *attr_set, const char *buf, { struct dbs_data *dbs_data = to_dbs_data(attr_set); struct policy_dbs_info *policy_dbs; - unsigned int rate; int ret; - ret = sscanf(buf, "%u", &rate); + ret = sscanf(buf, "%u", &dbs_data->sampling_rate); if (ret != 1) return -EINVAL; - dbs_data->sampling_rate = max(rate, dbs_data->min_sampling_rate); - /* * We are operating under dbs_data->mutex and so the list and its * entries can't be freed concurrently. 
@@ -437,10 +434,7 @@ int cpufreq_dbs_governor_init(struct cpufreq_policy *policy) latency = 1; /* Bring kernel and HW constraints together */ - dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate, - MIN_LATENCY_MULTIPLIER * latency); - dbs_data->sampling_rate = max(dbs_data->min_sampling_rate, - LATENCY_MULTIPLIER * latency); + dbs_data->sampling_rate = LATENCY_MULTIPLIER * latency; if (!have_governor_per_policy()) gov->gdbs_data = dbs_data; diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h index 7cbb07512e4c..06d9f90ede93 100644 --- a/drivers/cpufreq/cpufreq_governor.h +++ b/drivers/cpufreq/cpufreq_governor.h @@ -41,7 +41,6 @@ enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE}; struct dbs_data { struct gov_attr_set attr_set; void *tuners; - unsigned int min_sampling_rate; unsigned int ignore_nice_load; unsigned
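The behavioural change to store_sampling_rate() in the diff above can be sketched in userspace C as follows. This is a simplified model under stated assumptions, not the kernel implementation: struct dbs_model, store_rate_old() and store_rate_new() are hypothetical names, the return values are simplified to 0 and -1, and the "new" variant parses into a temporary whereas the patch scans directly into dbs_data->sampling_rate.

```c
#include <stdio.h>

/* Hypothetical stand-in for the relevant fields of struct dbs_data. */
struct dbs_model {
    unsigned int sampling_rate;     /* microseconds */
    unsigned int min_sampling_rate; /* microseconds; the field the patch removes */
};

/* Before the patch: the written value is clamped from below. */
static int store_rate_old(struct dbs_model *d, const char *buf)
{
    unsigned int rate;

    if (sscanf(buf, "%u", &rate) != 1)
        return -1;

    d->sampling_rate = rate > d->min_sampling_rate ? rate : d->min_sampling_rate;
    return 0;
}

/* After the patch: whatever the user writes is taken as is. */
static int store_rate_new(struct dbs_model *d, const char *buf)
{
    unsigned int rate;

    if (sscanf(buf, "%u", &rate) != 1)
        return -1;

    d->sampling_rate = rate;
    return 0;
}
```

With min_sampling_rate set to 10000, writing "500" ends up as 10000 in the old model and as 500 in the new one, which is exactly the case the follow-up discussion about historic hardware is concerned with.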
Re: [v3 1/6] mm, oom: use oom_victims counter to synchronize oom victim selection
On Wed 21-06-17 22:19:11, Roman Gushchin wrote: > Oom killer should avoid unnecessary kills. To prevent them, during > the tasks list traverse we check for task which was previously > selected as oom victims. If there is such a task, new victim > is not selected. > > This approach is sub-optimal (we're doing costly iteration over the task > list every time) and will not work for the cgroup-aware oom killer. > > We already have oom_victims counter, which can be effectively used > for the task. A global counter will not work properly, I am afraid. a) you should consider the oom domain and do not block oom on unrelated domains and b) you have no guarantee that the oom victim will terminate reasonably. That is why we have MMF_OOM_SKIP check in oom_evaluate_task. I think you should have something similar for your memcg victim selection. If you see a memcg in the oom hierarchy with oom victims which are alive and not MMF_OOM_SKIP, you should abort the scanning. -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
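The check suggested above (abort the scan when some memcg in the OOM hierarchy still has a victim that is alive and not MMF_OOM_SKIP) might look roughly like the following userspace C sketch. Everything here is an assumption made for illustration: struct victim_model, struct memcg_model and both functions are hypothetical and do not correspond to kernel structures or to any posted patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a previously selected OOM victim. */
struct victim_model {
    bool alive;        /* the task has not exited yet */
    bool mmf_oom_skip; /* models MMF_OOM_SKIP being set on the victim's mm */
};

/* Hypothetical model of one memcg in the OOM hierarchy. */
struct memcg_model {
    const struct victim_model *victims; /* earlier victims in this memcg */
    size_t nr_victims;
};

/* True if the memcg still has a victim that is expected to free memory. */
static bool memcg_has_inflight_victim(const struct memcg_model *cg)
{
    for (size_t i = 0; i < cg->nr_victims; i++)
        if (cg->victims[i].alive && !cg->victims[i].mmf_oom_skip)
            return true;
    return false;
}

/* True if victim selection over this hierarchy should be aborted. */
static bool should_abort_scan(const struct memcg_model *hier, size_t nr)
{
    for (size_t i = 0; i < nr; i++)
        if (memcg_has_inflight_victim(&hier[i]))
            return true;
    return false;
}
```

Unlike a single global oom_victims counter, this check is scoped to the hierarchy being scanned, and a victim whose mm is already marked as skipped no longer blocks the selection of a new one.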
Re: [v3 5/6] mm, oom: don't mark all oom victims tasks with TIF_MEMDIE
On Wed 21-06-17 22:19:15, Roman Gushchin wrote: > We want to limit the number of tasks which are having an access > to the memory reserves. To ensure the progress it's enough > to have one such process at the time. > > If we need to kill the whole cgroup, let's give an access to the > memory reserves only to the first process in the list, which is > (usually) the biggest process. > This will give us good chances that all other processes will be able > to quit without an access to the memory reserves. I don't like this to be honest. Is there any reason to go the reduced memory reserves access to oom victims I was suggesting earlier [1]? [1] http://lkml.kernel.org/r/http://lkml.kernel.org/r/1472723464-22866-2-git-send-email-mho...@kernel.org -- Michal Hocko SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-doc" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html