Hi Folks,
On Tue, Apr 20, 2021 at 12:18 PM Roman Gushchin wrote:
>
> On Mon, Apr 19, 2021 at 06:44:02PM -0700, Shakeel Butt wrote:
> > Proposal: Provide memory guarantees to userspace oom-killer.
> >
> > Background:
> >
> > Issues with kernel oom-killer:
On Mon, Apr 19, 2021 at 11:46 PM Michal Hocko wrote:
>
> On Mon 19-04-21 18:44:02, Shakeel Butt wrote:
[...]
> > memory.min. However a new allocation from userspace oom-killer can
> > still get stuck in the reclaim and policy rich oom-killer do trigger
> > new allocations
Proposal: Provide memory guarantees to userspace oom-killer.
Background:
Issues with kernel oom-killer:
1. Very conservative and prefers to reclaim. Applications can suffer
for a long time.
2. Borrows the context of the allocator which can be resource limited
(low sched priority or limited CPU
On Thu, Jun 18, 2020 at 8:37 PM Chris Down wrote:
>
> Yafang Shao writes:
> >On Thu, Jun 18, 2020 at 5:09 AM Chris Down wrote:
> >>
> >> Naresh Kamboju writes:
> >> >After this patch applied the reported issue got fixed.
> >>
> >> Great! Thank you Naresh and Michal for helping to get to the botto
Michal Hocko writes:
I would really prefer to do that work on top of the fixes we (used to)
have in mmotm (with the fixup).
Oh, for sure. We should reintroduce the patches with the fix, and then look at
longer-term solutions once that's in :-)
Naresh Kamboju writes:
After this patch applied the reported issue got fixed.
Great! Thank you Naresh and Michal for helping to get to the bottom of this :-)
I'll send out a new version tomorrow with the fixes applied and both of you
credited in the changelog for the detection and fix.
should indeed be the case if you haven't set them
> in the hierarchy).
>
> My guess is that page_counter_read(&memcg->memory) is 0, which means
> mem_cgroup_below_min will return 1.
Yes this is the case because this is likely the root memcg which skips
all charges.
> H
Michal Hocko writes:
and it makes some sense. Except for the root memcg where we do not
account any memory. Adding if (mem_cgroup_is_root(memcg)) return false;
should do the trick. The same is the case for mem_cgroup_below_low.
Could you give it a try please just to confirm?
Oh, of course :-) T
The following two patches have been reverted on next-20200519 and
> > > > > retested the
> > > > > reproducible steps and confirmed the test case mkfs -t ext4 got PASS.
> > > > > ( invoked oom-killer is gone now)
> > > > >
> > > >
even when
min/emin is 0 (which should indeed be the case if you haven't set them in the
hierarchy).
My guess is that page_counter_read(&memcg->memory) is 0, which means
mem_cgroup_below_min will return 1.
However, I don't know for sure why that should then result in the OOM ki
issue LKFT teammate Anders Roxell
> > > > git bisected the problem and found bad commit(s) which caused this
> > > > problem.
> > > >
> > > > The following two patches have been reverted on next-20200519 and
> > > > retested the
On Fri 12-06-20 15:13:22, Naresh Kamboju wrote:
> On Thu, 11 Jun 2020 at 15:25, Michal Hocko wrote:
> >
> > On Fri 29-05-20 11:49:20, Michal Hocko wrote:
> > > On Fri 29-05-20 02:56:44, Chris Down wrote:
> > > > Yafang Shao writes:
> > > Agreed. Even if e{low,min} might still have some rough edges
On Thu, 11 Jun 2020 at 15:25, Michal Hocko wrote:
>
> On Fri 29-05-20 11:49:20, Michal Hocko wrote:
> > On Fri 29-05-20 02:56:44, Chris Down wrote:
> > > Yafang Shao writes:
> > Agreed. Even if e{low,min} might still have some rough edges I am
> > completely puzzled how we could end up oom if none
> > > traversal of the memcg tree we calculate a
> > > protection value for this reclaimer, finally it disappears after the
> > > reclaimer stops. That is why I highly suggest to add a new protection
> > > member in scan_control before.
> >
> > I agree with you that the e
the
> > reclaimer stops. That is why I highly suggest to add a new protection
> > member in scan_control before.
>
> I agree with you that the e{min,low} lifecycle is confusing for everyone --
> the only thing I've not seen confirmation of is any confirmed correlation
>
confusing for everyone -- the
only thing I've not seen confirmation of is any confirmed correlation with the
i386 oom killer issue. If you've validated that, I'd like to see the data :-)
As per the test results history this problem started happening from
> >> > Bad : next-20200430 (still reproducible on next-20200519)
> >> > Good : next-20200429
> >> >
> >> > The git tree / tag used for testing is from linux next-20200430 tag and
> >> > reverted
next-20200430 tag and
> reverted
> following three patches and oom-killer problem fixed.
>
> Revert "mm, memcg: avoid stale protection values when cgroup is above
> protection"
> Revert "mm, memcg: decouple e{low,min} state mutations from protection checks"
> Revert
[Sorry for a late reply - was offline for a few days]
On Thu 21-05-20 17:58:55, Johannes Weiner wrote:
> On Thu, May 21, 2020 at 01:06:28PM -0700, Hugh Dickins wrote:
[...]
> From d9e7ed15d1c9248a3fd99e35e82437549154dac7 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner
> Date: Thu, 21 May 2020 17:
On Thu, 21 May 2020, Johannes Weiner wrote:
> On Thu, May 21, 2020 at 01:06:28PM -0700, Hugh Dickins wrote:
> > On Thu, 21 May 2020, Johannes Weiner wrote:
> > > do_memsw_account() used to be automatically false when the cgroup
> > > controller was disabled. Now that it's replaced by
> > > cgroup_m
On Thu, May 21, 2020 at 01:06:28PM -0700, Hugh Dickins wrote:
> On Thu, 21 May 2020, Johannes Weiner wrote:
> > do_memsw_account() used to be automatically false when the cgroup
> > controller was disabled. Now that it's replaced by
> > cgroup_memory_noswap, for which this isn't true, make the
> >
My apology!
As per the test results history this problem started happening from
Bad : next-20200430 (still reproducible on next-20200519)
Good : next-20200429
The git tree / tag used for testing is the linux next-20200430 tag; reverting the
following three patches fixed the oom-killer problem.
On Thu, 21 May 2020, Johannes Weiner wrote:
>
> Very much appreciate you guys tracking it down so quickly. Sorry about
> the breakage.
>
> I think mem_cgroup_disabled() checks are pretty good markers of public
> entry points to the memcg API, so I'd prefer that even if a bit more
> verbose. What
> > > >
> > > >
> > > > This issue is specific on 32-bit architectures i386 and arm on
> > > > linux-next tree.
> > > > As per the test results history this problem started happening from
> > > > mkfs -t ext4 /dev/disk/
On Thu, May 21, 2020 at 11:22 AM Naresh Kamboju
wrote:
> On Thu, 21 May 2020 at 00:39, Chris Down wrote:
> > Since you have i386 hardware available, and I don't, could you please apply
> > only "avoid stale protection" again and check if it only happens with that
> > commit, or requires both? Tha
ng two patches have been reverted on next-20200519 and retested
> >the
> >reproducible steps and confirmed the test case mkfs -t ext4 got PASS.
> >( invoked oom-killer is gone now)
> >
> >Revert "mm, memcg: avoid stale protectio
i386 and arm on linux-next
> > > tree.
> > > As per the test results history this problem started happening from
> > > mkfs -t ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190804A00BE5
> > >
> > >
> > > Problem:
> > > [ 38.802375]
test case mkfs -t ext4 got PASS.
( invoked oom-killer is gone now)
Revert "mm, memcg: avoid stale protection values when cgroup is above
protection"
This reverts commit 23a53e1c02006120f89383270d46cbd040a70bc6.
Revert "mm, memcg: decouple e{low,min} state mutations from pr
eproduce:
> dd if=/dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190504A00573
> of=/dev/null bs=1M count=2048
> or
> mkfs -t ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190804A00BE5
>
>
> Problem:
> [ 38.802375] dd invoked oom-killer: gfp_mask=0x100cc0(GFP_USER),
> o
count=2048
or
mkfs -t ext4 /dev/disk/by-id/ata-SanDisk_SSD_PLUS_120GB_190804A00BE5
Problem:
[ 38.802375] dd invoked oom-killer: gfp_mask=0x100cc0(GFP_USER),
order=0, oom_score_adj=0
i386 crash log: https://pastebin.com/Hb8U89vU
arm crash log: https://pastebin.com/BD9t3JTm
Thanks for looking into this problem.
On Sat, 2 May 2020 at 02:28, Andrew Morton wrote:
>
> On Fri, 1 May 2020 18:08:28 +0530 Naresh Kamboju
> wrote:
>
> > mkfs -t ext4 invoked oom-killer on i386 kernel running on x86_64 device
> > and started happening on linux -next
On Mon 11-05-20 12:34:00, Konstantin Khlebnikov wrote:
>
>
> On 11/05/2020 11.39, Michal Hocko wrote:
> > On Fri 08-05-20 17:16:29, Konstantin Khlebnikov wrote:
> > > Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
> > > back to the
Starting from v4.19 commit 29ef680ae7c2 ("memcg, oom: move out_of_memory
back to the charge path") cgroup oom killer is no longer invoked only from
page faults. Now it implements the same semantics as global OOM killer:
allocation context invokes OOM killer and keeps retrying unt
On Fri, 1 May 2020 18:08:28 +0530 Naresh Kamboju
wrote:
> mkfs -t ext4 invoked oom-killer on i386 kernel running on x86_64 device
> and started happening on linux -next master branch kernel tag next-20200430
> and next-20200501. We did not bisect this problem.
It would be wonderf
mkfs -t ext4 invoked oom-killer on i386 kernel running on x86_64 device
and started happening on linux -next master branch kernel tag next-20200430
and next-20200501. We did not bisect this problem.
metadata
git branch: master
git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next
keeps retrying
forever without making forward progress because mem_cgroup_oom(GFP_NOFS)
cannot invoke the OOM killer due to commit 3da88fb3bacfaa33 ("mm, oom:
move GFP_NOFS check to out_of_memory").
Allowing forced charge due to being unable to invoke memcg OOM killer will
lead to global OOM si
On Tue 06-08-19 20:39:22, Pankaj Suryawanshi wrote:
> On Tue, Aug 6, 2019 at 8:37 PM Michal Hocko wrote:
> >
> > On Tue 06-08-19 20:24:03, Pankaj Suryawanshi wrote:
> > > On Tue, 6 Aug, 2019, 1:46 AM Michal Hocko, wrote:
> > > >
> > > > On Mon 05-08-19 21:04:53, Pankaj Suryawanshi wrote:
> > > >
On Tue 06-08-19 20:25:51, Pankaj Suryawanshi wrote:
[...]
> lowmem reserve ? it is min_free_kbytes or something else.
Nope. Lowmem reserve is a measure to protect from allocations targeting
higher zones (have a look at setup_per_zone_lowmem_reserve). The value
for each zone depends on the amount
mem_cgroup_oom(GFP_NOFS)
cannot invoke the OOM killer due to commit 3da88fb3bacfaa33 ("mm, oom:
move GFP_NOFS check to out_of_memory").
Allowing forced charge due to being unable to invoke memcg OOM killer
will lead to global OOM situation. Also, just returning -ENOMEM will be
risky because OOM path is lost
On 8/5/19 1:24 PM, Michal Hocko wrote:
>> [ 727.954355] CPU: 0 PID: 56 Comm: kworker/u8:2 Tainted: P O
>> 4.14.65 #606
> [...]
>> [ 728.029390] [] (oom_kill_process) from []
>> (out_of_memory+0x140/0x368)
>> [ 728.037569] r10:0001 r9:c12169bc r8:0041 r7:c121e680 r6:c1216588
On Sat 03-08-19 18:53:50, Pankaj Suryawanshi wrote:
> Hello,
>
> Below are the logs from oom-kller. I am not able to interpret/decode the
> logs as well as not able to find root cause of oom-killer.
>
> Note: CPU Arch: Arm 32-bit , Kernel - 4.14.65
Fixed up line wrapping and t
trigger a race when the oom killer complains that there are no oom
eligible tasks and complains into the log, which is both annoying and
confusing because there is no actual problem. The race looks as
follows:
P1 oom_reaper P2
try_charge
From: Tetsuo Handa
[ Upstream commit 7775face207922ea62a4e96b9cd45abfdc7b9840 ]
If a memory cgroup contains a single process with many threads
(including different process group sharing the mm) then it is possible
to trigger a race when the oom killer complains that there are no oom
eligible
On Fri 22-02-19 13:37:33, Junil Lee wrote:
> The oom killer uses the get_mm_rss() function to estimate how much memory
> will be reclaimed when it selects the victim task.
>
> However, the rss size returned by get_mm_rss() was changed by the
> "mm, shmem: add int
The oom killer uses the get_mm_rss() function to estimate how much memory
will be reclaimed when it selects the victim task.
However, the rss size returned by get_mm_rss() was changed by the
"mm, shmem: add internal shmem resident memory accounting" commit.
This commit
4.19-stable review patch. If anyone has any objections, please let me know.
--
From: Michal Hocko
commit 7056d3a37d2c6b10c13e8e69adc67ec1fc65 upstream.
Burt Holzman has noticed that memcg v1 doesn't notify about OOM events via
eventfd anymore. The reason is that 29ef680ae
4.20-stable review patch. If anyone has any objections, please let me know.
--
From: Michal Hocko
commit 7056d3a37d2c6b10c13e8e69adc67ec1fc65 upstream.
Burt Holzman has noticed that memcg v1 doesn't notify about OOM events via
eventfd anymore. The reason is that 29ef680ae
From: Michal Hocko
Burt Holzman has noticed that memcg v1 doesn't notify about OOM events
via eventfd anymore. The reason is that 29ef680ae7c2 ("memcg, oom: move
out_of_memory back to the charge path") has moved the oom handling back
to the charge path. While doing so the notification was left be
Roman, have you had time to go through this?
On Tue, 7 Aug 2018, David Rientjes wrote:
> On Mon, 6 Aug 2018, Roman Gushchin wrote:
>
> > > In a cgroup-aware oom killer world, yes, we need the ability to specify
> > > that the usage of the entire subtree should
Roman Gushchin wrote:
> On Tue, Jul 17, 2018 at 06:13:47AM +0900, Tetsuo Handa wrote:
> > No response from Roman and David...
> >
> > Andrew, will you once drop Roman's cgroup-aware OOM killer and David's
> > patches?
> > Roman's series has a bug
There are three significant concerns about the cgroup aware oom killer as
it is implemented in -mm:
(1) allows users to evade the oom killer by creating subcontainers or
using other controllers since scoring is done per cgroup and not
hierarchically,
(2) unfairly compares the root
s from the traditional per process selection, and (2) a remount to
change.
Instead of enabling the cgroup aware oom killer with the "groupoom" mount
option, set the mem cgroup subtree's memory.oom_policy to "cgroup".
The heuristic used to select a process or cgroup to kil
The cgroup-aware oom killer currently considers the set of allowed nodes
for the allocation that triggers the oom killer and discounts usage from
disallowed nodes when comparing cgroups.
If a cgroup has both the cpuset and memory controllers enabled, it may be
possible to restrict allocations to
Handa
> > Signed-off-by: Paul E. McKenney
>
> I would also note that waiting in the notifier might be a problem on its
> own because we are holding the oom_lock and the system cannot trigger
> the OOM killer while we are holding it and waiting for oom_callback_wq
> ev
's OOM code. If this causes problems,
> it can easily be reinserted.
>
> Reported-by: Michal Hocko
> Reported-by: Tetsuo Handa
> Signed-off-by: Paul E. McKenney
I would also note that waiting in the notifier might be a problem on its
own because we are ho
On Fri, Jun 29, 2018 at 11:35:48PM +0900, Tetsuo Handa wrote:
> On 2018/06/29 21:52, Paul E. McKenney wrote:
> > The effect of RCU's current OOM code is to speed up callback invocation
> > by at most a few seconds (assuming no stalled CPUs, in which case
> > it is not possible to speed up callback
> > > The enqueuing happens via an IPI to the CPU in question.
> > >
> > > I am afraid this is too low level for me to understand what is going on
> > > here. What are lazy callbacks and why do they need any specific action
> > > when we are getting close