Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-18 Thread Chris Down
Eiichi Tsukata writes:
> But that comes with a challenge: despite listening on cgroup for pressure notifications (which happen from those runtime events we do not control),
We do also have global pressure (PSI) counters. Have you tried to look into those and try to back off even when the

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-18 Thread Eiichi Tsukata
Hi Michal > On Feb 17, 2021, at 21:31, Michal Hocko wrote: > > On Wed 17-02-21 10:42:24, Eiichi Tsukata wrote: >> Hi All, >> >> Firstly, thank you for your careful review and attention to my patch >> (and apologies for top-posting!). Let me first explain why our use >> case requires hugetlb

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-17 Thread Shakeel Butt
On Tue, Feb 16, 2021 at 5:25 PM David Rientjes wrote: > > On Tue, 16 Feb 2021, Michal Hocko wrote: > > > > Hugepages can be preallocated to avoid unpredictable allocation latency. > > > If we run into 4k page shortage, the kernel can trigger OOM even though > > > there were free hugepages. When

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-17 Thread Michal Hocko
On Wed 17-02-21 13:31:07, Michal Hocko wrote: [...] > Thanks for your usecase description. It helped me to understand what you > are doing and how this can be really useful for your particular setup. > This is really a very specific situation from my POV. I am not yet sure > this is generic enough

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-17 Thread Michal Hocko
On Wed 17-02-21 10:42:24, Eiichi Tsukata wrote: > Hi All, > > Firstly, thank you for your careful review and attention to my patch > (and apologies for top-posting!). Let me first explain why our use > case requires hugetlb over THP and then elaborate on the difficulty we > have to maintain the

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-17 Thread Eiichi Tsukata
Hi All, Firstly, thank you for your careful review and attention to my patch (and apologies for top-posting!). Let me first explain why our use case requires hugetlb over THP and then elaborate on the difficulty we have to maintain the correct number of hugepages in the pool, finally concluding

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-17 Thread David Hildenbrand
On 16.02.21 04:07, Eiichi Tsukata wrote: Hugepages can be preallocated to avoid unpredictable allocation latency. If we run into 4k page shortage, the kernel can trigger OOM even though there were free hugepages. When OOM is triggered by user address page fault handler, we can use oom notifier

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-16 Thread Michal Hocko
On Tue 16-02-21 14:30:15, Mike Kravetz wrote: [...] > However, this is an 'opt in' feature. So, I would not expect anyone who > carefully plans the size of their hugetlb pool to enable such a feature. > If there is a use case where hugetlb pages are used in a non-essential > application, this

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-16 Thread Michal Hocko
On Tue 16-02-21 13:53:12, David Rientjes wrote: > On Tue, 16 Feb 2021, Michal Hocko wrote: [...] > > Overall, I am not really happy about this feature even when above is > > fixed, but let's hear more the actual problem first. > > Shouldn't this behavior be possible as an oomd plugin instead,

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-16 Thread Mike Kravetz
On 2/16/21 12:12 AM, Michal Hocko wrote: > On Tue 16-02-21 03:07:13, Eiichi Tsukata wrote: >> Hugepages can be preallocated to avoid unpredictable allocation latency. >> If we run into 4k page shortage, the kernel can trigger OOM even though >> there were free hugepages. When OOM is triggered by

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-16 Thread David Rientjes
On Tue, 16 Feb 2021, Michal Hocko wrote: > > Hugepages can be preallocated to avoid unpredictable allocation latency. > > If we run into 4k page shortage, the kernel can trigger OOM even though > > there were free hugepages. When OOM is triggered by user address page > > fault handler, we can use

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-16 Thread Chris Down
Hi Eiichi, I agree with Michal's points, and I think there are also some other design questions which don't quite make sense to me. Perhaps you can clear them up? :-)
Eiichi Tsukata writes:
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 4bdb58ab14cb..e2d57200fd00 100644
> --- a/mm/hugetlb.c

Re: [RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-16 Thread Michal Hocko
On Tue 16-02-21 03:07:13, Eiichi Tsukata wrote: > Hugepages can be preallocated to avoid unpredictable allocation latency. > If we run into 4k page shortage, the kernel can trigger OOM even though > there were free hugepages. When OOM is triggered by user address page > fault handler, we can use

[RFC PATCH] mm, oom: introduce vm.sacrifice_hugepage_on_oom

2021-02-15 Thread Eiichi Tsukata
Hugepages can be preallocated to avoid unpredictable allocation latency. If we run into 4k page shortage, the kernel can trigger OOM even though there were free hugepages. When OOM is triggered by user address page fault handler, we can use oom notifier to free hugepages in user space but if it's
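The situation described in the patch summary above (a shortage of 4k pages while preallocated hugepages sit unused) can be observed from user space. What follows is a minimal sketch, not the patch's mechanism: it parses text in the /proc/meminfo format and reports how much memory is held by free hugepages of the default size. The helper names and the sample numbers are hypothetical and chosen only for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    A trailing unit such as "kB" is dropped; lines without a numeric value
    are skipped.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_hugepage_kb(fields):
    """Memory (kB) sitting in free hugepages of the default size."""
    return fields.get("HugePages_Free", 0) * fields.get("Hugepagesize", 0)


# Illustrative sample only (made-up numbers), not taken from the thread.
SAMPLE = """MemFree:           51200 kB
HugePages_Total:     512
HugePages_Free:      500
Hugepagesize:       2048 kB
"""

if __name__ == "__main__":
    fields = parse_meminfo(SAMPLE)
    print("MemFree (kB):", fields["MemFree"])
    print("Held in free hugepages (kB):", free_hugepage_kb(fields))
```

On a live system the same functions can be fed the contents of /proc/meminfo instead of SAMPLE. Note that HugePages_Free covers only the default hugepage size.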