Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-29 Thread Edward Chron
On Thu, Aug 29, 2019 at 11:44 AM Qian Cai wrote: > > On Thu, 2019-08-29 at 09:09 -0700, Edward Chron wrote: > > > > Feel like you are going in circles to "sell" without any new information. > > > If > > > you > > > need to deal with

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-29 Thread Edward Chron
On Thu, Aug 29, 2019 at 9:18 AM Michal Hocko wrote: > > On Thu 29-08-19 08:03:19, Edward Chron wrote: > > On Thu, Aug 29, 2019 at 4:56 AM Michal Hocko wrote: > [...] > > > Or simply provide a hook with the oom_control to be called to report > > > without replac

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-29 Thread Edward Chron
On Thu, Aug 29, 2019 at 8:42 AM Qian Cai wrote: > > On Thu, 2019-08-29 at 08:03 -0700, Edward Chron wrote: > > On Thu, Aug 29, 2019 at 4:56 AM Michal Hocko wrote: > > > > > > On Thu 29-08-19 19:14:46, Tetsuo Handa wrote: > > > > On 2019/08/29 16:11, Mic

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-29 Thread Edward Chron
On Thu, Aug 29, 2019 at 7:09 AM Tetsuo Handa wrote: > > On 2019/08/29 20:56, Michal Hocko wrote: > >> But please be aware that, I REPEAT AGAIN, I don't think neither eBPF nor > >> SystemTap will be suitable for dumping OOM information. OOM situation means > >> that even single page fault event can

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-29 Thread Edward Chron
On Thu, Aug 29, 2019 at 12:11 AM Michal Hocko wrote: > > On Wed 28-08-19 12:46:20, Edward Chron wrote: > [...] > > Our belief is if you really think eBPF is the preferred mechanism > > then move OOM reporting to an eBPF. > > I've said that all this additional i

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-29 Thread Edward Chron
On Thu, Aug 29, 2019 at 4:56 AM Michal Hocko wrote: > > On Thu 29-08-19 19:14:46, Tetsuo Handa wrote: > > On 2019/08/29 16:11, Michal Hocko wrote: > > > On Wed 28-08-19 12:46:20, Edward Chron wrote: > > >> Our belief is if you really think eBPF is the preferr

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-28 Thread Edward Chron
On Wed, Aug 28, 2019 at 1:04 PM Edward Chron wrote: > > On Wed, Aug 28, 2019 at 3:12 AM Tetsuo Handa > wrote: > > > > On 2019/08/28 16:08, Michal Hocko wrote: > > > On Tue 27-08-19 19:47:22, Edward Chron wrote: > > >> For production systems instal

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-28 Thread Edward Chron
On Wed, Aug 28, 2019 at 1:18 PM Qian Cai wrote: > > On Wed, 2019-08-28 at 12:46 -0700, Edward Chron wrote: > > But with the caveat that running an eBPF script that it isn't standard Linux > > operating procedure, at this point in time anyway will not be well > >

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-28 Thread Edward Chron
On Wed, Aug 28, 2019 at 3:12 AM Tetsuo Handa wrote: > > On 2019/08/28 16:08, Michal Hocko wrote: > > On Tue 27-08-19 19:47:22, Edward Chron wrote: > >> For production systems installing and updating EBPF scripts may someday > >> be very common, but I wonder how data

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-27 Thread Edward Chron
On Tue, Aug 27, 2019 at 6:32 PM Qian Cai wrote: > > > > > On Aug 27, 2019, at 9:13 PM, Edward Chron wrote: > > > > On Tue, Aug 27, 2019 at 5:50 PM Qian Cai wrote: > >> > >> > >> > >>> On Aug 27, 2019, at 8:23 PM, Edward Chron w

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-27 Thread Edward Chron
On Tue, Aug 27, 2019 at 5:50 PM Qian Cai wrote: > > > > > On Aug 27, 2019, at 8:23 PM, Edward Chron wrote: > > > > > > > > On Tue, Aug 27, 2019 at 5:40 AM Qian Cai wrote: > > On Mon, 2019-08-26 at 12:36 -0700, Edward Chron wrote: > > >

Re: [PATCH 00/10] OOM Debug print selection and additional information

2019-08-27 Thread Edward Chron
On Tue, Aug 27, 2019 at 12:15 AM Michal Hocko wrote: > > On Mon 26-08-19 12:36:28, Edward Chron wrote: > [...] > > Extensibility using OOM debug options > > - > > What is needed is an extensible system to optionally configure >

[PATCH 03/10] mm/oom_debug: Add Tasks Summary

2019-08-26 Thread Edward Chron
ample Output: - Sample Tasks Summary message output: Aug 13 18:52:48 yoursystem kernel: Threads: 492 Processes: 248 forks_since_boot: 7786 procs_runable: 4 procs_iowait: 0 Signed-off-by: Edward Chron --- mm/Kconfig.debug| 16 mm/oom_kill_debug.c

[PATCH 09/10] mm/oom_debug: Add Enhanced Slab Print Information

2019-08-26 Thread Edward Chron
alObj ObjSize AlignSize Objs/Slab Pgs/Slab ActiveSlab TotalSlab Slab_Name Aug 13 18:52:47 mysrvr kernel: 403412 1613 1648 224 256161103103 skbuff_head.. Signed-off-by: Edward Chron --- mm/Kconfig.debug| 15 ++

[PATCH 07/10] mm/oom_debug: Add Select Process Entries Print

2019-08-26 Thread Edward Chron
output (minsize = 0.1% of totalpages): Aug 13 20:16:30 yourserver kernel: Summary: OOM Tasks considered:245 printed:33 minimum size:32576kB total-pages:32579084kB Signed-off-by: Edward Chron --- include/linux/oom.h | 1 + mm/Kconfig.debug| 34 ++ mm/oom_ki
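The figures in the summary line above are consistent with a threshold computed in whole pages rather than in kB. A minimal sketch of that arithmetic follows, assuming 4 kB pages and integer division; neither assumption is stated in the snippet, and the function and parameter names are hypothetical, not taken from the patch.

```python
PAGE_KB = 4  # assumption: 4 kB pages, not stated in the snippet


def min_entry_size_kb(total_kb: int, tenths_percent: int, page_kb: int = PAGE_KB) -> int:
    """Return the print threshold in kB for a limit given in tenths of a percent.

    Hypothetical helper: converts the total to pages, applies the limit
    with integer division, and converts back to kB.
    """
    total_pages = total_kb // page_kb
    min_pages = total_pages * tenths_percent // 1000
    return min_pages * page_kb


# total-pages:32579084kB with a limit of 1 tenth of a percent (0.1%)
print(min_entry_size_kb(32579084, 1))  # prints 32576
```

Under these assumptions the result matches the "minimum size:32576kB" value in the sample output; a plain 0.1% of 32579084 kB would give about 32579 kB instead.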

[PATCH 10/10] mm/oom_debug: Add Enhanced Process Print Information

2019-08-26 Thread Edward Chron
Aug 6 09:37:21 egc103 kernel: [ 7707]7553 10383 10383 7707 S 0.132 0.350 1056804 1054040 1052796 2092 0 0 1944 684 1052860 136 4 0 0 0 0 0 1000 oomprocs Signed-off-by: E

[PATCH 06/10] mm/oom_debug: Add Select Vmalloc Entries Print

2019-08-26 Thread Edward Chron
print line output: Jul 22 20:16:09 yoursystem kernel: Vmalloc size=2625536 pages=640 caller=__do_sys_swapon+0x78e/0x1130 Sample summary print line output: Jul 22 19:03:26 yoursystem kernel: Summary: Vmalloc entries examined:1070 printed:989 minsize:0kB Signed-off-by: Edward Chron --- in

[PATCH 04/10] mm/oom_debug: Add ARP and ND Table Summary usage

2019-08-26 Thread Edward Chron
: 368 entries: 6 lastFlush: 1720s hGrows: 0 allocs: 7 destroys: 1 lookups: 0 hits: 0 resFailed: 0 gcRuns/Forced: 110 / 0 tblFull: 0 proxyQlen: 0 Signed-off-by: Edward Chron Cc: "David S. Miller" Cc: net...@vger.kernel.org --- include/net/neighbour.h |

[PATCH 08/10] mm/oom_debug: Add Slab Select Always Print Enable

2019-08-26 Thread Edward Chron
s set to enabled. Sample Output - There is no change to the standard OOM output with this option other than that the standard Linux OOM report Unreclaimable info is output for every OOM Event, not just OOM Events where slab usage exceeds user process memory usage. Signed-off-by: Edward

[PATCH 05/10] mm/oom_debug: Add Select Slabs Print

2019-08-26 Thread Edward Chron
23 23:26:34 yoursystem kernel: Slabs Total: 151212kB Reclaim: 50632kB Unreclaim: 100580kB Signed-off-by: Edward Chron --- mm/Kconfig.debug| 30 + mm/oom_kill.c | 11 +++- mm/oom_kill_debug.c | 42 + mm/oom_kill_debug.h | 4 +++ m

[PATCH 00/10] OOM Debug print selection and additional information

2019-08-26 Thread Edward Chron
ize value in the appropriate tenthpercent file as needed. --------- Edward Chron (10): mm/oom_debug: Add Debug base code mm/oom_debug: Add System State Summary mm/oom_debug: Add Tasks Summary mm/oom_debug: Add ARP and ND Table Summary usage mm/oom_debug: A

[PATCH 01/10] mm/oom_debug: Add Debug base code

2019-08-26 Thread Edward Chron
and enabled. By default each configured Select Print OOM debug option has a default print limiting minimum entry size of 10 or 1% of memory. - Signed-off-by: Edward Chron --- mm/Kconfig.debug| 17 +++ mm/Makefile

[PATCH 02/10] mm/oom_debug: Add System State Summary

2019-08-26 Thread Edward Chron
mmary message: Jul 27 10:56:46 yoursystem kernel: System Uptime:0 days 00:17:27 CPUs:4 Machine:x86_64 Node:yoursystem Domain:localdomain Kernel Release:5.3.0-rc2+ Version: #49 SMP Mon Jul 27 10:35:32 PDT 2019 Signed-off-by: Edward Chron --- mm/Kconfig.debug| 15 + mm/oom_kill_debug.c

[PATCH] mm/oom: Add oom_score_adj and pgtables to Killed process message

2019-08-22 Thread Edward Chron
process was correctly targeted by OOM due to the misconfiguration. This can be quite helpful for triage and problem determination. The addition of the pgtables_bytes shows page table usage by the process and is a useful measure of the memory size of the process. Signed-off-by: Edward Chron Acked-by

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-22 Thread Edward Chron
On Wed, Aug 21, 2019 at 12:19 AM David Rientjes wrote: > > On Wed, 21 Aug 2019, Michal Hocko wrote: > > > > vm.oom_dump_tasks is pretty useful, however, so it's curious why you > > > haven't left it enabled :/ > > > > Because it generates a lot of output potentially. Think of a workload > > with t

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-22 Thread Edward Chron
On Thu, Aug 22, 2019 at 12:09 AM Michal Hocko wrote: > > On Wed 21-08-19 15:25:13, Edward Chron wrote: > > On Tue, Aug 20, 2019 at 8:25 PM David Rientjes wrote: > > > > > > On Tue, 20 Aug 2019, Edward Chron wrote: > > > > > > > For an OOM

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-22 Thread Edward Chron
On Thu, Aug 22, 2019 at 12:21 AM Michal Hocko wrote: > > On Wed 21-08-19 16:12:08, Edward Chron wrote: > [...] > > Additionally (which you know, but mentioning for reference) the OOM > > output used to look like this: > > > > Nov 14 15:23:48 oldserver kernel: [3

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-22 Thread Edward Chron
On Thu, Aug 22, 2019 at 12:15 AM Michal Hocko wrote: > > On Wed 21-08-19 15:22:07, Edward Chron wrote: > > On Wed, Aug 21, 2019 at 12:19 AM David Rientjes wrote: > > > > > > On Wed, 21 Aug 2019, Michal Hocko wrote: > > > > > > > > vm.oom_dum

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-21 Thread Edward Chron
certainly reassuring. My understanding now is that printing the oom_score is discouraged. This seems unfortunate. The oom_score_adj can be adjusted appropriately if oom_score is known. So it would be useful to have both. But at least if oom_score_adj is printed you can confirm the value at

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-21 Thread Edward Chron
On Tue, Aug 20, 2019 at 8:25 PM David Rientjes wrote: > > On Tue, 20 Aug 2019, Edward Chron wrote: > > > For an OOM event: print oom_score_adj value for the OOM Killed process to > > document what the oom score adjust value was at the time the process was > > OOM Kille

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-21 Thread Edward Chron
enabled. I don't see why that is a big deal. It is very useful to have all the information that is there. Wouldn't mind also having pgtables too but we would be able to get that from the output of dump_task if that is enabled. If it is acceptable to also add the dump_task for the killed process for !sysctl_oom_dump_tasks I can repost the patch including that as well. Thank-you, Edward Chron Arista Networks

Re: [PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-21 Thread Edward Chron
e and if you prefer a fresh submission, let me know and I'll do that. Thank-you for reviewing this patch. -Edward Chron Arista Networks On Tue, Aug 20, 2019 at 8:25 PM David Rientjes wrote: > > On Tue, 20 Aug 2019, Edward Chron wrote: > > > For an OOM event: print oom_sc

[PATCH] mm/oom: Add oom_score_adj value to oom Killed process message

2019-08-20 Thread Edward Chron
targeted by OOM due to the misconfiguration. Having the oom_score_adj on the Killed message ensures that it is documented. Signed-off-by: Edward Chron Acked-by: Michal Hocko --- mm/oom_kill.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index

Re: [PATCH] mm/oom: Add killed process selection information

2019-08-14 Thread Edward Chron
On Mon, Aug 12, 2019 at 4:42 AM Michal Hocko wrote: > > On Fri 09-08-19 15:15:18, Edward Chron wrote: > [...] > > So it is optimal if you only have to go and find the correct log and search > > or run your script(s) when you absolutely need to, not on every OOM even

[PATCH] mm/oom: Add killed process selection information

2019-08-14 Thread Edward Chron
output: Aug 14 23:00:02 testserver kernel: Out of memory: Killed process 2692 (oomprocs) total-vm:1056800kB, anon-rss:1052760kB, file-rss:4kB, shmem-rss:0kB oom_score_adj:1000 Signed-off-by: Edward Chron --- mm/oom_kill.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a

Re: [PATCH] mm/oom: Add killed process selection information

2019-08-09 Thread Edward Chron
Sorry about top posting, responses inline. On Thu, Aug 8, 2019 at 11:40 PM Michal Hocko wrote: > > [Again, please do not top post - it makes a mess of any longer > discussion] > > On Thu 08-08-19 15:15:12, Edward Chron wrote: > > In our experience far more (99.9%+) OOM

Re: [PATCH] mm/oom: Add killed process selection information

2019-08-08 Thread Edward Chron
PM Michal Hocko wrote: > > [please do not top-post] > > On Thu 08-08-19 12:21:30, Edward Chron wrote: > > It is helpful to the admin that looks at the kill message and records this > > information. OOMs can come in bunches. > > Knowing how much resource the oom selecte

[PATCH] mm/oom: Add killed process selection information

2019-08-08 Thread Edward Chron
output: Jul 21 20:07:48 yoursystem kernel: Out of memory: Killed process 2826 (processname) total-vm:1056800kB, anon-rss:1052784kB, file-rss:4kB, shmem-rss:0kB memory-usage:3.2% oom_score:1032 oom_score_adj:1000 total-pages: 32791748kB Signed-off-by: Edward Chron --- fs/proc/base.c | 2
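The thread discusses finding these messages in logs with scripts. A minimal sketch of extracting the fields from a "Killed process" line is shown below, assuming the `key:value` layout seen in the sample output above. The regular expressions and the `parse_killed_line` name are hypothetical, and fields such as `memory-usage`, `oom_score`, and `total-pages` appear only in this proposed patch's sample, so other kernels may not print them.

```python
import re

# Assumed layout, taken from the sample message: "Killed process <pid> (<comm>)"
KILLED_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<comm>[^)]*)\)")
# key:value pairs, optionally followed by a kB or % unit
FIELD_RE = re.compile(r"(?P<key>[\w-]+):\s*(?P<value>-?\d+(?:\.\d+)?)(?:kB|%)?")


def parse_killed_line(line):
    """Return a dict of fields from an OOM 'Killed process' line, or None."""
    m = KILLED_RE.search(line)
    if m is None:
        return None
    info = {"pid": int(m.group("pid")), "comm": m.group("comm")}
    # Only scan the text after the process name, to skip the timestamp.
    for f in FIELD_RE.finditer(line[m.end():]):
        text = f.group("value")
        info[f.group("key")] = float(text) if "." in text else int(text)
    return info


sample = (
    "Jul 21 20:07:48 yoursystem kernel: Out of memory: Killed process 2826 "
    "(processname) total-vm:1056800kB, anon-rss:1052784kB, file-rss:4kB, "
    "shmem-rss:0kB memory-usage:3.2% oom_score:1032 oom_score_adj:1000 "
    "total-pages: 32791748kB"
)
info = parse_killed_line(sample)
print(info["pid"], info["comm"], info["oom_score_adj"])
```

Units are dropped in this sketch, so size values are plain integers in kB and `memory-usage` is a float percentage.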