> On Feb 28, 2017, at 10:15 AM, Michal Hocko <mho...@kernel.org> wrote: > > On Tue 28-02-17 09:59:35, Robert Kudyba wrote: >> >>> On Feb 28, 2017, at 9:40 AM, Michal Hocko <mho...@kernel.org> wrote: >>> >>> On Tue 28-02-17 09:33:49, Robert Kudyba wrote: >>>> >>>>> On Feb 28, 2017, at 9:15 AM, Michal Hocko <mho...@kernel.org> wrote: >>>>> and this one is hitting the min watermark while there is not really >>>>> much to reclaim. Only the page cache which might be pinned and not >>>>> reclaimable from this context because this is GFP_NOFS request. It is >>>>> not all that surprising the reclaim context fights to get some memory. >>>>> There is a huge amount of the reclaimable slab which probably just makes >>>>> a slow progress. >>>>> >>>>> That is not something completely surprsing on 32b system I am afraid. >>>>> >>>>> Btw. is the stall repeating with the increased time or it gets resolved >>>>> eventually? >>>> >>>> Yes and if you mean by repeating it’s not only affecting rsync but >>>> you can see just now automount and NetworkManager get these page >>>> allocation stalls and kswapd0 is getting heavy CPU load, are there any >>>> other settings I can adjust? >>> >>> None that I am aware of. You might want to talk to FS guys, maybe they >>> can figure out who is pinning file pages so that they cannot be >>> reclaimed. They do not seem to be dirty or under writeback. It would be >>> also interesting to see whether that is a regression. The warning is >>> relatively new so you might have had this problem before just haven't >>> noticed it. >> >> We have been getting out of memory errors for a while but those seem >> to have gone away. > > this sounds suspicious. Are you really sure that this is a new problem? > Btw. is there any reason to use 32b kernel at all? It will always suffer > from a really small lowmem…
No this has been a problem for a while. Not sure if this server can handle 64b it’s a bit old. > >> We did just replace the controller in the VessRAID >> as there were some timeouts observed and multiple login/logout >> attempts. >> >> By FS guys do you mean the linux-fsdevel or linux-fsf list? > > yeah linux-fsdevel. No idea what linux-fsf is. It would be great if you > could collect some tracepoints before reporting the issue. At least > those in events/vmscan/*. Will do here’s a perf report: Children Self Command Shared Object Symbol ◆ 63.54% 0.06% swapper [kernel.kallsyms] [k] cpu_startup_entry 62.07% 0.01% swapper [kernel.kallsyms] [k] default_idle_call 62.06% 0.01% swapper [kernel.kallsyms] [k] arch_cpu_idle 54.73% 54.73% swapper [kernel.kallsyms] [k] mwait_idle 45.22% 0.00% swapper [kernel.kallsyms] [k] start_secondary 18.35% 0.00% swapper [kernel.kallsyms] [k] rest_init 18.35% 0.00% swapper [kernel.kallsyms] [k] start_kernel 18.35% 0.00% swapper [kernel.kallsyms] [k] i386_start_kernel 13.97% 0.01% rsync [vdso] [.] __kernel_vsyscall 13.93% 0.01% rsync [kernel.kallsyms] [k] sysenter_past_esp 13.92% 0.02% rsync [kernel.kallsyms] [k] do_fast_syscall_32 8.58% 0.01% rsync [kernel.kallsyms] [k] sys_write 8.56% 0.01% rsync [kernel.kallsyms] [k] vfs_write 8.55% 0.01% rsync [kernel.kallsyms] [k] __vfs_write 8.55% 0.00% rsync [kernel.kallsyms] [k] ext4_file_write_iter 8.54% 0.00% rsync [kernel.kallsyms] [k] __generic_file_write_iter 8.36% 0.04% rsync [kernel.kallsyms] [k] generic_perform_write 6.96% 0.00% swapper [kernel.kallsyms] [k] irq_exit 6.81% 0.01% swapper [kernel.kallsyms] [k] do_softirq_own_stack 6.80% 0.01% swapper [kernel.kallsyms] [k] __do_softirq 6.30% 0.00% swapper [kernel.kallsyms] [k] do_IRQ 6.30% 0.00% swapper [kernel.kallsyms] [k] common_interrupt 5.76% 0.01% swapper [kernel.kallsyms] [k] net_rx_action 5.75% 0.10% swapper [kernel.kallsyms] [k] e1000_clean 5.33% 0.01% rsync [kernel.kallsyms] [k] sys_read 5.30% 0.00% rsync [kernel.kallsyms] [k] vfs_read 5.28% 0.02% rsync [kernel.kallsyms] [k] __vfs_read 5.23% 0.10% rsync [kernel.kallsyms] [k] generic_file_read_iter 5.22% 0.00% swapper [kernel.kallsyms] [k] netif_receive_skb_internal 5.21% 0.01% swapper [kernel.kallsyms] [k] __netif_receive_skb 5.20% 0.01% swapper [kernel.kallsyms] [k] __netif_receive_skb_core 5.19% 0.02% swapper [kernel.kallsyms] [k] ip_rcv 4.93% 0.01% swapper [kernel.kallsyms] [k] ip_rcv_finish 4.90% 0.01% swapper [kernel.kallsyms] [k] ip_local_deliver 4.86% 0.05% swapper [kernel.kallsyms] [k] e1000_clean_rx_irq 4.80% 0.00% swapper [kernel.kallsyms] [k] ip_local_deliver_finish 4.80% 0.01% swapper [kernel.kallsyms] [k] tcp_v4_rcv 4.73% 0.01% swapper [kernel.kallsyms] [k] tcp_v4_do_rcv