[Devel] [PATCH rh7 3/3] net: core: attempt a single high order allocation

2016-11-01 Thread Anatoly Stepanov
This is a port of the following upstream commit: d9b2938aabf757da2d40153489b251d4fc3fdd18 ("net: attempt a single high order allocation"). In commit ed98df3361f0 ("net: use __GFP_NORETRY for high order allocations") we tried to address one issue caused by order-3 allocations. We still observe …
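The pattern this patch describes can be sketched as a userspace analogue: one high-order attempt, then an immediate order-0 fallback instead of walking every intermediate order. `try_alloc()`, the `PAGE_SIZE` constant, and the failure flag below are illustrative stand-ins, not kernel interfaces.

```c
#include <stdlib.h>

#define PAGE_SIZE 4096
#define HIGH_ORDER 3            /* 2^3 pages = 32 KiB */

int simulate_high_order_failure; /* test hook, not kernel code */

static void *try_alloc(unsigned int order)
{
    if (order > 0 && simulate_high_order_failure)
        return NULL;            /* pretend the buddy allocator failed */
    return malloc((size_t)PAGE_SIZE << order);
}

/* Single high-order attempt; on failure, fall straight back to order 0. */
void *alloc_frag(unsigned int *order_out)
{
    void *p = try_alloc(HIGH_ORDER);
    if (p) {
        *order_out = HIGH_ORDER;
        return p;
    }
    p = try_alloc(0);           /* immediate order-0 fallback */
    *order_out = 0;
    return p;
}
```

The point of the single attempt is that a failed order-3 allocation already tells the caller the buddy allocator is fragmented, so trying order 2 and order 1 as well only adds overhead.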

[Devel] [PATCH rh7 1/3] net: core: use __GFP_NORETRY for high order allocations

2016-11-01 Thread Anatoly Stepanov
This is a backport of an upstream (vanilla) change. Original commit: ed98df3361f059db42786c830ea96e2d18b8d4db ("net: use __GFP_NORETRY for high order allocations"). sock_alloc_send_pskb() and sk_page_frag_refill() have a loop trying high-order allocations to prepare an skb with a low number of fragments, as this increases performance. The problem is that under mem…
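The loop mentioned above can be sketched in userspace: walk allocation orders from high to low until one succeeds. The patch's point is that each failed high-order step used to be allowed to enter direct reclaim, and `__GFP_NORETRY` makes those steps give up cheaply instead. `try_alloc()` and `max_ok_order` below are illustrative stand-ins for the buddy allocator, not kernel interfaces.

```c
#include <stdlib.h>

#define PAGE_SIZE 4096
#define MAX_ORDER 3             /* start at 32 KiB, as the socket code does */

int max_ok_order = MAX_ORDER;   /* test hook: orders above this "fail" */

static void *try_alloc(int order)
{
    if (order > max_ok_order)
        return NULL;
    return malloc((size_t)PAGE_SIZE << order);
}

/* Try MAX_ORDER first, then each lower order down to a single page. */
void *alloc_descending(int *order_out)
{
    for (int order = MAX_ORDER; order >= 0; order--) {
        void *p = try_alloc(order);
        if (p) {
            *order_out = order;
            return p;
        }
    }
    return NULL;
}
```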

[Devel] [PATCH rh7 2/3] net: core: use atomic high-order allocations

2016-11-01 Thread Anatoly Stepanov
As we detected intensive direct-reclaim activity in sk_page_frag_refill(), it's reasonable to prevent it from trying so hard to allocate high-order blocks: just do it when it's effortless. This is a port of an upstream (vanilla) change. Original commit: fb05e7a89f500cfc06ae277bdc911b281928995d. We saw …

[Devel] [PATCH rh7 0/3] net: core: optimize high-order allocations

2016-11-01 Thread Anatoly Stepanov
This patch set aims to improve high-order allocations in the networking subsystem by reducing the impact on the buddy allocator. This is a carbon copy of the recent patch set; I've just set the "Subject" field properly. Signed-off-by: Anatoly Stepanov Anatoly Stepanov (3): net: core: use __GFP_NORETRY for high order allocations …

[Devel] [PATCH rh7 14/21] ms/mm: memcontrol: catch root bypass in move precharge

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner When mem_cgroup_try_charge() returns -EINTR, it has bypassed the charge to the root memcg. But move precharging does not catch this and treats this case as if no charge had happened, thus leaking a charge against root. Because of an old optimization, the root memcg's res_counter …

[Devel] [PATCH rh7 10/21] ms/mm: huge_memory: use GFP_TRANSHUGE when charging huge pages

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Transparent huge page charges prefer falling back to regular pages rather than spending a lot of time in direct reclaim. Desired reclaim behavior is usually declared in the gfp mask, but THP charges use GFP_KERNEL and then rely on the fact that OOM is disabled for THP charg…

[Devel] [PATCH rh7 19/21] ms/mm: memcontrol: revert use of root_mem_cgroup res_counter

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Dave Hansen reports a massive scalability regression in an uncontained page fault benchmark with more than 30 concurrent threads, which he bisected down to 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") and pinpointed to res_counter spinlock contention. …

[Devel] [PATCH rh7 17/21] ms/mm: memcontrol: rewrite uncharge API

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner The memcg uncharging code that is involved towards the end of a page's lifetime - truncation, reclaim, swapout, migration - is impressively complicated and fragile. Because anonymous and file pages were always charged before they had their page->mapping established, uncharg…

[Devel] [PATCH rh7 18/21] ms/mm: memcontrol: use page lists for uncharge batching

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Pages are now uncharged at release time, and all sources of batched uncharges operate on lists of pages. Directly use those lists, and get rid of the per-task batching state. This also batches statistics accounting, in addition to the res counter charges, to reduce IRQ-dis…

[Devel] [PATCH rh7 16/21] ms/mm: memcontrol: rewrite charge API

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner These patches rework memcg charge lifetime to integrate more naturally with the lifetime of user pages. This drastically simplifies the code and reduces charging and uncharging overhead. The most expensive part of charging and uncharging is the page_cgroup bit spinlock, wh…

[Devel] [PATCH rh7 09/21] ms/mm: memcontrol: reclaim at least once for __GFP_NORETRY

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Currently, __GFP_NORETRY tries charging once and gives up before even trying to reclaim. Bring the behavior on par with the page allocator and reclaim at least once before giving up. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Cc: Hugh Dickins Cc: Tejun Heo …
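The policy described above can be sketched as follows: even a "no retry" charge runs reclaim once before failing, matching the page allocator. The counters and the one-unit reclaim model below are illustrative, not the kernel's memcg internals.

```c
#include <stdbool.h>

#define MAX_RECLAIM_RETRIES 5

long charge_limit;              /* free "units" available to charge */
int reclaim_calls;              /* test hook: how often reclaim ran */

static bool try_charge_once(void)
{
    if (charge_limit > 0) {
        charge_limit--;
        return true;
    }
    return false;
}

static void reclaim(void)
{
    reclaim_calls++;
    charge_limit++;             /* pretend reclaim freed one unit */
}

bool charge(bool noretry)
{
    int retries = noretry ? 1 : MAX_RECLAIM_RETRIES;

    while (retries--) {
        if (try_charge_once())
            return true;
        reclaim();              /* reclaim at least once, even for noretry */
    }
    return try_charge_once();   /* one last attempt after final reclaim */
}
```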

[Devel] [PATCH rh7 04/21] ms/memcg: get_mem_cgroup_from_mm()

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Instead of returning NULL from try_get_mem_cgroup_from_mm() when the mm owner is exiting, just return root_mem_cgroup. This makes sense for all callsites and gets rid of some of them having to fallback manually. [fengguang...@intel.com: fix warnings] Signed-off-by: Johannes Weiner …
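The interface change amounts to the following shape: return the root group instead of NULL when the owner is exiting, so no caller needs its own NULL fallback. `struct mem_group` and the exiting flag are illustrative stand-ins for the kernel's mem_cgroup and task state, not real interfaces.

```c
#include <stddef.h>

struct mem_group { const char *name; };

struct mem_group root_group = { "root" };

struct mem_group *group_from_mm(struct mem_group *owner, int owner_exiting)
{
    if (!owner || owner_exiting)
        return &root_group;     /* never NULL: callers may use it directly */
    return owner;
}
```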

[Devel] [PATCH rh7 15/21] ms/mm: memcontrol: remove ordering between pc->mem_cgroup and PageCgroupUsed

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner There is a write barrier between setting pc->mem_cgroup and PageCgroupUsed, which was added to allow LRU operations to lookup the memcg LRU list of a page without acquiring the page_cgroup lock. But ever since commit 38c5d72f3ebe ("memcg: simplify LRU handling by new rule") …

[Devel] [PATCH rh7 21/21] ms/mm: memcontrol: only mark charged pages with PageKmemcg

2016-11-01 Thread Andrey Ryabinin
From: Vladimir Davydov To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg, which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated with __GFP_ACCOUNT, including those that aren't actually …

[Devel] [PATCH rh7 11/21] ms/mm: memcontrol: retry reclaim for oom-disabled and __GFP_NOFAIL charges

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner There is no reason why oom-disabled and __GFP_NOFAIL charges should try to reclaim only once when every other charge tries several times before giving up. Make them all retry the same number of times. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Cc: Hugh Dickins …

[Devel] [PATCH rh7 20/21] ms/mm: memcontrol: teach uncharge_list to deal with kmem pages

2016-11-01 Thread Andrey Ryabinin
From: Vladimir Davydov Page table pages are batched-freed in release_pages on most architectures. If we want to charge them to kmemcg (this is what is done later in this series), we need to teach mem_cgroup_uncharge_list to handle kmem pages. Link: http://lkml.kernel.org/r/18d5c09e97f80074ed25…

[Devel] [PATCH rh7 07/21] ms/mm: memcontrol: fold mem_cgroup_do_charge()

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner These patches rework memcg charge lifetime to integrate more naturally with the lifetime of user pages. This drastically simplifies the code and reduces charging and uncharging overhead. The most expensive part of charging and uncharging is the page_cgroup bit spinlock, wh…

[Devel] [PATCH rh7 06/21] ms/memcg: sanitize __mem_cgroup_try_charge() call protocol

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Some callsites pass a memcg directly, some callsites pass an mm that then has to be translated to a memcg. This makes for a terrible function interface. Just push the mm-to-memcg translation into the respective callsites and always pass a memcg to mem_cgroup_try_charge(). …

[Devel] [PATCH rh7 13/21] ms/mm: memcontrol: simplify move precharge function

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner The move precharge function does some baroque things: it tries raw res_counter charging of the entire amount first, and then falls back to a loop of one-by-one charges, with checks for pending signals and cond_resched() batching. Just use mem_cgroup_try_charge() without __G…

[Devel] [PATCH rh7 08/21] ms/mm: memcontrol: rearrange charging fast path

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner The charging path currently starts out with OOM condition checks when OOM is the rarest possible case. Rearrange this code to run OOM/task dying checks only after trying the percpu charge and the res_counter charge and bail out before entering reclaim. Attempting a charge …
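The reordering described above can be sketched as: consume the cheap percpu stock first, then the shared counter, and only look at the rare OOM/dying state right before reclaim would be entered. The counters and names below are illustrative, not the kernel's percpu stock or res_counter.

```c
#include <stdbool.h>

enum charge_result { CHARGED, BYPASSED, NEED_RECLAIM };

long percpu_stock;              /* pages cached for this CPU */
long counter_free;              /* pages left under the group limit */

enum charge_result try_charge_fast(long pages, bool task_dying)
{
    if (percpu_stock >= pages) {        /* cheapest path, no locking */
        percpu_stock -= pages;
        return CHARGED;
    }
    if (counter_free >= pages) {        /* shared counter next */
        counter_free -= pages;
        return CHARGED;
    }
    if (task_dying)                     /* rare case, checked last: */
        return BYPASSED;                /* bail out before reclaim */
    return NEED_RECLAIM;
}
```

Note that a dying task whose charge fits in the stock or counter never reaches the rare-case check at all, which is exactly the fast-path win.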

[Devel] [PATCH rh7 12/21] ms/mm: memcontrol: remove explicit OOM parameter in charge path

2016-11-01 Thread Andrey Ryabinin
From: Michal Hocko For the page allocator, __GFP_NORETRY implies that no OOM should be triggered, whereas memcg has an explicit parameter to disable OOM. The only callsites that want OOM disabled are THP charges and charge moving. THP already uses __GFP_NORETRY and charge moving can use it as w…
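The change boils down to deriving "may this charge trigger OOM?" from the gfp mask, as the page allocator does, instead of threading a separate oom parameter through the call chain. The flag value below is illustrative, not the kernel's real __GFP_NORETRY bit.

```c
#include <stdbool.h>

#define GFP_NORETRY 0x1000u     /* illustrative bit, not the real value */

bool may_oom(unsigned int gfp_mask)
{
    /* __GFP_NORETRY callers (THP, charge moving) get OOM disabled. */
    return !(gfp_mask & GFP_NORETRY);
}
```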

[Devel] [PATCH rh7 01/21] ms/mm: memcg: inline mem_cgroup_charge_common()

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner mem_cgroup_charge_common() is used by both cache and anon pages, but most of its body only applies to anon pages and the remainder is not worth having in a separate function. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: …

[Devel] [PATCH rh7 05/21] ms/memcg: do not replicate get_mem_cgroup_from_mm in __mem_cgroup_try_charge

2016-11-01 Thread Andrey Ryabinin
From: Michal Hocko __mem_cgroup_try_charge duplicates get_mem_cgroup_from_mm for charges which came without a memcg. The only reason seems to be a tiny optimization when css_tryget is not called if the charge can be consumed from the stock. Nevertheless css_tryget is very cheap since it has bee…

[Devel] [PATCH rh7 02/21] ms/mm: memcg: push !mm handling out to page cache charge function

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Only page cache charges can happen without an mm context, so push this special case out of the inner core and into the cache charge function. An ancient comment explains that the mm can also be NULL in case the task is currently being migrated, but that is not actually true …

[Devel] [PATCH rh7 03/21] ms/memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()

2016-11-01 Thread Andrey Ryabinin
From: Johannes Weiner Users pass either a mm that has been established under task lock, or use a verified current->mm, which means the task can't be exiting. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds https://jira.sw.ru/b…