[Devel] [PATCH RHEL7 COMMIT] fuse: fuse_writepage_locked must check for FUSE_INVALIDATE_FILES (v2)

2017-01-12 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-514.vz7.27.x-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-514.vz7.27.10
-->
commit 0fe508ff27ae7efe0376c944ae7a63a4e9250243
Author: Maxim Patlasov
Date: Thu Jan 12 19:26:55 2017 +0400
fuse: fuse_writepag

[Devel] [PATCH rh7 v2 08/21] ms/mm: memcontrol: rearrange charging fast path

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner The charging path currently starts out with OOM condition checks when OOM is the rarest possible case. Rearrange this code to run OOM/task dying checks only after trying the percpu charge and the res_counter charge and bail out before entering reclaim. Attempting a charge
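The ordering described in this snippet can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical, the "stock" stands in for the percpu charge and the plain counter for the res_counter, and none of it is the actual kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel API. */
struct toy_memcg {
    size_t stock;  /* pages cached ahead of time (stands in for the percpu stock) */
    size_t usage;  /* pages currently charged (stands in for the res_counter) */
    size_t limit;  /* hard limit on usage */
};

enum toy_result { TOY_CHARGED, TOY_BYPASS, TOY_NEED_RECLAIM };

/* Fast path 1: take the pages from the cached stock, if it is large enough. */
static bool toy_consume_stock(struct toy_memcg *cg, size_t nr)
{
    if (cg->stock < nr)
        return false;
    cg->stock -= nr;
    return true;
}

/* Fast path 2: charge the counter directly, if the limit allows it. */
static bool toy_counter_charge(struct toy_memcg *cg, size_t nr)
{
    if (cg->usage + nr > cg->limit)
        return false;
    cg->usage += nr;
    return true;
}

/*
 * Both cheap attempts run first. The rare "task is dying" check is only
 * evaluated once they have failed, and it bails out before any reclaim.
 */
static enum toy_result toy_try_charge(struct toy_memcg *cg, size_t nr,
                                      bool task_dying)
{
    assert(cg != NULL);

    if (toy_consume_stock(cg, nr))
        return TOY_CHARGED;
    if (toy_counter_charge(cg, nr))
        return TOY_CHARGED;
    if (task_dying)
        return TOY_BYPASS;
    return TOY_NEED_RECLAIM;
}
```

In this model a dying task still gets charged through either fast path; the bypass only triggers when the caller would otherwise have to enter reclaim.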

[Devel] [PATCH rh7 v2 04/21] ms/memcg: get_mem_cgroup_from_mm()

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Instead of returning NULL from try_get_mem_cgroup_from_mm() when the mm owner is exiting, just return root_mem_cgroup. This makes sense for all callsites and gets rid of some of them having to fallback manually. [fengguang...@intel.com: fix warnings] Signed-off-by: Johanne
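The change in calling convention described here can be sketched in user-space C. The types and `toy_*` names below are hypothetical stand-ins, not the kernel's structures; the sketch only contrasts a lookup that may return NULL with one that falls back to a root group.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real structures. */
struct toy_memcg { const char *name; };
struct toy_task  { bool exiting; struct toy_memcg *memcg; };
struct toy_mm    { struct toy_task *owner; };

static struct toy_memcg toy_root_memcg = { "root" };

/* Old protocol: returns NULL when the mm has no owner or the owner is exiting. */
static struct toy_memcg *toy_try_get_memcg_from_mm(const struct toy_mm *mm)
{
    assert(mm != NULL);

    if (mm->owner == NULL || mm->owner->exiting)
        return NULL;
    return mm->owner->memcg;
}

/* New protocol: never returns NULL, callers get the root group instead. */
static struct toy_memcg *toy_get_memcg_from_mm(const struct toy_mm *mm)
{
    struct toy_memcg *cg = toy_try_get_memcg_from_mm(mm);

    return cg ? cg : &toy_root_memcg;
}
```

With the second form, a caller no longer needs its own "if NULL, use root" branch, which is the manual fallback the snippet says some callsites had to carry.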

Re: [Devel] [PATCH vz7] fuse: fuse_writepage_locked must check for FUSE_INVALIDATE_FILES (v2)

2017-01-12 Thread Dmitry Monakhov
Maxim Patlasov writes:
> The patch fixes another race dealing with fuse_invalidate_files,
> this time when it races with truncate(2):
>
> Thread A: the flusher performs writeback as usual:
>
> fuse_writepages -->
> fuse_send_writepages -->
> end_page_writeback
>
> but before fuse_send

[Devel] [PATCH rh7 v2 18/21] ms/mm: memcontrol: use page lists for uncharge batching

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Pages are now uncharged at release time, and all sources of batched uncharges operate on lists of pages. Directly use those lists, and get rid of the per-task batching state. This also batches statistics accounting, in addition to the res counter charges, to reduce IRQ-dis

[Devel] [PATCH rh7 v2 16/21] ms/mm: memcontrol: rewrite charge API

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner These patches rework memcg charge lifetime to integrate more naturally with the lifetime of user pages. This drastically simplifies the code and reduces charging and uncharging overhead. The most expensive part of charging and uncharging is the page_cgroup bit spinlock, wh

[Devel] [PATCH rh7 v2 19/21] ms/mm: memcontrol: revert use of root_mem_cgroup res_counter

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Dave Hansen reports a massive scalability regression in an uncontained page fault benchmark with more than 30 concurrent threads, which he bisected down to 05b843012335 ("mm: memcontrol: use root_mem_cgroup res_counter") and pin-pointed on res_counter spinlock contention. T

[Devel] [PATCH rh7 v2 17/21] ms/mm: memcontrol: rewrite uncharge API

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner The memcg uncharging code that is involved towards the end of a page's lifetime - truncation, reclaim, swapout, migration - is impressively complicated and fragile. Because anonymous and file pages were always charged before they had their page->mapping established, uncharg

[Devel] [PATCH rh7 v2 06/21] ms/memcg: sanitize __mem_cgroup_try_charge() call protocol

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Some callsites pass a memcg directly, some callsites pass an mm that then has to be translated to a memcg. This makes for a terrible function interface. Just push the mm-to-memcg translation into the respective callsites and always pass a memcg to mem_cgroup_try_charge().

[Devel] [PATCH rh7 v2 20/21] ms/mm: memcontrol: teach uncharge_list to deal with kmem pages

2017-01-12 Thread Andrey Ryabinin
From: Vladimir Davydov Page table pages are batched-freed in release_pages on most architectures. If we want to charge them to kmemcg (this is what is done later in this series), we need to teach mem_cgroup_uncharge_list to handle kmem pages. Link: http://lkml.kernel.org/r/18d5c09e97f80074ed25

[Devel] [PATCH rh7 v2 09/21] ms/mm: memcontrol: reclaim at least once for __GFP_NORETRY

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Currently, __GFP_NORETRY tries charging once and gives up before even trying to reclaim. Bring the behavior on par with the page allocator and reclaim at least once before giving up. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Cc: Hugh Dickins Cc: Tejun Heo C
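The behavior this snippet asks for, one reclaim pass even for a no-retry charge, can be modeled as follows. This is a sketch only: the `toy_*` names, the `noretry` flag (standing in for `__GFP_NORETRY`), and the retry constant are assumptions made for illustration and do not reflect the kernel's actual code or values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RECLAIM_ATTEMPTS 5  /* illustrative value, not the kernel's */

/* Hypothetical user-space model of a limited counter with reclaimable pages. */
struct toy_memcg {
    size_t usage;          /* pages currently charged */
    size_t limit;          /* hard limit on usage */
    size_t reclaimable;    /* charged pages that reclaim could free */
    int    reclaim_calls;  /* how many reclaim passes were run */
};

static bool toy_counter_charge(struct toy_memcg *cg, size_t nr)
{
    if (cg->usage + nr > cg->limit)
        return false;
    cg->usage += nr;
    return true;
}

/* Free up to nr reclaimable pages and record that a pass happened. */
static void toy_reclaim(struct toy_memcg *cg, size_t nr)
{
    size_t freed = nr < cg->reclaimable ? nr : cg->reclaimable;

    if (freed > cg->usage)
        freed = cg->usage;
    cg->reclaimable -= freed;
    cg->usage -= freed;
    cg->reclaim_calls++;
}

/*
 * A no-retry charge gets exactly one reclaim pass before giving up;
 * other charges get TOY_MAX_RECLAIM_ATTEMPTS passes.
 */
static bool toy_try_charge(struct toy_memcg *cg, size_t nr, bool noretry)
{
    int attempts_left = noretry ? 1 : TOY_MAX_RECLAIM_ATTEMPTS;

    assert(cg != NULL);

    for (;;) {
        if (toy_counter_charge(cg, nr))
            return true;
        if (attempts_left-- == 0)
            return false;
        toy_reclaim(cg, nr);
    }
}
```

Under the older behavior described in the snippet, the no-retry case would correspond to starting with zero attempts, so a charge that a single reclaim pass could have satisfied would fail immediately.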

[Devel] [PATCH rh7 v2 13/21] ms/mm: memcontrol: simplify move precharge function

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner The move precharge function does some baroque things: it tries raw res_counter charging of the entire amount first, and then falls back to a loop of one-by-one charges, with checks for pending signals and cond_resched() batching. Just use mem_cgroup_try_charge() without __G

[Devel] [PATCH rh7 v2 14/21] ms/mm: memcontrol: catch root bypass in move precharge

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner When mem_cgroup_try_charge() returns -EINTR, it bypassed the charge to the root memcg. But move precharging does not catch this and treats this case as if no charge had happened, thus leaking a charge against root. Because of an old optimization, the root memcg's res_count

[Devel] [PATCH rh7 v2 15/21] ms/mm: memcontrol: remove ordering between pc->mem_cgroup and PageCgroupUsed

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner There is a write barrier between setting pc->mem_cgroup and PageCgroupUsed, which was added to allow LRU operations to lookup the memcg LRU list of a page without acquiring the page_cgroup lock. But ever since commit 38c5d72f3ebe ("memcg: simplify LRU handling by new rule")

[Devel] [PATCH rh7 v2 21/21] ms/mm: memcontrol: only mark charged pages with PageKmemcg

2017-01-12 Thread Andrey Ryabinin
From: Vladimir Davydov To distinguish non-slab pages charged to kmemcg we mark them PageKmemcg, which sets page->_mapcount to -512. Currently, we set/clear PageKmemcg in __alloc_pages_nodemask()/free_pages_prepare() for any page allocated with __GFP_ACCOUNT, including those that aren't actually

[Devel] [PATCH rh7 v2 12/21] ms/mm: memcontrol: remove explicit OOM parameter in charge path

2017-01-12 Thread Andrey Ryabinin
From: Michal Hocko For the page allocator, __GFP_NORETRY implies that no OOM should be triggered, whereas memcg has an explicit parameter to disable OOM. The only callsites that want OOM disabled are THP charges and charge moving. THP already uses __GFP_NORETRY and charge moving can use it as w

[Devel] [PATCH rh7 v2 10/21] ms/mm: huge_memory: use GFP_TRANSHUGE when charging huge pages

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Transparent huge page charges prefer falling back to regular pages rather than spending a lot of time in direct reclaim. Desired reclaim behavior is usually declared in the gfp mask, but THP charges use GFP_KERNEL and then rely on the fact that OOM is disabled for THP charg

[Devel] [PATCH rh7 v2 11/21] ms/mm: memcontrol: retry reclaim for oom-disabled and __GFP_NOFAIL charges

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner There is no reason why oom-disabled and __GFP_NOFAIL charges should try to reclaim only once when every other charge tries several times before giving up. Make them all retry the same number of times. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Cc: Hugh Dickins

[Devel] [PATCH rh7 v2 07/21] ms/mm: memcontrol: fold mem_cgroup_do_charge()

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner These patches rework memcg charge lifetime to integrate more naturally with the lifetime of user pages. This drastically simplifies the code and reduces charging and uncharging overhead. The most expensive part of charging and uncharging is the page_cgroup bit spinlock, wh

[Devel] [PATCH rh7 v2 03/21] ms/memcg: remove unnecessary !mm check from try_get_mem_cgroup_from_mm()

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Users pass either a mm that has been established under task lock, or use a verified current->mm, which means the task can't be exiting. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds https://jira.sw.ru/b

[Devel] [PATCH rh7 v2 05/21] ms/memcg: do not replicate get_mem_cgroup_from_mm in __mem_cgroup_try_charge

2017-01-12 Thread Andrey Ryabinin
From: Michal Hocko __mem_cgroup_try_charge duplicates get_mem_cgroup_from_mm for charges which came without a memcg. The only reason seems to be a tiny optimization when css_tryget is not called if the charge can be consumed from the stock. Nevertheless css_tryget is very cheap since it has bee

[Devel] [PATCH rh7 v2 02/21] ms/mm: memcg: push !mm handling out to page cache charge function

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner Only page cache charges can happen without an mm context, so push this special case out of the inner core and into the cache charge function. An ancient comment explains that the mm can also be NULL in case the task is currently being migrated, but that is not actually true

[Devel] [PATCH rh7 v2 01/21] ms/mm: memcg: inline mem_cgroup_charge_common()

2017-01-12 Thread Andrey Ryabinin
From: Johannes Weiner mem_cgroup_charge_common() is used by both cache and anon pages, but most of its body only applies to anon pages and the remainder is not worth having in a separate function. Signed-off-by: Johannes Weiner Acked-by: Michal Hocko Signed-off-by: Andrew Morton Signed-off-by