Re: [Devel] [PATCH v2] tcache: Repeat invalidation in tcache_invalidate_node_pages()

2017-12-01 Thread Andrey Ryabinin


On 12/01/2017 06:02 PM, Kirill Tkhai wrote:
> When there are more than 2 users of a page, __tcache_page_tree_delete()
> fails to freeze it. We skip it and never try to freeze it again.
> 
> In this case the page remains not invalidated, and tcache_node->nr_pages
> is never decremented. Later we hit a WARN_ON() reporting this.
> 
> tcache_shrink_scan()                    tcache_destroy_pool()
>   tcache_lru_isolate()
>     tcache_grab_pool()
>     ...
>     page_cache_get_speculative() --> cnt == 2
>     ...
>     tcache_put_pool() --> pool cnt zero
>                                         ...
>                                         wait_for_completion(&pool->completion);
>                                         tcache_reclaim_pages()
>                                           tcache_invalidate_node_pages()
>   __tcache_reclaim_page()                   tcache_lookup()
>                                               page_cache_get_speculative() --> cnt == 3
>                                             __tcache_page_tree_delete()
>                                               page_ref_freeze(2) --> fail
>                                               page_ref_freeze(2) --> fail
> 
> The patch fixes the problem: in case we fail to invalidate a page,
> we remember this and return to such pages after the others are invalidated.
> 
> https://jira.sw.ru/browse/PSBM-78354
> 
> v2: Also fix tcache_detach_page()
> 
> Signed-off-by: Kirill Tkhai 
> ---

Acked-by: Andrey Ryabinin 
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


[Devel] [PATCH v2] tcache: Repeat invalidation in tcache_invalidate_node_pages()

2017-12-01 Thread Kirill Tkhai
When there are more than 2 users of a page, __tcache_page_tree_delete()
fails to freeze it. We skip it and never try to freeze it again.

In this case the page remains not invalidated, and tcache_node->nr_pages
is never decremented. Later we hit a WARN_ON() reporting this.

tcache_shrink_scan()                    tcache_destroy_pool()
  tcache_lru_isolate()
    tcache_grab_pool()
    ...
    page_cache_get_speculative() --> cnt == 2
    ...
    tcache_put_pool() --> pool cnt zero
                                        ...
                                        wait_for_completion(&pool->completion);
                                        tcache_reclaim_pages()
                                          tcache_invalidate_node_pages()
  __tcache_reclaim_page()                   tcache_lookup()
                                              page_cache_get_speculative() --> cnt == 3
                                            __tcache_page_tree_delete()
                                              page_ref_freeze(2) --> fail
                                              page_ref_freeze(2) --> fail

The patch fixes the problem: in case we fail to invalidate a page,
we remember this and return to such pages after the others are invalidated.

https://jira.sw.ru/browse/PSBM-78354

v2: Also fix tcache_detach_page()

Signed-off-by: Kirill Tkhai 
---
 mm/tcache.c |   21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/mm/tcache.c b/mm/tcache.c
index d1a2c53e11a3..760e417d491b 100644
--- a/mm/tcache.c
+++ b/mm/tcache.c
@@ -850,6 +850,14 @@ static struct page *tcache_detach_page(struct tcache_node *node, pgoff_t index,
if (page)
tcache_lru_del(node->pool, page, reused);
local_irq_restore(flags);
+   /*
+    * The shrinker could have isolated the page in
+    * parallel with us. In this case page_ref_freeze(page, 2)
+    * in __tcache_page_tree_delete() fails, and
+    * we have to repeat the cycle.
+    */
+   if (!page)
+   goto repeat;
}
 
return page;
@@ -903,13 +911,15 @@ tcache_invalidate_node_pages(struct tcache_node *node)
struct page *pages[TCACHE_PAGEVEC_SIZE];
pgoff_t index = 0;
unsigned nr_pages;
+   bool repeat;
int i;
 
/*
 * First forbid new page insertions - see tcache_page_tree_replace.
 */
node->invalidated = true;
-
+again:
+   repeat = false;
while ((nr_pages = tcache_lookup(pages, node, index,
TCACHE_PAGEVEC_SIZE, indices))) {
for (i = 0; i < nr_pages; i++) {
@@ -925,13 +935,20 @@ tcache_invalidate_node_pages(struct tcache_node *node)
tcache_lru_del(node->pool, page, false);
local_irq_enable();
tcache_put_page(page);
-   } else
+   } else {
local_irq_enable();
+   repeat = true;
+   }
}
cond_resched();
index++;
}
 
+   if (repeat) {
+   index = 0;
+   goto again;
+   }
+
WARN_ON(node->nr_pages != 0);
 }
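
The remember-and-retry pattern the patch introduces can be sketched in plain userspace C. All names below (fake_page, fake_ref_freeze, invalidate_all) are hypothetical stand-ins for illustration only, not tcache API; the sketch assumes the extra reference that makes freezing fail is transient, as in the race above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a page whose refcount must be frozen at exactly 2
 * before invalidation; extra transient references make freezing fail. */
struct fake_page {
	int refcount;
	bool invalidated;
};

/* Mimics page_ref_freeze(page, 2): succeeds only when exactly
 * 'expected' references are held, zeroing the count on success. */
static bool fake_ref_freeze(struct fake_page *p, int expected)
{
	if (p->refcount != expected)
		return false;
	p->refcount = 0;
	return true;
}

/* The retry pattern from the patch: walk all pages, remember whether
 * any freeze failed, and restart the pass until everything succeeds. */
static void invalidate_all(struct fake_page *pages, size_t n)
{
	bool repeat;

	do {
		repeat = false;
		for (size_t i = 0; i < n; i++) {
			if (pages[i].invalidated)
				continue;
			if (fake_ref_freeze(&pages[i], 2)) {
				pages[i].invalidated = true;
			} else {
				/* Model the parallel user dropping its
				 * reference, then try again next pass. */
				pages[i].refcount--;
				repeat = true;
			}
		}
	} while (repeat);
}
```

The key point the sketch captures is that a single failed freeze no longer leaves a page behind forever: the outer loop restarts from the beginning until a full pass succeeds.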
 



[Devel] [PATCH rh7] NFS: Don't call COMMIT in ->releasepage()

2017-12-01 Thread Andrey Ryabinin
From: Trond Myklebust 

While COMMIT has the potential to free up a lot of memory that is being
taken by unstable writes, it isn't guaranteed to free up this particular
page. Also, calling fsync() on the server is expensive and so we want to
do it in a more controlled fashion, rather than have it triggered at
random by the VM.

Signed-off-by: Trond Myklebust 

https://jira.sw.ru/browse/PSBM-77949
(cherry picked from commit 4f52b6bb8c57b9accafad526a429d6c0851cc62f)
Signed-off-by: Andrey Ryabinin 
---
 fs/nfs/file.c | 23 ---
 1 file changed, 23 deletions(-)

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 7ad044976fd1..24d3d0c44bc4 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -470,31 +470,8 @@ static void nfs_invalidate_page(struct page *page, unsigned int offset,
  */
 static int nfs_release_page(struct page *page, gfp_t gfp)
 {
-   struct address_space *mapping = page->mapping;
-
dfprintk(PAGECACHE, "NFS: release_page(%p)\n", page);
 
-   /* Always try to initiate a 'commit' if relevant, but only
-* wait for it if __GFP_WAIT is set.  Even then, only wait 1
-* second and only if the 'bdi' is not congested.
-* Waiting indefinitely can cause deadlocks when the NFS
-* server is on this machine, when a new TCP connection is
-* needed and in other rare cases.  There is no particular
-* need to wait extensively here.  A short wait has the
-* benefit that someone else can worry about the freezer.
-*/
-   if (mapping) {
-   struct nfs_server *nfss = NFS_SERVER(mapping->host);
-   nfs_commit_inode(mapping->host, 0);
-   if ((gfp & __GFP_WAIT) &&
-   !bdi_write_congested(&nfss->backing_dev_info)) {
-   wait_on_page_bit_killable_timeout(page, PG_private,
- HZ);
-   if (PagePrivate(page))
-   set_bdi_congested(&nfss->backing_dev_info,
- BLK_RW_ASYNC);
-   }
-   }
/* If PagePrivate() is set, then the page is not freeable */
if (PagePrivate(page))
return 0;
-- 
2.13.6
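
After the removal, the release decision reduces to a single cheap check. This can be illustrated with a minimal userspace sketch (fake_page, fake_release_page, and FAKE_PG_PRIVATE are hypothetical stand-ins, not NFS code): reclaim no longer triggers a COMMIT, it only asks whether the page still carries private (unstable-write) data.

```c
#include <assert.h>

/* Hypothetical page flag modeled as a bitmask bit. */
#define FAKE_PG_PRIVATE (1u << 0)

struct fake_page {
	unsigned flags;
};

/* After the patch, releasepage is just this: a page holding private
 * (unstable write) data is not freeable, and no expensive server-side
 * fsync/COMMIT is initiated from reclaim context. */
static int fake_release_page(const struct fake_page *p)
{
	if (p->flags & FAKE_PG_PRIVATE)
		return 0;	/* not freeable */
	return 1;		/* freeable */
}
```

The design choice is to move the expensive commit out of the VM's random reclaim path and leave it to controlled writeback, at the cost of occasionally failing to free a page that a commit could have cleaned.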



[Devel] [PATCH] ve: fix container stopped state check

2017-12-01 Thread Stanislav Kinsburskiy
Checking for an empty cgroup is not correct, because the init process leaves
the cgroup early in do_exit(). This leads to a situation where the container
is treated as stopped while its resources (VEIP, for instance) are not yet
released, which in turn leads to container restart failure due to a
non-released VEIP address.

https://jira.sw.ru/browse/PSBM-78078

Signed-off-by: Stanislav Kinsburskiy 
---
 kernel/ve/ve.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
index b0188c3..c628516 100644
--- a/kernel/ve/ve.c
+++ b/kernel/ve/ve.c
@@ -846,7 +846,7 @@ static int ve_state_read(struct cgroup *cg, struct cftype *cft,
 
if (ve->is_running)
seq_puts(m, "RUNNING");
-   else if (!nr_threads_ve(ve))
+   else if (!ve->init_task)
seq_puts(m, "STOPPED");
else if (ve->ve_ns)
seq_puts(m, "STOPPING");

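The corrected state logic can be modeled in a small userspace sketch (struct fake_ve and ve_state() are hypothetical stand-ins for illustration): a container only reports STOPPED once its init task is fully gone, not merely once its cgroup has no threads left.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the per-container state after the patch. */
struct fake_ve {
	int is_running;
	void *init_task;	/* NULL only once init has fully exited */
	void *ve_ns;		/* non-NULL while namespaces still exist */
};

/* Mirrors the decision order in ve_state_read() after the fix:
 * RUNNING, then STOPPED keyed on init_task (not thread count),
 * then STOPPING while namespaces linger. */
static const char *ve_state(const struct fake_ve *ve)
{
	if (ve->is_running)
		return "RUNNING";
	else if (!ve->init_task)
		return "STOPPED";
	else if (ve->ve_ns)
		return "STOPPING";
	return "UNKNOWN";
}
```

With the old check, a container whose init task had left the cgroup but was still in do_exit() would already read as STOPPED; keying on init_task keeps it in STOPPING until resources are actually released.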