[Devel] [PATCH vz7] fuse: fuse_send_writepage() must check FUSE_S_FAIL_IMMEDIATELY

2016-12-06 Thread Maxim Patlasov
The patch fixes the following race (leading to deadlock):

1. Thread A. fuse_prepare_write() checks FUSE_S_FAIL_IMMEDIATELY, but the
flag is not set yet, so it does not return -EIO.

2. Thread B. fuse_invalidate_files() sets FUSE_S_FAIL_IMMEDIATELY,
calls filemap_write_and_wait() and fuse_kill_requests(), then releases
fc->lock.

3. Thread A. fuse_commit_write() marks the page as dirty, then
fuse_write_end() unlocks the page.

4. Thread B. fuse_invalidate_files() calls invalidate_inode_pages2(). The
page is dirty, so it ends up in fuse_launder_page(), which calls
fuse_writepage_locked(). The latter successfully queues the fuse writeback
request, but then fuse_launder_page() calls fuse_wait_on_page_writeback(),
which blocks forever because the just-queued writeback request will never
be completed.
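
A sketch of this interleaving (same function names as above):

  Thread A (buffered write)          Thread B (fuse_invalidate_files)
  -------------------------          --------------------------------
  fuse_prepare_write()
    FUSE_S_FAIL_IMMEDIATELY not set
    -> no -EIO
                                     set FUSE_S_FAIL_IMMEDIATELY
                                     filemap_write_and_wait()
                                     fuse_kill_requests()
                                     release fc->lock
  fuse_commit_write()  -> page dirty
  fuse_write_end()     -> page unlocked
                                     invalidate_inode_pages2()
                                       fuse_launder_page()
                                         fuse_writepage_locked()       (queues writeback)
                                         fuse_wait_on_page_writeback() (blocks forever)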

Signed-off-by: Maxim Patlasov 
---
 fs/fuse/file.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 4fcf4f4..e21b8b7 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1827,7 +1827,8 @@ __acquires(fc->lock)
struct fuse_write_in *inarg = &req->misc.write.in;
__u64 data_size = req->num_pages * PAGE_CACHE_SIZE;
 
-   if (!fc->connected)
+   if (!fc->connected ||
+   test_bit(FUSE_S_FAIL_IMMEDIATELY, &req->ff->ff_state))
goto out_free;
 
if (inarg->offset + data_size <= size) {

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


[Devel] [PATCH rh7] seccomp, ptrace: Fix typo in filter fetching

2016-12-06 Thread Cyrill Gorcunov
In commit 42b65fd18057d64410a0519962cd0650c762c99f there
is a typo: we need to copy the complete filter chain,
not just the first filter->len bytes.

https://jira.sw.ru/browse/PSBM-55593
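
For reference, filter->len counts classic BPF instructions, not bytes, so the
amount to copy is len * sizeof(struct sock_filter). A minimal sketch of that
size arithmetic (struct sock_filter is the 8-byte classic BPF instruction from
<linux/filter.h>):

#include <linux/filter.h>

/* Byte size of a classic BPF program that is "len" instructions long. */
static inline size_t bpf_classic_bytes(unsigned short len)
{
	return (size_t)len * sizeof(struct sock_filter);	/* len * 8 bytes */
}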

CC: Andrey Vagin 
Signed-off-by: Cyrill Gorcunov 
---

I am continuing to investigate the problem since the tests
do not pass yet, but this fix may be applied independently.

 kernel/seccomp.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-pcs7.git/kernel/seccomp.c
===
--- linux-pcs7.git.orig/kernel/seccomp.c
+++ linux-pcs7.git/kernel/seccomp.c
@@ -566,7 +566,7 @@ long seccomp_get_filter(struct task_stru
get_seccomp_filter(task);
spin_unlock_irq(&task->sighand->siglock);
 
-   if (copy_to_user(data, filter->insns, filter->len))
+   if (copy_to_user(data, filter->insns, filter->len * sizeof(filter->insns[0])))
ret = -EFAULT;
 
put_seccomp_filter(task);
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


[Devel] [PATCH 1/4] Revert: [fs] xfs: rework buffer dispose list tracking

2016-12-06 Thread Dmitry Monakhov
From: Dave Chinner 

From 35c0abc0c70cfb3b37505ec137beae7fabca6b79 Mon Sep 17 00:00:00 2001
Message-id: <1472129410-4267-1-git-send-email-bfos...@redhat.com>
Patchwork-id: 157287
O-Subject: [RHEL7 PATCH] xfs: rework buffer dispose list tracking
Bugzilla: 1349175
RH-Acked-by: Dave Chinner 
RH-Acked-by: Eric Sandeen 

- Retain the buffer lru helpers as rhel7 does not include built-in
  list_lru infrastructure.
- Some b_lock bits dropped as they were introduced by a previous
  selective backport.
- Backport use of dispose list from upstream list_lru-based
  xfs_wait_buftarg[_rele]() to downstream variant.

commit a408235726aa82c0358c9ec68124b6f4bc0a79df
Author: Dave Chinner 
Date:   Wed Aug 28 10:18:06 2013 +1000

xfs: rework buffer dispose list tracking

In converting the buffer lru lists to use the generic code, the locking
for marking the buffers as on the dispose list was lost.  This results in
confusion in LRU buffer tracking and accounting, resulting in reference
counts being mucked up and the filesystem being unmountable.

To fix this, introduce an internal buffer spinlock to protect the state
field that holds the dispose list information.  Because there is now
locking needed around xfs_buf_lru_add/del, and they are used in exactly
one place each two lines apart, get rid of the wrappers and code the logic
directly in place.

Further, the LRU emptying code used on unmount is less than optimal.
Convert it to use a dispose list as per a normal shrinker walk, and repeat
the walk that fills the dispose list until the LRU is empty.  This avoids
needing to drop and regain the LRU lock for every item being freed, and
allows the same logic as the shrinker isolate call to be used.  Simpler,
easier to understand.

Signed-off-by: Dave Chinner 
Signed-off-by: Glauber Costa 
Cc: "Theodore Ts'o" 
Cc: Adrian Hunter 
Cc: Al Viro 
Cc: Artem Bityutskiy 
Cc: Arve Hjonnevag 
Cc: Carlos Maiolino 
Cc: Christoph Hellwig 
Cc: Chuck Lever 
Cc: Daniel Vetter 
Cc: David Rientjes 
Cc: Gleb Natapov 
Cc: Greg Thelen 
Cc: J. Bruce Fields 
Cc: Jan Kara 
Cc: Jerome Glisse 
Cc: John Stultz 
Cc: KAMEZAWA Hiroyuki 
Cc: Kent Overstreet 
Cc: Kirill A. Shutemov 
Cc: Marcelo Tosatti 
Cc: Mel Gorman 
Cc: Steven Whitehouse 
Cc: Thomas Hellstrom 
Cc: Trond Myklebust 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro 

Signed-off-by: Brian Foster 
Signed-off-by: Dmitry Monakhov 
---
 fs/xfs/xfs_buf.c | 57 
 fs/xfs/xfs_buf.h |  8 +++-
 2 files changed, 11 insertions(+), 54 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index e380398..c0de0e2 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -96,7 +96,7 @@ xfs_buf_lru_add(
atomic_inc(&bp->b_hold);
list_add_tail(&bp->b_lru, &btp->bt_lru);
btp->bt_lru_nr++;
-   bp->b_state &= ~XFS_BSTATE_DISPOSE;
+   bp->b_lru_flags &= ~_XBF_LRU_DISPOSE;
}
spin_unlock(&btp->bt_lru_lock);
 }
@@ -198,21 +198,19 @@ xfs_buf_stale(
 */
xfs_buf_ioacct_dec(bp);
 
-   spin_lock(&bp->b_lock);
-   atomic_set(&bp->b_lru_ref, 0);
+   atomic_set(&(bp)->b_lru_ref, 0);
if (!list_empty(&bp->b_lru)) {
struct xfs_buftarg *btp = bp->b_target;
 
spin_lock(&btp->bt_lru_lock);
if (!list_empty(&bp->b_lru) &&
-   !(bp->b_state & XFS_BSTATE_DISPOSE)) {
+   !(bp->b_lru_flags & _XBF_LRU_DISPOSE)) {
list_del_init(&bp->b_lru);
btp->bt_lru_nr--;
atomic_dec(&bp->b_hold);
}
spin_unlock(&btp->bt_lru_lock);
}
-   spin_unlock(&bp->b_lock);
ASSERT(atomic_read(&bp->b_hold) >= 1);
 }
 
@@ -1014,26 +1012,10 @@ xfs_buf_rele(
/* the last reference has been dropped ... */
xfs_buf_ioacct_dec(bp);
if (!(bp->b_flags & XBF_STALE) && atomic_read(&bp->b_lru_ref)) {
-   /*
-* If the buffer is added to the LRU take a new
-* reference to the buffer for the 

[Devel] [PATCH 2/4] ms/xfs: convert buftarg LRU to generic code

2016-12-06 Thread Dmitry Monakhov
Convert the buftarg LRU to use the new generic LRU list and take advantage
of the functionality it supplies to make the buffer cache shrinker node
aware.
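
A rough sketch of the conversion pattern on a hypothetical object (not the
XFS code itself): the open-coded lock/list/counter triplet becomes
list_lru_add()/list_lru_del(), whose boolean return value tells the caller
whether the item really changed state, so the hold count is only adjusted on
an actual add or remove:

#include <linux/list_lru.h>
#include <linux/atomic.h>

/* Hypothetical object mirroring how xfs_buf uses the generic LRU. */
struct obj {
	struct list_lru		*lru;		/* cf. &bp->b_target->bt_lru */
	struct list_head	lru_node;	/* cf. bp->b_lru */
	atomic_t		hold;		/* cf. bp->b_hold */
};

static void obj_lru_add(struct obj *o)
{
	/* list_lru_add() returns true only if the item was not on the LRU yet */
	if (list_lru_add(o->lru, &o->lru_node))
		atomic_inc(&o->hold);
}

static void obj_lru_del(struct obj *o)
{
	/* list_lru_del() returns true only if the item was actually removed */
	if (list_lru_del(o->lru, &o->lru_node))
		atomic_dec(&o->hold);
}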

Signed-off-by: Glauber Costa 
Signed-off-by: Dave Chinner 
Cc: "Theodore Ts'o" 
Cc: Adrian Hunter 
Cc: Al Viro 
Cc: Artem Bityutskiy 
Cc: Arve Hjønnevåg 
Cc: Carlos Maiolino 
Cc: Christoph Hellwig 
Cc: Chuck Lever 
Cc: Daniel Vetter 
Cc: David Rientjes 
Cc: Gleb Natapov 
Cc: Greg Thelen 
Cc: J. Bruce Fields 
Cc: Jan Kara 
Cc: Jerome Glisse 
Cc: John Stultz 
Cc: KAMEZAWA Hiroyuki 
Cc: Kent Overstreet 
Cc: Kirill A. Shutemov 
Cc: Marcelo Tosatti 
Cc: Mel Gorman 
Cc: Steven Whitehouse 
Cc: Thomas Hellstrom 
Cc: Trond Myklebust 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro 
(cherry picked from commit e80dfa19976b884db1ac2bc5d7d6ca0a4027bd1c)
Signed-off-by: Dmitry Monakhov 
---
 fs/xfs/xfs_buf.c | 170 ++-
 fs/xfs/xfs_buf.h |   5 +-
 2 files changed, 81 insertions(+), 94 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index c0de0e2..87a314a 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -85,20 +85,14 @@ xfs_buf_vmap_len(
  * The LRU takes a new reference to the buffer so that it will only be freed
  * once the shrinker takes the buffer off the LRU.
  */
-STATIC void
+static void
 xfs_buf_lru_add(
struct xfs_buf  *bp)
 {
-   struct xfs_buftarg *btp = bp->b_target;
-
-   spin_lock(&btp->bt_lru_lock);
-   if (list_empty(&bp->b_lru)) {
-   atomic_inc(&bp->b_hold);
-   list_add_tail(&bp->b_lru, &btp->bt_lru);
-   btp->bt_lru_nr++;
+   if (list_lru_add(&bp->b_target->bt_lru, &bp->b_lru)) {
bp->b_lru_flags &= ~_XBF_LRU_DISPOSE;
+   atomic_inc(&bp->b_hold);
}
-   spin_unlock(&btp->bt_lru_lock);
 }
 
 /*
@@ -107,24 +101,13 @@ xfs_buf_lru_add(
  * The unlocked check is safe here because it only occurs when there are not
  * b_lru_ref counts left on the inode under the pag->pag_buf_lock. it is there
  * to optimise the shrinker removing the buffer from the LRU and calling
- * xfs_buf_free(). i.e. it removes an unnecessary round trip on the
- * bt_lru_lock.
+ * xfs_buf_free().
  */
-STATIC void
+static void
 xfs_buf_lru_del(
struct xfs_buf  *bp)
 {
-   struct xfs_buftarg *btp = bp->b_target;
-
-   if (list_empty(&bp->b_lru))
-   return;
-
-   spin_lock(&btp->bt_lru_lock);
-   if (!list_empty(&bp->b_lru)) {
-   list_del_init(&bp->b_lru);
-   btp->bt_lru_nr--;
-   }
-   spin_unlock(&btp->bt_lru_lock);
+   list_lru_del(&bp->b_target->bt_lru, &bp->b_lru);
 }
 
 /*
@@ -199,18 +182,10 @@ xfs_buf_stale(
xfs_buf_ioacct_dec(bp);
 
atomic_set(&(bp)->b_lru_ref, 0);
-   if (!list_empty(&bp->b_lru)) {
-   struct xfs_buftarg *btp = bp->b_target;
-
-   spin_lock(&btp->bt_lru_lock);
-   if (!list_empty(&bp->b_lru) &&
-   !(bp->b_lru_flags & _XBF_LRU_DISPOSE)) {
-   list_del_init(&bp->b_lru);
-   btp->bt_lru_nr--;
-   atomic_dec(&bp->b_hold);
-   }
-   spin_unlock(&btp->bt_lru_lock);
-   }
+   if (!(bp->b_lru_flags & _XBF_LRU_DISPOSE) &&
+   (list_lru_del(&bp->b_target->bt_lru, &bp->b_lru)))
+   atomic_dec(&bp->b_hold);
+
ASSERT(atomic_read(&bp->b_hold) >= 1);
 }
 
@@ -1597,11 +1572,14 @@ xfs_buf_iomove(
  * returned. These buffers will have an elevated hold count, so wait on those
  * while freeing all the buffers only held by the LRU.
  */
-void
-xfs_wait_buftarg(
-   struct xfs_buftarg  *btp)
+static enum lru_status
+xfs_buftarg_wait_rele(
+   struct list_head*item,
+   spinlock_t  *lru_lock,
+   void*arg)
+
 {
-   struct xfs_buf  *bp;
+   struct xfs_buf  *bp = container_of(item, struct xfs_buf, b_lru);
 
/*
 * First wait on the buftarg I/O count for all in-flight buffers to be
@@ -1619,23 +1597,18 @@ xfs_wait_buftarg(
delay(100);
flush_workqueue(btp->bt_mount->m_buf_workqueue);
 
-restart:
-   spin_lock(&btp->bt_lru_lock);
-   while (!list_empty(&btp->bt_lru)) {
-   bp = list_first_entry(&btp->bt_lru, struct xfs_buf, b_lru);
-   if (atomic_read(&bp->b_hold) > 1) {
- 

[Devel] [PATCH 3/4] From c70ded437bb646ace0dcbf3c7989d4edeed17f7e Mon Sep 17 00:00:00 2001 [PATCH 2/3] ms/xfs-convert-buftarg-lru-to-generic-code-fix

2016-12-06 Thread Dmitry Monakhov
From: Andrew Morton 

fix warnings

Cc: Dave Chinner 
Cc: Glauber Costa 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro 
(cherry picked from commit addbda40bed47d8942658fca93e14b5f1cbf009a)

Signed-off-by: Vladimir Davydov 
Signed-off-by: Dmitry Monakhov 
---
 fs/xfs/xfs_buf.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 87a314a..bf933d5 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1654,7 +1654,7 @@ xfs_buftarg_isolate(
return LRU_REMOVED;
 }
 
-static long
+static unsigned long
 xfs_buftarg_shrink_scan(
struct shrinker *shrink,
struct shrink_control   *sc)
@@ -1662,7 +1662,7 @@ xfs_buftarg_shrink_scan(
struct xfs_buftarg  *btp = container_of(shrink,
struct xfs_buftarg, bt_shrinker);
LIST_HEAD(dispose);
-   longfreed;
+   unsigned long   freed;
unsigned long   nr_to_scan = sc->nr_to_scan;
 
freed = list_lru_walk_node(&btp->bt_lru, sc->nid, xfs_buftarg_isolate,
@@ -1678,7 +1678,7 @@ xfs_buftarg_shrink_scan(
return freed;
 }
 
-static long
+static unsigned long
 xfs_buftarg_shrink_count(
struct shrinker *shrink,
struct shrink_control   *sc)
-- 
2.7.4

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


[Devel] [PATCH 4/4] ms/xfs: rework buffer dispose list tracking

2016-12-06 Thread Dmitry Monakhov
In converting the buffer lru lists to use the generic code, the locking
for marking the buffers as on the dispose list was lost.  This results in
confusion in LRU buffer tracking and accounting, resulting in reference
counts being mucked up and the filesystem being unmountable.

To fix this, introduce an internal buffer spinlock to protect the state
field that holds the dispose list information.  Because there is now
locking needed around xfs_buf_lru_add/del, and they are used in exactly
one place each two lines apart, get rid of the wrappers and code the logic
directly in place.

Further, the LRU emptying code used on unmount is less than optimal.
Convert it to use a dispose list as per a normal shrinker walk, and repeat
the walk that fills the dispose list until the LRU is empty.  This avoids
needing to drop and regain the LRU lock for every item being freed, and
allows the same logic as the shrinker isolate call to be used.  Simpler,
easier to understand.
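
The locking pattern being restored, sketched on a hypothetical buffer-like
object (not the exact xfs_buf code): a per-object spinlock protects a state
word whose DISPOSE flag means "already claimed for a private dispose list",
so the stale/release paths and the shrinker isolate callback cannot both
remove the object from the LRU:

#include <linux/spinlock.h>
#include <linux/list_lru.h>
#include <linux/atomic.h>

#define OBJ_STATE_DISPOSE	(1 << 0)	/* cf. XFS_BSTATE_DISPOSE */

struct obj {
	spinlock_t		lock;		/* cf. bp->b_lock, protects state */
	unsigned int		state;		/* cf. bp->b_state */
	struct list_head	lru_node;	/* cf. bp->b_lru */
	atomic_t		hold;		/* cf. bp->b_hold */
};

/* Stale path: drop the object from the LRU unless the shrinker already
 * claimed it for disposal (in which case the shrinker owns the removal). */
static void obj_stale(struct obj *o, struct list_lru *lru)
{
	spin_lock(&o->lock);
	if (!(o->state & OBJ_STATE_DISPOSE) &&
	    list_lru_del(lru, &o->lru_node))
		atomic_dec(&o->hold);
	spin_unlock(&o->lock);
}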

Signed-off-by: Dave Chinner 
Signed-off-by: Glauber Costa 
Cc: "Theodore Ts'o" 
Cc: Adrian Hunter 
Cc: Al Viro 
Cc: Artem Bityutskiy 
Cc: Arve Hjønnevåg 
Cc: Carlos Maiolino 
Cc: Christoph Hellwig 
Cc: Chuck Lever 
Cc: Daniel Vetter 
Cc: David Rientjes 
Cc: Gleb Natapov 
Cc: Greg Thelen 
Cc: J. Bruce Fields 
Cc: Jan Kara 
Cc: Jerome Glisse 
Cc: John Stultz 
Cc: KAMEZAWA Hiroyuki 
Cc: Kent Overstreet 
Cc: Kirill A. Shutemov 
Cc: Marcelo Tosatti 
Cc: Mel Gorman 
Cc: Steven Whitehouse 
Cc: Thomas Hellstrom 
Cc: Trond Myklebust 
Signed-off-by: Andrew Morton 
Signed-off-by: Al Viro 
(cherry picked from commit a408235726aa82c0358c9ec68124b6f4bc0a79df)
Signed-off-by: Dmitry Monakhov 
---
 fs/xfs/xfs_buf.c | 147 +++
 fs/xfs/xfs_buf.h |   8 ++-
 2 files changed, 78 insertions(+), 77 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index bf933d5..8d8c9ce 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -80,37 +80,6 @@ xfs_buf_vmap_len(
 }
 
 /*
- * xfs_buf_lru_add - add a buffer to the LRU.
- *
- * The LRU takes a new reference to the buffer so that it will only be freed
- * once the shrinker takes the buffer off the LRU.
- */
-static void
-xfs_buf_lru_add(
-   struct xfs_buf  *bp)
-{
-   if (list_lru_add(&bp->b_target->bt_lru, &bp->b_lru)) {
-   bp->b_lru_flags &= ~_XBF_LRU_DISPOSE;
-   atomic_inc(&bp->b_hold);
-   }
-}
-
-/*
- * xfs_buf_lru_del - remove a buffer from the LRU
- *
- * The unlocked check is safe here because it only occurs when there are not
- * b_lru_ref counts left on the inode under the pag->pag_buf_lock. it is there
- * to optimise the shrinker removing the buffer from the LRU and calling
- * xfs_buf_free().
- */
-static void
-xfs_buf_lru_del(
-   struct xfs_buf  *bp)
-{
-   list_lru_del(&bp->b_target->bt_lru, &bp->b_lru);
-}
-
-/*
  * Bump the I/O in flight count on the buftarg if we haven't yet done so for
  * this buffer. The count is incremented once per buffer (per hold cycle)
  * because the corresponding decrement is deferred to buffer release. Buffers
@@ -181,12 +150,14 @@ xfs_buf_stale(
 */
xfs_buf_ioacct_dec(bp);
 
-   atomic_set(&(bp)->b_lru_ref, 0);
-   if (!(bp->b_lru_flags & _XBF_LRU_DISPOSE) &&
+   spin_lock(&bp->b_lock);
+   atomic_set(&bp->b_lru_ref, 0);
+   if (!(bp->b_state & XFS_BSTATE_DISPOSE) &&
(list_lru_del(&bp->b_target->bt_lru, &bp->b_lru)))
atomic_dec(&bp->b_hold);

ASSERT(atomic_read(&bp->b_hold) >= 1);
+   spin_unlock(&bp->b_lock);
 }
 
 static int
@@ -987,10 +958,28 @@ xfs_buf_rele(
/* the last reference has been dropped ... */
xfs_buf_ioacct_dec(bp);
if (!(bp->b_flags & XBF_STALE) && atomic_read(&bp->b_lru_ref)) {
-   xfs_buf_lru_add(bp);
+   /*
+* If the buffer is added to the LRU take a new
+* reference to the buffer for the LRU and clear the
+* (now stale) dispose list state flag
+*/
+   if (list_lru_add(&bp->b_target->bt_lru, &bp->b_lru)) {
+   bp->b_state &= ~XFS_BSTATE_DISPOSE;
+   atomic_inc(&bp->b_hold);
}
spin_unlock(&pag->pag_buf_lock);
} else {
-   xfs_buf_lru_del(bp);
+   /*
+* most of the time buffers 

[Devel] [PATCH 0/4] [7.3] rebase xfs lru patches

2016-12-06 Thread Dmitry Monakhov
rh7-3.10.0-514 already has 'fs-xfs-rework-buffer-dispose-list-tracking', but
originally it depends on ms/xfs-convert-buftarg-LRU-to-generic, so
in order to preserve the original logic I've reverted RHEL's patch (the 1st one)
and reapplied it later in natural order:
TOC:
0001-Revert-fs-xfs-rework-buffer-dispose-list-tracking.patch

0002-ms-xfs-convert-buftarg-LRU-to-generic-code.patch
0003-From-c70ded437bb646ace0dcbf3c7989d4edeed17f7e-Mon-Se.patch [not changed]
0004-ms-xfs-rework-buffer-dispose-list-tracking.patch
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


Re: [Devel] drop: ext4: resplit block_page_mkwrite: fix get-host convention

2016-12-06 Thread Konstantin Khorenko

It will be in 3.10.0-514.vz7.27.6,

thank you.

commit dfaafdf4efe49ceb8eeae350f25bddb6ae10ddd8
Author: Konstantin Khorenko 
Date:   Tue Nov 29 15:51:27 2016 +0400

Revert "ext4: resplit block_page_mkwrite: fix get-host convention"

This reverts commit 0555bb273c30fac49374950aaee4c938692fe1f6.

This is a leftover from vzfs code.

https://jira.sw.ru/browse/PSBM-54817

Signed-off-by: Dmitry Monakhov 

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 11/18/2016 04:17 PM, Dmitry Monakhov wrote:


We no longer need the vzfs crutches.
Please drop this patch:
ext4: resplit block_page_mkwrite: fix get-host convention
commit c97eaffbf6c9b909e324c59380962158185639bf
.


___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


[Devel] [PATCH RHEL7 COMMIT] ploop: pio_nfs does not require PLOOP_REQ_ISSUE_FLUSH (v2)

2016-12-06 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-327.36.1.vz7.20.x-ovz" and will 
appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.36.1.vz7.20.11
-->
commit cc3748db2c17d5805d9568d774cfe63b41195258
Author: Maxim Patlasov 
Date:   Tue Dec 6 15:48:22 2016 +0400

ploop: pio_nfs does not require PLOOP_REQ_ISSUE_FLUSH (v2)

The flag was introduced for the local case: if we modify the local
block device directly, bypassing ext4, we cannot rely on
fsync() and must flush the device explicitly. This is not the
case for pio_nfs, so it is not necessary to set
PLOOP_REQ_ISSUE_FLUSH.

The patch is important because pio_nfs doesn't provide an
issue_flush method.

Changed in v2:
 - rebase: after 34c7bf1755 moved set_bit and the nullifying submit into
   ploop_entry_nullify_req(), the patch is now applied in this new
   place.

Signed-off-by: Maxim Patlasov 
---
 drivers/block/ploop/dev.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
index 5aacd2a..eaf48c2 100644
--- a/drivers/block/ploop/dev.c
+++ b/drivers/block/ploop/dev.c
@@ -2037,9 +2037,11 @@ ploop_entry_nullify_req(struct ploop_request *preq)
 * (see dio_submit()). So fsync of EXT4 image doesnt help us.
 * We need to force sync of nullified blocks.
 */
+   if (top_delta->io.ops->issue_flush) {
+   preq->eng_io = &top_delta->io;
+   set_bit(PLOOP_REQ_ISSUE_FLUSH, &preq->state);
+   }
 
-   preq->eng_io = &top_delta->io;
-   set_bit(PLOOP_REQ_ISSUE_FLUSH, &preq->state);
top_delta->io.ops->submit(&top_delta->io, preq, preq->req_rw,
  &sbl, preq->iblock, 1

[Devel] [PATCH RHEL7 COMMIT] ms/packet: fix race condition in packet_set_ring

2016-12-06 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-327.36.1.vz7.20.x-ovz" and will 
appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.36.1.vz7.20.11
-->
commit e8993d723121f3d9303db373849bcd01df2def48
Author: Philip Pettersson 
Date:   Wed Nov 30 14:55:36 2016 -0800

ms/packet: fix race condition in packet_set_ring

When packet_set_ring creates a ring buffer it will initialize a
struct timer_list if the packet version is TPACKET_V3. This value
can then be raced by a different thread calling setsockopt to
set the version to TPACKET_V1 before packet_set_ring has finished.

This leads to a use-after-free on a function pointer in the
struct timer_list when the socket is closed as the previously
initialized timer will not be deleted.

The bug is fixed by taking lock_sock(sk) in packet_setsockopt when
changing the packet version while also taking the lock at the start
of packet_set_ring.

Fixes: f6fb8f100b80 ("af-packet: TPACKET_V3 flexible buffer implementation.")
Signed-off-by: Philip Pettersson 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 


CVE-2016-8655: Linux af_packet.c race condition (local root)

Philip Pettersson discovered a race condition in the af_packet 
implementation
in the Linux kernel. A local unprivileged attacker could use this to cause a
denial of service (system crash) or run arbitrary code with administrative
privileges.

https://jira.sw.ru/browse/PSBM-56699
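
The user-visible shape of the race (a sketch, not an exploit): two threads on
one AF_PACKET socket, one installing a TPACKET_V3 ring while the other
switches the version back to TPACKET_V1. Without the lock_sock() ordering
described above, packet_set_ring() can arm the V3 retire-block timer and then
observe the V1 version, so the timer is never deleted when the socket is
closed:

#include <linux/if_packet.h>
#include <sys/socket.h>
#include <pthread.h>
#include <string.h>

static int fd;

static void *flip_version(void *arg)
{
	int v1 = TPACKET_V1;

	setsockopt(fd, SOL_PACKET, PACKET_VERSION, &v1, sizeof(v1));
	return NULL;
}

int main(void)
{
	int v3 = TPACKET_V3;
	struct tpacket_req3 req;
	pthread_t t;

	fd = socket(AF_PACKET, SOCK_RAW, 0);	/* needs CAP_NET_RAW */
	setsockopt(fd, SOL_PACKET, PACKET_VERSION, &v3, sizeof(v3));

	memset(&req, 0, sizeof(req));
	req.tp_block_size = 4096;
	req.tp_block_nr = 2;
	req.tp_frame_size = 2048;
	req.tp_frame_nr = 4;		/* (block_size / frame_size) * block_nr */
	req.tp_retire_blk_tov = 10;	/* ms; arms the V3 retire timer */

	pthread_create(&t, NULL, flip_version, NULL);
	/* races with the PACKET_VERSION change in the other thread */
	setsockopt(fd, SOL_PACKET, PACKET_RX_RING, &req, sizeof(req));
	pthread_join(t, NULL);
	return 0;
}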
---
 net/packet/af_packet.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 2a1b15a..0e662e3 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3313,19 +3313,25 @@ packet_setsockopt(struct socket *sock, int level, int optname, char __user *optv
 
if (optlen != sizeof(val))
return -EINVAL;
-   if (po->rx_ring.pg_vec || po->tx_ring.pg_vec)
-   return -EBUSY;
if (copy_from_user(&val, optval, sizeof(val)))
return -EFAULT;
switch (val) {
case TPACKET_V1:
case TPACKET_V2:
case TPACKET_V3:
-   po->tp_version = val;
-   return 0;
+   break;
default:
return -EINVAL;
}
+   lock_sock(sk);
+   if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) {
+   ret = -EBUSY;
+   } else {
+   po->tp_version = val;
+   ret = 0;
+   }
+   release_sock(sk);
+   return ret;
}
case PACKET_RESERVE:
{
@@ -3781,6 +3787,7 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
/* Added to avoid minimal code churn */
struct tpacket_req *req = &req_u->req;
 
+   lock_sock(sk);
/* Opening a Tx-ring is NOT supported in TPACKET_V3 */
if (!closing && tx_ring && (po->tp_version > TPACKET_V2)) {
WARN(1, "Tx-ring is not supported.\n");
@@ -3869,7 +3876,6 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
goto out;
}
 
-   lock_sock(sk);
 
/* Detach socket from network */
spin_lock(&po->bind_lock);
@@ -3918,7 +3924,6 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
if (!tx_ring)
prb_shutdown_retire_blk_timer(po, tx_ring, rb_queue);
}
-   release_sock(sk);
 
if (pg_vec) {
if (psc)
@@ -3927,6 +3932,7 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u,
free_pg_vec(pg_vec, order, req->tp_block_nr);
}
 out:
+   release_sock(sk);
return err;
 }
 
___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel