[Devel] [PATCH rh7] overlayfs: fix dentry reference leak

2016-07-29 Thread Maxim Patlasov
Without this patch it is easy to crash a node by fiddling
with overlayfs dirs. Backport commit ab79efab0 from ms:

From: David Howells 

In ovl_copy_up_locked(), newdentry is leaked if the function exits through
out_cleanup as this just goes to out after calling ovl_cleanup() - which doesn't
actually release the ref on newdentry.

The out_cleanup segment should instead exit through out2 as certainly
newdentry leaks - and possibly upper does also, though this isn't caught
given the catch of newdentry.
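
The control flow described above can be modelled in user space. The following
is only a minimal sketch, not the kernel code: the toy_* names, the single
reference count and the use_fix switch are assumptions made for illustration,
and the label layout is reduced to the part the patch touches.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a dentry: only the reference count matters here. */
struct toy_dentry {
	int refcount;
};

static void toy_dget(struct toy_dentry *d)
{
	d->refcount++;
}

static void toy_dput(struct toy_dentry *d)
{
	assert(d->refcount > 0);
	d->refcount--;
}

/*
 * Simplified model of the exit labels: "out2" drops the reference,
 * "out" does not. use_fix selects where out_cleanup jumps to.
 * Returns 0 on success, -1 on a simulated copy-up failure.
 */
static int toy_copy_up(struct toy_dentry *newdentry, bool fail, bool use_fix)
{
	int err = 0;

	toy_dget(newdentry);	/* reference taken for the copy-up */
	if (fail) {
		err = -1;
		goto out_cleanup;
	}
out2:
	toy_dput(newdentry);	/* the only place the reference is dropped */
out:
	return err;

out_cleanup:
	/* removal of the half-created file would happen here */
	if (use_fix)
		goto out2;	/* patched behaviour: release the reference */
	goto out;		/* unpatched behaviour: the reference leaks */
}

/* Number of references still held after one failed copy-up. */
static int toy_leaked_refs(bool use_fix)
{
	struct toy_dentry d = { 0 };

	(void)toy_copy_up(&d, true, use_fix);
	return d.refcount;
}
```

In this model toy_leaked_refs(false) leaves one reference behind, which
corresponds to the "still in use (1)" count in the report, while
toy_leaked_refs(true) leaves none.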

Without this fix, something like the following is seen:

BUG: Dentry 880023e9eb20{i=f861,n=#880023e82d90} still in use (1) [unmount of tmpfs tmpfs]
BUG: Dentry 880023ece640{i=0,n=bigfile}  still in use (1) [unmount of tmpfs tmpfs]

when unmounting the upper layer after an error occurred in copyup.

An error can be induced by creating a big file in a lower layer with
something like:

dd if=/dev/zero of=/lower/a/bigfile bs=65536 count=1 seek=$((0xf000))

to create a large file (4.1G).  Overlay an upper layer that is too small
(on tmpfs might do) and then induce a copy up by opening it writably.

Reported-by: Ulrich Obergfell 
Signed-off-by: David Howells 
Signed-off-by: Miklos Szeredi 

https://jira.sw.ru/browse/PSBM-47981
---
 fs/overlayfs/copy_up.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 3f3d1b0..afed35c 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -299,7 +299,7 @@ out:
 
 out_cleanup:
    ovl_cleanup(wdir, newdentry);
-   goto out;
+   goto out2;
 }
 
 /*

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


Re: [Devel] [PATCH rh7] ext4: ext4_mkdir must set S_IOPS_WRAPPER bit

2016-07-29 Thread Maxim Patlasov
Kostya, ms (mainstream) is not affected. Red Hat bz ticket:
https://bugzilla.redhat.com/show_bug.cgi?id=1361682



On 07/29/2016 08:15 AM, Konstantin Khorenko wrote:

Maxim, will you send the patch to mainstream as well?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 07/26/2016 12:01 AM, Maxim Patlasov wrote:
ext4_iget() sets this bit for directories. Let's do the same in ext4_mkdir().
Otherwise, the behaviour of vfs_rename (on top of ext4) varies depending on
how the in-core inode was born: via lookup or mkdir.

The key place in vfs_rename sensitive to the change is:

    if (flags && !rename2)
        return -EINVAL;


Signed-off-by: Maxim Patlasov 
---
 fs/ext4/namei.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 0adc6df..bebe698 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2413,6 +2413,7 @@ retry:

 inode->i_op = &ext4_dir_inode_operations.ops;
 inode->i_fop = &ext4_dir_operations;
+inode->i_flags |= S_IOPS_WRAPPER;
 err = ext4_init_new_dir(handle, dir, inode);
 if (err)
 goto out_clear_inode;



[Devel] [NEW KERNEL] 3.10.0-327.22.2.vz7.16.2 (rhel7)

2016-07-29 Thread builder
Changelog:

OpenVZ kernel rh7-3.10.0-327.22.2.vz7.16.2

* kernel.spec: restore building of kernel headers and debug kernels
  by default

* ext4: set S_IOPS_WRAPPER inode flag on directory creation via "mkdir"

* net: "bridge" CT feature must control creation of bridges inside
  a Container in both ways: via ioctl and via netlink


Generated changelog:

* Fri Jul 29 2016 Konstantin Khorenko [3.10.0-327.22.2.vz7.16.2]
- ve/bridge: br_dev_init: check if "bridge" feature is enabled (Evgenii Shatokhin) [PSBM-50009]
- ext4: ext4_mkdir must set S_IOPS_WRAPPER bit (Maxim Patlasov)


Built packages: 
http://kojistorage.eng.sw.ru/packages/vzkernel/3.10.0/327.22.2.vz7.16.2/


[Devel] [PATCH RHEL7 COMMIT] ve/bridge: br_dev_init: check if "bridge" feature is enabled

2016-07-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-327.22.2.vz7.16.x-ovz" and will 
appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.22.2.vz7.16.1
-->
commit 420fc7211bffd87d83cd4c8877ea446d9bc9222a
Author: Evgenii Shatokhin 
Date:   Fri Jul 29 19:16:34 2016 +0400

ve/bridge: br_dev_init: check if "bridge" feature is enabled

Currently, the feature is checked in br_ioctl_deviceless_stub() which is
called when "brctl addbr" runs. However, "ip link add br1 type bridge"
goes a different path and still succeeds even if the feature is disabled
for a CT:
  rtnl_newlink
    rtnl_create_link
      br_dev_setup
    register_netdevice
      br_dev_init
      ...

Let us check the "bridge" feature in br_dev_init() instead, to cover both
cases.
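
Why a check in the init hook covers both creation paths can be shown with a
small user-space model. This is a sketch under assumptions: the toy_* names
and the TOY_FEATURE_BRIDGE bit are hypothetical stand-ins, and each path is
reduced to "eventually registers the device, which runs the init hook".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_FEATURE_BRIDGE 0x1u	/* hypothetical bit, stands in for VE_FEATURE_BRIDGE */

/* Toy container descriptor: only the feature mask is modelled. */
struct toy_ve {
	unsigned int features;
};

/* Common init hook: every creation path ends up here. */
static int toy_br_dev_init(const struct toy_ve *ve)
{
	assert(ve != NULL);
	if (!(ve->features & TOY_FEATURE_BRIDGE))
		return -EACCES;
	return 0;
}

/* Device registration runs the init hook, whoever asked for it. */
static int toy_register_netdevice(const struct toy_ve *ve)
{
	return toy_br_dev_init(ve);
}

/* Models the "brctl addbr" (ioctl) path. */
static int toy_create_via_ioctl(const struct toy_ve *ve)
{
	return toy_register_netdevice(ve);
}

/* Models the "ip link add br1 type bridge" (netlink) path. */
static int toy_create_via_netlink(const struct toy_ve *ve)
{
	return toy_register_netdevice(ve);
}
```

With the feature bit cleared, both toy_create_via_ioctl() and
toy_create_via_netlink() return -EACCES; with the bit set, both return 0.
A check placed only in the ioctl entry point would leave the netlink path
unguarded, which is the situation the commit message describes.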

https://jira.sw.ru/browse/PSBM-50009

Signed-off-by: Evgenii Shatokhin 
Acked-by: Kirill Tkhai 
---
 net/bridge/br_device.c | 4 ++++
 net/bridge/br_ioctl.c  | 3 ---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 5e3347b..db206a3 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -88,8 +88,12 @@ out:
 static int br_dev_init(struct net_device *dev)
 {
    struct net_bridge *br = netdev_priv(dev);
+   struct net *net = dev_net(dev);
    int err;
 
+   if (!(net->owner_ve->features & VE_FEATURE_BRIDGE))
+       return -EACCES;
+
    br->stats = netdev_alloc_pcpu_stats(struct pcpu_sw_netstats);
    if (!br->stats)
        return -ENOMEM;
diff --git a/net/bridge/br_ioctl.c b/net/bridge/br_ioctl.c
index 98447b8..cd8c3a4 100644
--- a/net/bridge/br_ioctl.c
+++ b/net/bridge/br_ioctl.c
@@ -351,9 +351,6 @@ static int old_deviceless(struct net *net, void __user *uarg)
 
 int br_ioctl_deviceless_stub(struct net *net, unsigned int cmd, void __user *uarg)
 {
-   if (!(net->owner_ve->features & VE_FEATURE_BRIDGE))
-       return -ENOTTY;
-
    switch (cmd) {
    case SIOCGIFBR:
    case SIOCSIFBR:


[Devel] [PATCH RHEL7 COMMIT] ext4: ext4_mkdir must set S_IOPS_WRAPPER bit

2016-07-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-327.22.2.vz7.16.x-ovz" and will 
appear at https://src.openvz.org/scm/ovz/vzkernel.git
after rh7-3.10.0-327.22.2.vz7.16.1
-->
commit e8421e9d99ccbc3c8d2b3e79e1ebf3c70f9ec43c
Author: Maxim Patlasov 
Date:   Fri Jul 29 19:16:33 2016 +0400

ext4: ext4_mkdir must set S_IOPS_WRAPPER bit

ext4_iget() sets this bit for directories. Let's do the same in 
ext4_mkdir().
Otherwise, the behaviour of vfs_rename (on top of ext4) varies depending on
how the in-core inode was born: via lookup or mkdir.

The key place in vfs_rename sensitive to the change is:

>   if (flags && !rename2)
>   return -EINVAL;
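
The dependency between the flag and the check quoted above can be sketched in
user space. Everything named toy_* below is a hypothetical stand-in: the
sketch assumes that the extended rename2 operation lives in a wrapper around
the ordinary operations table and is only looked up when the inode carries
the wrapper flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_S_IOPS_WRAPPER 0x1u	/* hypothetical value, stands in for S_IOPS_WRAPPER */

typedef int (*toy_rename2_fn)(unsigned int flags);

/* Ordinary operations table. */
struct toy_iops {
	int (*rename)(void);
};

/* Wrapper: the ordinary table first, extended operations after it. */
struct toy_iops_wrapper {
	struct toy_iops ops;
	toy_rename2_fn rename2;
};

struct toy_inode {
	unsigned int i_flags;
	const struct toy_iops *i_op;
};

static int toy_rename(void)
{
	return 0;
}

static int toy_rename2(unsigned int flags)
{
	(void)flags;
	return 0;
}

static const struct toy_iops_wrapper toy_dir_iops = {
	.ops = { .rename = toy_rename },
	.rename2 = toy_rename2,
};

/* rename2 is reachable only if the inode advertises the wrapper. */
static toy_rename2_fn toy_get_rename2(const struct toy_inode *dir)
{
	if (!(dir->i_flags & TOY_S_IOPS_WRAPPER))
		return NULL;
	/* valid because ops is the first member of the wrapper */
	return ((const struct toy_iops_wrapper *)dir->i_op)->rename2;
}

static int toy_vfs_rename(const struct toy_inode *dir, unsigned int flags)
{
	toy_rename2_fn rename2;

	assert(dir->i_op != NULL);
	rename2 = toy_get_rename2(dir);
	if (flags && !rename2)
		return -EINVAL;
	return rename2 ? rename2(flags) : dir->i_op->rename();
}

/* Rename with a non-zero flag on a directory whose wrapper bit is set or not. */
static int toy_rename_with_flags(int wrapper_bit_set)
{
	struct toy_inode dir = {
		.i_flags = wrapper_bit_set ? TOY_S_IOPS_WRAPPER : 0,
		.i_op = &toy_dir_iops.ops,
	};

	return toy_vfs_rename(&dir, 1);
}
```

In this model toy_rename_with_flags(0) yields -EINVAL even though the
operations table is identical, while toy_rename_with_flags(1) succeeds. That
is the lookup-versus-mkdir discrepancy the patch removes by setting the bit
in both places.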

Signed-off-by: Maxim Patlasov 
---
 fs/ext4/namei.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 0adc6df..bebe698 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2413,6 +2413,7 @@ retry:
 
    inode->i_op = &ext4_dir_inode_operations.ops;
    inode->i_fop = &ext4_dir_operations;
+   inode->i_flags |= S_IOPS_WRAPPER;
    err = ext4_init_new_dir(handle, dir, inode);
    if (err)
        goto out_clear_inode;


Re: [Devel] [PATCH rh7] ext4: ext4_mkdir must set S_IOPS_WRAPPER bit

2016-07-29 Thread Konstantin Khorenko

Maxim, will you send the patch to mainstream as well?

--
Best regards,

Konstantin Khorenko,
Virtuozzo Linux Kernel Team

On 07/26/2016 12:01 AM, Maxim Patlasov wrote:

ext4_iget() sets this bit for directories. Let's do the same in ext4_mkdir().
Otherwise, the behaviour of vfs_rename (on top of ext4) varies depending on
how the in-core inode was born: via lookup or mkdir.

The key place in vfs_rename sensitive to the change is:

    if (flags && !rename2)
        return -EINVAL;


Signed-off-by: Maxim Patlasov 
---
 fs/ext4/namei.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 0adc6df..bebe698 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2413,6 +2413,7 @@ retry:

    inode->i_op = &ext4_dir_inode_operations.ops;
    inode->i_fop = &ext4_dir_operations;
+   inode->i_flags |= S_IOPS_WRAPPER;
    err = ext4_init_new_dir(handle, dir, inode);
    if (err)
        goto out_clear_inode;



Re: [Devel] [PATCH rh7 3/3] ploop: io_direct: delay f_op->fsync() until index_update for reloc requests (v3)

2016-07-29 Thread Dmitry Monakhov
Maxim Patlasov  writes:

> Dima,
>
>
> One week elapsed, still no feedback from you. Do you have something 
> against this patch?
Sorry for the delay, Max. I was overloaded by the pending crap I'd collected
before vacation, and lost your email. Again, sorry.

Whole patch looks good. Thank you for your rede

BTW: We definitely need regression testing for the original bug (broken
barriers and others). I'm working on that.
>
>
> Thanks,
>
> Maxim
>
>
> On 07/20/2016 11:21 PM, Maxim Patlasov wrote:
>> Commit 9f860e606 introduced an engine to delay fsync: after doing
>> fallocate(FALLOC_FL_CONVERT_UNWRITTEN), dio_post_submit marks
>> io as PLOOP_IO_FSYNC_DELAYED to ensure that fsync happens
>> later, when an incoming FLUSH|FUA comes.
>>
>> That was deemed as important because (PSBM-47026):
>>
>>> This optimization becomes more important due to the fact that customers 
>>> tend to use pcompact heavily => ploop images grow each day.
>> Now, we can easily re-use the engine to delay fsync for reloc
>> requests as well. As explained in the description of commit
>> 5aa3fe09:
>>
>>>  1->read_data_from_old_post
>>>  2->write_to_new_pos
>>>    ->submit_alloc
>>>       ->submit_pad
>>>       ->post_submit->convert_unwritten
>>>  3->update_index ->write_page with FLUSH|FUA
>>>  4->nullify_old_pos
>>>  5->issue_flush
>> By the time of step 3, extent conversion is not yet stable because it
>> belongs to an uncommitted transaction. But instead of doing fsync
>> inside ->post_submit, we can fsync later, as the very first step
>> of write_page for index_update.
>>
>> Changed in v2:
>>   - process delayed fsync asynchronously, via PLOOP_E_FSYNC_PENDED eng_state
>>
>> Changed in v3:
>>   - use extra arg for ploop_index_wb_proceed_or_delay() instead of ad-hoc 
>> PLOOP_REQ_FSYNC_IF_DELAYED
>>
>> https://jira.sw.ru/browse/PSBM-47026
>>
>> Signed-off-by: Maxim Patlasov 
>> ---
>>   drivers/block/ploop/dev.c   |9 +++--
>>   drivers/block/ploop/map.c   |   32 
>>   include/linux/ploop/ploop.h |1 +
>>   3 files changed, 36 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/block/ploop/dev.c b/drivers/block/ploop/dev.c
>> index df3eec9..ed60b1f 100644
>> --- a/drivers/block/ploop/dev.c
>> +++ b/drivers/block/ploop/dev.c
>> @@ -2720,6 +2720,11 @@ restart:
>>  ploop_index_wb_complete(preq);
>>  break;
>>   
>> +case PLOOP_E_FSYNC_PENDED:
>> +/* fsync done */
>> +ploop_index_wb_proceed(preq);
>> +break;
>> +
>>  default:
>>  BUG();
>>  }
>> @@ -4106,7 +4111,7 @@ static void ploop_relocate(struct ploop_device * plo)
>>  preq->bl.tail = preq->bl.head = NULL;
>>  preq->req_cluster = 0;
>>  preq->req_size = 0;
>> -preq->req_rw = WRITE_SYNC|REQ_FUA;
>> +preq->req_rw = WRITE_SYNC;
>>  preq->eng_state = PLOOP_E_ENTRY;
>>  preq->state = (1 << PLOOP_REQ_SYNC) | (1 << PLOOP_REQ_RELOC_A);
>>  preq->error = 0;
>> @@ -4410,7 +4415,7 @@ static void ploop_relocblks_process(struct ploop_device *plo)
>>  preq->bl.tail = preq->bl.head = NULL;
>>  preq->req_cluster = ~0U; /* uninitialized */
>>  preq->req_size = 0;
>> -preq->req_rw = WRITE_SYNC|REQ_FUA;
>> +preq->req_rw = WRITE_SYNC;
>>  preq->eng_state = PLOOP_E_ENTRY;
>>  preq->state = (1 << PLOOP_REQ_SYNC) | (1 << PLOOP_REQ_RELOC_S);
>>  preq->error = 0;
>> diff --git a/drivers/block/ploop/map.c b/drivers/block/ploop/map.c
>> index 5f7fd66..715dc15 100644
>> --- a/drivers/block/ploop/map.c
>> +++ b/drivers/block/ploop/map.c
>> @@ -915,6 +915,24 @@ void ploop_index_wb_proceed(struct ploop_request * preq)
>>  put_page(page);
>>   }
>>   
>> +static void ploop_index_wb_proceed_or_delay(struct ploop_request * preq,
>> +int do_fsync_if_delayed)
>> +{
>> +if (do_fsync_if_delayed) {
>> +struct map_node * m = preq->map;
>> +struct ploop_delta * top_delta = map_top_delta(m->parent);
>> +struct ploop_io * top_io = &top_delta->io;
>> +
>> +if (test_bit(PLOOP_IO_FSYNC_DELAYED, &top_io->io_state)) {
>> +preq->eng_state = PLOOP_E_FSYNC_PENDED;
>> +ploop_add_req_to_fsync_queue(preq);
>> +return;
>> +}
>> +}
>> +
>> +ploop_index_wb_proceed(preq);
>> +}
>> +
>>   /* Data write is commited. Now we need to update index. */
>>   
>>   void ploop_index_update(struct ploop_request * preq)
>> @@ -927,6 +945,7 @@ void ploop_index_update(struct ploop_request * preq)
>>  int old_level;
>>  struct page * page;
>>  unsigned long state = READ_ONCE(preq->state);
>> +int do_fsync_if_delayed = 0;
>>   
>>  /* No way back, we are going to initiate index write. */
>>   
>> @@ -985,10 +1004,12 @@ void ploop_index_update(struct ploop_request * preq)