Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-27 Thread Christoph Hellwig
Hi Eric,

I've added linux-fsdevel to the cc list as this should get a bit
broader attention.

On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
> const struct inode_operations ocfs2_file_iops = {
> .permission = ocfs2_permission,
> .get_acl= ocfs2_iop_get_acl,
> .set_acl= ocfs2_iop_set_acl,
> };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().

What do you actually protect in ocfs2_permission?  It's a trivial
wrapper around generic_permission which just looks at the VFS inode.

I think the right fix is to remove ocfs2_permission entirely and use
the default VFS implementation.  That both solves your locking problem,
and it will also get you RCU lookup instead of dropping out of
RCU mode all the time.

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] BUG_ON(le64_to_cpu(fe->i_size) != i_size_read(inode)) is triggered in ocfs2_truncate_file()

2016-10-27 Thread Gang He
Hi Zhen,

This should be file system consistency issue between inode in memory and inode 
in disk, 
You should look at the code that modify the file size in memory, and then 
whether the code also sync the change to the disk.


Thanks
Gang




>>> 
> Hi all,
> 
> Any one ever see this BUG_ON() assertion 
> (https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L460)
> triggered? (paste log message at the end). I cannot reproduced it so far.
> 
> fallocate with FALLOC_FL_KEEP_SIZE flag (man 2 fallocate) can result in  
> "le64_to_cpu(fe->i_size) != i_size_read(inode)"
> as explained in this commit message:
> 
> ```
> commit d62e74be1270c89fbaf7aada8218bfdf62d00a58
> Author: Younger Liu mailto:younger@huawei.com>>
> Date:   Mon Feb 10 14:25:51 2014 -0800
> 
>  ocfs2: fix issue that ocfs2_setattr() does not deal with 
> new_i_size==i_size
> 
> The issue scenario is as following:
> 
> - Create a small file and fallocate a large disk space for a file with
>FALLOC_FL_KEEP_SIZE option.
> 
> - ftruncate the file back to the original size again.  but the disk free
>space is not changed back.  This is a real bug that be fixed in this
>patch.
> 
> In order to solve the issue above, we modified ocfs2_setattr(), if
> attr->ia_size != i_size_read(inode), It calls ocfs2_truncate_file(), and
> truncate disk space to attr->ia_size.
> ```
> 
> I was thinking to remove this BUG_ON() assertion. But, the following steps 
> cannot
> trigger it:
> 
> $dd if=/dev/zero of=/mnt/ocfs2/test2 bs=512 count=1
> $fallocate --keep-size --length 102400 /mnt/ocfs2/test2
> $truncate --size=512 /mnt/ocfs2/test2
> 
> I'm wondering why the testing result goes against what I expect? Finally, I 
> find that:
> 
> ocfs2_inode_lock(inode, bh, 1) in ocfs2_setattr() will update the 
> inode->i_size filed from LVB value
> or ocfs2_dinode from disk.
> ---
> vfs_truncate
>   do_truncate
> inode_lock()
> notify_change
>  ocfs2_setattr
>   ocfs2_rw_lock()
>   ocfs2_inode_lock()
>ocfs2_truncate_file
>   ocfs2_rw_lock()
>   ocfs2_inode_unlock
> inode_unlock()
> 
> ---
> ocfs2_inode_lock()
>   ocfs2_inode_lock_full_nested() ocfs2_inode_lock_update() ==> 
> https://github.com/torvalds/linux/blob/master/fs/ocfs2/dlmglue.c#L2204 if 
> (ocfs2_meta_lvb_is_trustable()) ocfs2_refresh_inode_from_lvb() else { 
> ocfs2_read_inode_block() ocfs2_refresh_inode() } Since we update 
> inode->i_size under the protection of thus locks, how was the assertion 
> triggered?
> As always, any comments and suggestion will be appreciated!
> 
> Thanks,
> Eric
> 
> Log message:
> ```
> kernel-default-3.0.101-80
> ocfs2-kmp-default-1.6_3.0.101_68-0.25.6
> 
> [239098.534619] (dsmc,8244,6):ocfs2_truncate_file:466 ERROR: bug expression: 
> le64_to_cpu(fe->i_size) != i_size_read(inode)
> [239098.534633] (dsmc,8244,6):ocfs2_truncate_file:466 ERROR: Inode 10812, 
> inode i_size = 677 != di i_size = 738, i_flags = 0x1
> 
> ...
> 
> [239098.534724] kernel BUG at 
> /usr/src/packages/BUILD/ocfs2-1.6/default/ocfs2/file.c:466!
> 
> PID: 8244   TASK: 8801dfb862c0  CPU: 6   COMMAND: "dsmc"
>   #0 [8801f59618d0] machine_kexec at 8102c54e
>   #1 [8801f5961920] crash_kexec at 810ae858
>   #2 [8801f59619f0] oops_end at 8146b558
>   #3 [8801f5961a10] do_invalid_op at 810036c4
>   #4 [8801f5961ab0] invalid_op at 81472d5b
>  [exception RIP: ocfs2_truncate_file+165]
>  RIP: a0929ba5  RSP: 8801f5961b68  RFLAGS: 00010296
>  RAX: 0085  RBX: 8801dfb862c0  RCX: 7335
>  RDX:   RSI: 0007  RDI: 0246
>  RBP: 1000   R8: 81da3ac0   R9: 
>  R10: 0003  R11:   R12: 8801cbb864f8
>  R13: 8801dfb86930  R14: 8801bf5ad000  R15: 02a5
>  ORIG_RAX:   CS: 0010  SS: 0018
>   #5 [8801f5961bd0] ocfs2_setattr at a092c43e [ocfs2]
>   #6 [8801f5961c90] notify_change at 8117a0cf
>   #7 [8801f5961cf0] do_truncate at 8115e177
>   #8 [8801f5961d60] do_last at 8116d0c3
>   #9 [8801f5961dc0] path_openat at 8116df29
> #10 [8801f5961e50] do_filp_open at 8116e39c
> #11 [8801f5961f20] do_sys_open at 8115eabf
> #12 [8801f5961f80] system_call_fastpath at 81471d72
>  RIP: 7f7da23f67b0  RSP: 7fffae4a3470  RFLAGS: 00010246
>  RAX: 0002  RBX: 81471d72  RCX: 
>  RDX: 01b6  RSI: 0241  RDI: 00cff3df
>  RBP: 7fffae4a4ab0   R8: 0004   R9: 0001
>  R10: 0241  R11: 0246  R12: 
>  R13: 00d1d2f0  R14: 00930a68  R15: 0004
>  ORIG_RAX: 0002  CS: 0033  SS: 002b

___
Ocfs2

[Ocfs2-devel] BUG_ON(le64_to_cpu(fe->i_size) != i_size_read(inode)) is triggered in ocfs2_truncate_file()

2016-10-27 Thread Eric Ren

Hi all,

Any one ever see this BUG_ON() assertion 
(https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L460)
triggered? (paste log message at the end). I cannot reproduced it so far.

fallocate with FALLOC_FL_KEEP_SIZE flag (man 2 fallocate) can result in  
"le64_to_cpu(fe->i_size) != i_size_read(inode)"
as explained in this commit message:

```
commit d62e74be1270c89fbaf7aada8218bfdf62d00a58
Author: Younger Liu mailto:younger@huawei.com>>
Date:   Mon Feb 10 14:25:51 2014 -0800

ocfs2: fix issue that ocfs2_setattr() does not deal with new_i_size==i_size

The issue scenario is as following:

- Create a small file and fallocate a large disk space for a file with
  FALLOC_FL_KEEP_SIZE option.

- ftruncate the file back to the original size again.  but the disk free
  space is not changed back.  This is a real bug that be fixed in this
  patch.

In order to solve the issue above, we modified ocfs2_setattr(), if
attr->ia_size != i_size_read(inode), It calls ocfs2_truncate_file(), and
truncate disk space to attr->ia_size.
```

I was thinking to remove this BUG_ON() assertion. But, the following steps 
cannot
trigger it:

$dd if=/dev/zero of=/mnt/ocfs2/test2 bs=512 count=1
$fallocate --keep-size --length 102400 /mnt/ocfs2/test2
$truncate --size=512 /mnt/ocfs2/test2

I'm wondering why the testing result goes against what I expect? Finally, I 
find that:

ocfs2_inode_lock(inode, bh, 1) in ocfs2_setattr() will update the inode->i_size 
filed from LVB value
or ocfs2_dinode from disk.
---
vfs_truncate
 do_truncate
   inode_lock()
   notify_change
ocfs2_setattr
 ocfs2_rw_lock()
 ocfs2_inode_lock()
  ocfs2_truncate_file
 ocfs2_rw_lock()
 ocfs2_inode_unlock
   inode_unlock()

---
ocfs2_inode_lock()
 ocfs2_inode_lock_full_nested() ocfs2_inode_lock_update() ==> 
https://github.com/torvalds/linux/blob/master/fs/ocfs2/dlmglue.c#L2204 if 
(ocfs2_meta_lvb_is_trustable()) ocfs2_refresh_inode_from_lvb() else { 
ocfs2_read_inode_block() ocfs2_refresh_inode() } Since we update inode->i_size under the protection of thus locks, how was the assertion triggered?

As always, any comments and suggestion will be appreciated!

Thanks,
Eric

Log message:
```
kernel-default-3.0.101-80
ocfs2-kmp-default-1.6_3.0.101_68-0.25.6

[239098.534619] (dsmc,8244,6):ocfs2_truncate_file:466 ERROR: bug expression: 
le64_to_cpu(fe->i_size) != i_size_read(inode)
[239098.534633] (dsmc,8244,6):ocfs2_truncate_file:466 ERROR: Inode 10812, inode 
i_size = 677 != di i_size = 738, i_flags = 0x1

...

[239098.534724] kernel BUG at 
/usr/src/packages/BUILD/ocfs2-1.6/default/ocfs2/file.c:466!

PID: 8244   TASK: 8801dfb862c0  CPU: 6   COMMAND: "dsmc"
 #0 [8801f59618d0] machine_kexec at 8102c54e
 #1 [8801f5961920] crash_kexec at 810ae858
 #2 [8801f59619f0] oops_end at 8146b558
 #3 [8801f5961a10] do_invalid_op at 810036c4
 #4 [8801f5961ab0] invalid_op at 81472d5b
[exception RIP: ocfs2_truncate_file+165]
RIP: a0929ba5  RSP: 8801f5961b68  RFLAGS: 00010296
RAX: 0085  RBX: 8801dfb862c0  RCX: 7335
RDX:   RSI: 0007  RDI: 0246
RBP: 1000   R8: 81da3ac0   R9: 
R10: 0003  R11:   R12: 8801cbb864f8
R13: 8801dfb86930  R14: 8801bf5ad000  R15: 02a5
ORIG_RAX:   CS: 0010  SS: 0018
 #5 [8801f5961bd0] ocfs2_setattr at a092c43e [ocfs2]
 #6 [8801f5961c90] notify_change at 8117a0cf
 #7 [8801f5961cf0] do_truncate at 8115e177
 #8 [8801f5961d60] do_last at 8116d0c3
 #9 [8801f5961dc0] path_openat at 8116df29
#10 [8801f5961e50] do_filp_open at 8116e39c
#11 [8801f5961f20] do_sys_open at 8115eabf
#12 [8801f5961f80] system_call_fastpath at 81471d72
RIP: 7f7da23f67b0  RSP: 7fffae4a3470  RFLAGS: 00010246
RAX: 0002  RBX: 81471d72  RCX: 
RDX: 01b6  RSI: 0241  RDI: 00cff3df
RBP: 7fffae4a4ab0   R8: 0004   R9: 0001
R10: 0241  R11: 0246  R12: 
R13: 00d1d2f0  R14: 00930a68  R15: 0004
ORIG_RAX: 0002  CS: 0033  SS: 002b

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

[Ocfs2-devel] [PATCH] ocfs2: Delete redundant code and set the node bit into maybe_map directly.

2016-10-27 Thread Guozhonghua

The variables set_maybe is redundant while mle had been found in the map. So it 
is ok to set the node_idx into mle's maybe_map directly.
This patch is based on linux-4.9-rc2.

Signed-off-by: Guozhonghua 

--- ocfs2.orig/dlm/dlmmaster.c  2016-10-28 09:35:55.180696409 +0800
+++ ocfs2/dlm/dlmmaster.c   2016-10-28 09:40:19.020703391 +0800
@@ -1609,8 +1609,6 @@ way_up_top:
__dlm_insert_mle(dlm, mle);
response = DLM_MASTER_RESP_NO;
} else {
-   // mlog(0, "mle was found\n");
-   set_maybe = 1;
spin_lock(&tmpmle->spinlock);
if (tmpmle->master == dlm->node_num) {
mlog(ML_ERROR, "no lockres, but an mle with this node 
as master!\n");
@@ -1625,8 +1623,7 @@ way_up_top:
response = DLM_MASTER_RESP_NO;
} else
response = DLM_MASTER_RESP_MAYBE;
-   if (set_maybe)
-   set_bit(request->node_idx, tmpmle->maybe_map);
+   set_bit(request->node_idx, tmpmle->maybe_map);
spin_unlock(&tmpmle->spinlock);
}
spin_unlock(&dlm->master_lock);

-
本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
邮件!
This e-mail and its attachments contain confidential information from H3C, 
which is
intended only for the person or entity whose address is listed above. Any use 
of the
information contained herein in any way (including, but not limited to, total 
or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify 
the sender
by phone or email immediately and delete it!
___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel