[Ocfs2-devel] [PATCH] MLE releases issue.

2016-10-28 Thread Gechangwei
Hi,
During my test on OCFS2 suffering a storage failure, a crash issue was found.
Below was the call trace when crashed.
From the call trace, we can see a MLE's reference count is going to be 
negative, which aroused a BUG_ON()

[143355.593258] Call Trace:
[143355.593268]  [] dlm_put_mle_inuse+0x47/0x70 [ocfs2_dlm]
[143355.593276]  [] dlm_get_lock_resource+0xac5/0x10d0 
[ocfs2_dlm]
[143355.593286]  [] ? ip_queue_xmit+0x14a/0x3d0
[143355.593292]  [] ? kmem_cache_alloc+0x1e4/0x220
[143355.593300]  [] ? dlm_wait_for_recovery+0x6c/0x190 
[ocfs2_dlm]
[143355.593311]  [] dlmlock+0x62d/0x16e0 [ocfs2_dlm]
[143355.593316]  [] ? __alloc_skb+0x9b/0x2b0
[143355.593323]  [] ? 0xc01f6000


I think I probably have found the root cause of this issue. Please

**Node 1**  **Node 2**
Storage failure
An assert master 
message is sent to Node 1
Treat Node2 as down
Assert master handler
Decrease MLE reference count
Clean blocked MLE
Decrease MLE reference count


In the above scenario, both dlm_assert_master_handler and dlm_clean_block_mle 
will decease MLE
reference count, thus, in the following get_resouce procedure, the reference 
count is going to be negative.

I propose a patch to solve this, please take review if you have any time.

Signed-off-by: gechangwei 
---
 dlm/dlmmaster.c | 8 +++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/dlm/dlmmaster.c b/dlm/dlmmaster.c
index b747854..0540414 100644
--- a/dlm/dlmmaster.c
+++ b/dlm/dlmmaster.c
@@ -2020,7 +2020,7 @@ ok:

spin_lock(>spinlock);
if (mle->type == DLM_MLE_BLOCK || mle->type == 
DLM_MLE_MIGRATION)
-   extra_ref = 1;
+   extra_ref = test_bit(assert->node_idx, mle->maybe_map) 
? 1 : 0;
else {
/* MASTER mle: if any bits set in the response map
 * then the calling node needs to re-assert to clear
@@ -3465,12 +3465,18 @@ static void dlm_clean_block_mle(struct dlm_ctxt *dlm,
mlog(0, "mle found, but dead node %u would not have been "
 "master\n", dead_node);
spin_unlock(>spinlock);
+   } else if(mle->master != O2NM_MAX_NODES){
+   mlog(ML_NOTICE, "mle found, master assert received, master has "
+"already set to %d.\n ", mle->master);
+   spin_unlock(>spinlock);
} else {
/* Must drop the refcount by one since the assert_master will
 * never arrive. This may result in the mle being unlinked and
 * freed, but there may still be a process waiting in the
 * dlmlock path which is fine. */
mlog(0, "node %u was expected master\n", dead_node);
+   clear_bit(bit, mle->maybe_map);
atomic_set(>woken, 1);
spin_unlock(>spinlock);
wake_up(>wq);
--


BR.

Changwei

-
本邮件及其附件含有杭州华三通信技术有限公司的保密信息,仅限于发送给上面地址中列出
的个人或群组。禁止任何其他人以任何形式使用(包括但不限于全部或部分地泄露、复制、
或散发)本邮件中的信息。如果您错收了本邮件,请您立即电话或邮件通知发件人并删除本
邮件!
This e-mail and its attachments contain confidential information from H3C, 
which is
intended only for the person or entity whose address is listed above. Any use 
of the
information contained herein in any way (including, but not limited to, total 
or partial
disclosure, reproduction, or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please notify 
the sender
by phone or email immediately and delete it!
___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-28 Thread Eric Ren

Hi Christoph!

Thanks for your attention.

On 10/28/2016 02:20 PM, Christoph Hellwig wrote:

Hi Eric,

I've added linux-fsdevel to the cc list as this should get a bit
broader attention.

On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:

Mostly, we can avoid recursive locking by writing code carefully. However, as
the deadlock issues have proved out, it's very hard to handle the routines
that are called directly by vfs. For instance:

 const struct inode_operations ocfs2_file_iops = {
 .permission = ocfs2_permission,
 .get_acl= ocfs2_iop_get_acl,
 .set_acl= ocfs2_iop_set_acl,
 };


ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
The problem is that the call chain of ocfs2_permission() includes *_acl().

What do you actually protect in ocfs2_permission?  It's a trivial
wrapper around generic_permission which just looks at the VFS inode.

Yes, it is.

https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L1321
---
ocfs2_permission
  ocfs2_inode_lock()
generic_permission
ocfs2_inode_unlock


I think the right fix is to remove ocfs2_permission entirely and use
the default VFS implementation.  That both solves your locking problem,
and it will also get you RCU lookup instead of dropping out of
RCU mode all the time.
But, from my understanding, the pair of ocfs2_inode_lock/unlock() is used to prevent any 
concurrent changes
to the permission of the inode on the other cluster node while we are checking on it. It's a 
common  case for cluster

filesystem, such as GFS2: 
https://github.com/torvalds/linux/blob/master/fs/gfs2/inode.c#L1777

Thanks for your suggestion again!
Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-28 Thread Christoph Hellwig
Hi Eric,

I've added linux-fsdevel to the cc list as this should get a bit
broader attention.

On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
> const struct inode_operations ocfs2_file_iops = {
> .permission = ocfs2_permission,
> .get_acl= ocfs2_iop_get_acl,
> .set_acl= ocfs2_iop_set_acl,
> };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().

What do you actually protect in ocfs2_permission?  It's a trivial
wrapper around generic_permission which just looks at the VFS inode.

I think the right fix is to remove ocfs2_permission entirely and use
the default VFS implementation.  That both solves your locking problem,
and it will also get you RCU lookup instead of dropping out of
RCU mode all the time.

___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel