Re: [Ocfs2-devel] [PATCH v2 3/3] ocfs2/dlm: continue to purge recovery lockres when recovery master goes down
On 2017/1/6 10:08, piaojun wrote:
>
> On 2017/1/5 15:44, Gechangwei wrote:
>> On 2017/1/5 15:28, gechangwei 12382 (Cloud) wrote:
>>
>> Hi Jun,
>> I suppose that a defect hides in your patch.
>>
>>
>>> We found a dlm-blocked situation caused by continuous breakdown of
>>> recovery masters, described below. To solve this problem, we should
>>> purge the recovery lock once we detect that the recovery master has
>>> gone down.
>>>
>>> N3                          N2                N1 (reco master)
>>>                             go down
>>>                                               pick up recovery lock and
>>>                                               begin recovering for N2
>>>                                               go down
>>> pick up recovery
>>> lock failed, then
>>> purge it:
>>> dlm_purge_lockres
>>>  ->DROPPING_REF is set
>>>
>>> send deref to N1 failed,
>>> recovery lock is not purged
>>>
>>> find N1 go down, begin
>>> recovering for N1, but
>>> blocked in dlm_do_recovery
>>> as DROPPING_REF is set:
>>> dlm_do_recovery
>>>  ->dlm_pick_recovery_master
>>>   ->dlmlock
>>>    ->dlm_get_lock_resource
>>>     ->__dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
>>>
>>> Fixes: 8c0343968163 ("ocfs2/dlm: clear DROPPING_REF flag when the master
>>> goes down")
>>>
>>> Signed-off-by: Jun Piao <piao...@huawei.com>
>>> Reviewed-by: Joseph Qi <joseph...@huawei.com>
>>> Reviewed-by: Jiufei Xue <xuejiu...@huawei.com>
>>> ---
>>>  fs/ocfs2/dlm/dlmcommon.h   |  2 ++
>>>  fs/ocfs2/dlm/dlmmaster.c   | 38 +++--
>>>  fs/ocfs2/dlm/dlmrecovery.c | 29 +++---
>>>  fs/ocfs2/dlm/dlmthread.c   | 52 ++
>>>  4 files changed, 74 insertions(+), 47 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
>>> index 004f2cb..3e3e9ba8 100644
>>> --- a/fs/ocfs2/dlm/dlmcommon.h
>>> +++ b/fs/ocfs2/dlm/dlmcommon.h
>>> @@ -1004,6 +1004,8 @@ int dlm_finalize_reco_handler(struct o2net_msg *msg, u32 len, void *data,
>>>  int dlm_do_master_requery(struct dlm_ctxt *dlm, struct dlm_lock_resource *res,
>>>  			  u8 nodenum, u8 *real_master);
>>>
>>> +void __dlm_do_purge_lockres(struct dlm_ctxt *dlm,
>>> +		struct dlm_lock_resource *res);
>>>
>>>  int dlm_dispatch_assert_master(struct dlm_ctxt *dlm,
>>>  			       struct dlm_lock_resource *res,
>>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>>> index 311404f..1d87e0f 100644
>>> --- a/fs/ocfs2/dlm/dlmmaster.c
>>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>>> @@ -2425,52 +2425,20 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
>>>  		mlog(ML_NOTICE, "%s:%.*s: node %u sends deref done "
>>>  			"but it is already derefed!\n", dlm->name,
>>>  			res->lockname.len, res->lockname.name, node);
>>> -		dlm_lockres_put(res);
>>>  		ret = 0;
>>>  		goto done;
>>>  	}
>>> -
>>> -	if (!list_empty(&res->purge)) {
>>> -		mlog(0, "%s: Removing res %.*s from purgelist\n",
>>> -			dlm->name, res->lockname.len, res->lockname.name);
>>> -		list_del_init(&res->purge);
>>> -		dlm_lockres_put(res);
>>> -		dlm->purge_count--;
>>> -	}
>>> -
>>> -	if (!__dlm_lockres_unused(res)) {
>>> -		mlog(ML_ERROR, "%s: res %.*s in use after deref\n",
>>> -			dlm->name, res->lockname.len, res->lockname.name);
>>> -		__dlm_print_one_lock_resource(res);
>>> -		BUG();
>>> -	}
>>> -
>>> -	__dlm_unhash_lockres(dlm, res);
>>> -
>>> -	spin_lock(&dlm->track_lock);
>>> -	if (!list_empty(&res->tracking))
>>> -		list_del_init(&res->tracking);
>>> -	else {
>>> -		mlog(ML_ERROR, "%s: Resource %.*s not on the Tracking list\n",
>>> -
Re: [Ocfs2-devel] [PATCH v2 3/3] ocfs2/dlm: continue to purge recovery lockres when recovery master goes down
On 2017/1/5 15:28, gechangwei 12382 (Cloud) wrote:

Hi Jun,

I suppose that a defect hides in your patch.

> We found a dlm-blocked situation caused by continuous breakdown of
> recovery masters, described below. To solve this problem, we should
> purge the recovery lock once we detect that the recovery master has gone
> down.
>
> N3                          N2                N1 (reco master)
>                             go down
>                                               pick up recovery lock and
>                                               begin recovering for N2
>                                               go down
> pick up recovery
> lock failed, then
> purge it:
> dlm_purge_lockres
>  ->DROPPING_REF is set
>
> send deref to N1 failed,
> recovery lock is not purged
>
> find N1 go down, begin
> recovering for N1, but
> blocked in dlm_do_recovery
> as DROPPING_REF is set:
> dlm_do_recovery
>  ->dlm_pick_recovery_master
>   ->dlmlock
>    ->dlm_get_lock_resource
>     ->__dlm_wait_on_lockres_flags(tmpres, DLM_LOCK_RES_DROPPING_REF);
>
> Fixes: 8c0343968163 ("ocfs2/dlm: clear DROPPING_REF flag when the master goes
> down")
>
> Signed-off-by: Jun Piao <piao...@huawei.com>
> Reviewed-by: Joseph Qi <joseph...@huawei.com>
> Reviewed-by: Jiufei Xue <xuejiu...@huawei.com>
> ---
>  fs/ocfs2/dlm/dlmcommon.h   |  2 ++
>  fs/ocfs2/dlm/dlmmaster.c   | 38 +++--
>  fs/ocfs2/dlm/dlmrecovery.c | 29 +++---
>  fs/ocfs2/dlm/dlmthread.c   | 52 ++
>  4 files changed, 74 insertions(+), 47 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmcommon.h b/fs/ocfs2/dlm/dlmcommon.h
> index 004f2cb..3e3e9ba8 100644
> --- a/fs/ocfs2/dlm/dlmcommon.h
> +++ b/fs/ocfs2/dlm/dlmcommon.h
> @@ -1004,6 +1004,8 @@ int dlm_finalize_reco_handler(struct o2net_msg *msg, u32 len, void *data,
>  int dlm_do_master_requery(struct dlm_ctxt *dlm, struct dlm_lock_resource *res,
>  			  u8 nodenum, u8 *real_master);
>
> +void __dlm_do_purge_lockres(struct dlm_ctxt *dlm,
> +		struct dlm_lock_resource *res);
>
>  int dlm_dispatch_assert_master(struct dlm_ctxt *dlm,
>  			       struct dlm_lock_resource *res,
> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> index 311404f..1d87e0f 100644
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -2425,52 +2425,20 @@ int dlm_deref_lockres_done_handler(struct o2net_msg *msg, u32 len, void *data,
>  		mlog(ML_NOTICE, "%s:%.*s: node %u sends deref done "
>  			"but it is already derefed!\n", dlm->name,
>  			res->lockname.len, res->lockname.name, node);
> -		dlm_lockres_put(res);
>  		ret = 0;
>  		goto done;
>  	}
> -
> -	if (!list_empty(&res->purge)) {
> -		mlog(0, "%s: Removing res %.*s from purgelist\n",
> -			dlm->name, res->lockname.len, res->lockname.name);
> -		list_del_init(&res->purge);
> -		dlm_lockres_put(res);
> -		dlm->purge_count--;
> -	}
> -
> -	if (!__dlm_lockres_unused(res)) {
> -		mlog(ML_ERROR, "%s: res %.*s in use after deref\n",
> -			dlm->name, res->lockname.len, res->lockname.name);
> -		__dlm_print_one_lock_resource(res);
> -		BUG();
> -	}
> -
> -	__dlm_unhash_lockres(dlm, res);
> -
> -	spin_lock(&dlm->track_lock);
> -	if (!list_empty(&res->tracking))
> -		list_del_init(&res->tracking);
> -	else {
> -		mlog(ML_ERROR, "%s: Resource %.*s not on the Tracking list\n",
> -			dlm->name, res->lockname.len, res->lockname.name);
> -		__dlm_print_one_lock_resource(res);
> -	}
> -	spin_unlock(&dlm->track_lock);
> -
> -	/* lockres is not in the hash now. drop the flag and wake up
> -	 * any processes waiting in dlm_get_lock_resource.
> -	 */
> -	res->state &= ~DLM_LOCK_RES_DROPPING_REF;
> +	__dlm_do_purge_lockres(dlm, res);
>  	spin_unlock(&res->spinlock);
>  	wake_up(&res->wq);
>
> -	dlm_lockres_put(res);
> -
>  	spin_unlock(&dlm->spinlock);
>
>  	ret = 0;
>
>  done:
> +	if (res)
> +		dlm_lockres_put(res);
>  	dlm_put(dlm);
>  	return ret;
>  }
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index f6b3138..dd5cb8b 100644
> --- a/fs
[Ocfs2-devel] Re: Re: [PATCH] ocfs2/dlm: fix umount hang
Hi Joseph,

Could you please point out the particular side effects? I'd like to get it
improved. IMO, I don't see anything broken in this patch, since the HB
detachment will be done in dlm_mle_release(). Furthermore, I believe that is
the proper place to detach HB events in my case.

Your advice is very important to me.

Thanks.
Changwei.

-----Original Message-----
From: Joseph Qi [mailto:jiangqi...@gmail.com]
Sent: 17 November 2016 17:18
To: gechangwei 12382 (CCPL); a...@linux-foundation.org
Cc: mfas...@versity.com; ocfs2-devel@oss.oracle.com
Subject: Re: Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: fix umount hang

Any clue to confirm the case? I'm afraid your change will have side effects.

Thanks,
Joseph

On 16/11/17 17:04, Gechangwei wrote:
> Hi Joseph,
>
> I suppose it is because local heartbeat mode was applied in my test
> environment, and other nodes were still writing heartbeat to other LUNs,
> but not to the LUN corresponding to 7DA412FEB1374366B0F3C70025EB14.
>
> Br.
> Changwei.
>
> -----Original Message-----
> From: Joseph Qi [mailto:jiangqi...@gmail.com]
> Sent: 17 November 2016 15:00
> To: gechangwei 12382 (CCPL); a...@linux-foundation.org
> Cc: mfas...@versity.com; ocfs2-devel@oss.oracle.com
> Subject: Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: fix umount hang
>
> Hi Changwei,
>
> Why are the dead nodes still in the live map, according to your dlm_state
> file?
>
> Thanks,
>
> Joseph
>
> On 16/11/17 14:03, Gechangwei wrote:
>> Hi
>>
>> During my recent test on OCFS2, an umount hang issue was found.
>> The clues below can help us analyze this issue.
>>
>> From the debug information, we can see some abnormal state: only node 1
>> is in the DLM domain map; however, nodes 3-9 are still in the MLE's node
>> map and vote map.
>> I think the root cause of the unchanging vote map is that HB events are
>> detached too early!
>> That leaves no chance of transforming the BLOCK MLE into a MASTER MLE.
>> Thus node 1 can't master the lock resource even though all other nodes
>> are dead.
>>
>> To fix this, I propose a patch.
>>
>> From 3163fa7024d96f8d6e6ec2b37ad44e2cc969abd9 Mon Sep 17 00:00:00 2001
>> From: gechangwei <ge.chang...@h3c.com>
>> Date: Thu, 17 Nov 2016 14:00:45 +0800
>> Subject: [PATCH] fix umount hang
>>
>> Signed-off-by: gechangwei <ge.chang...@h3c.com>
>> ---
>>  fs/ocfs2/dlm/dlmmaster.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
>> index 6ea06f8..3c46882 100644
>> --- a/fs/ocfs2/dlm/dlmmaster.c
>> +++ b/fs/ocfs2/dlm/dlmmaster.c
>> @@ -3354,8 +3354,6 @@ static void dlm_clean_block_mle(struct dlm_ctxt *dlm,
>>  		spin_unlock(&mle->spinlock);
>>  		wake_up(&mle->wq);
>>
>> -		/* Do not need events any longer, so detach from heartbeat */
>> -		__dlm_mle_detach_hb_events(dlm, mle);
>>  		__dlm_put_mle(mle);
>>  	}
>>  }
>> --
>> 2.5.1.windows.1
>>
>>
>> root@HXY-CVK110:~# grep P00 bbb
>> Lockres: P00  Owner: 255  State: 0x10 InProgress
>>
>> root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# cat dlm_state
>> Domain: 7DA412FEB1374366B0F3C70025EB1437  Key: 0x8ff804a1  Protocol: 1.2
>> Thread Pid: 21679  Node: 1  State: JOINED
>> Number of Joins: 1  Joining Node: 255
>> Domain Map: 1
>> Exit Domain Map:
>> Live Map: 1 2 3 4 5 6 7 8 9
>> Lock Resources: 29 (116)
>> MLEs: 1 (119)
>>   Blocking: 1 (4)
>>   Mastery: 0 (115)
>>   Migration: 0 (0)
>> Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
>> Purge Count: 0  Refs: 1
>> Dead Node: 255
>> Recovery Pid: 21680  Master: 255  State: INACTIVE
>> Recovery Map:
>> Recovery Node State:
>>
>>
>> root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# ls
>> dlm_state  locking_state  mle_state  purge_list
>> root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# cat mle_state
>> Dumping MLEs for Domain: 7DA412FEB1374366B0F3C70025EB1437
>> P00 BLK mas=255 new=255 evt=0 use=1 ref=2
>> Maybe=
>> Vote=3 4 5 6 7 8 9
>> Response=
>> Node=3 4 5 6 7 8 9
[Ocfs2-devel] Re: [PATCH] ocfs2/dlm: fix umount hang
Hi Joseph,

I suppose it is because local heartbeat mode was applied in my test
environment, and other nodes were still writing heartbeat to other LUNs,
but not to the LUN corresponding to 7DA412FEB1374366B0F3C70025EB14.

Br.
Changwei.

-----Original Message-----
From: Joseph Qi [mailto:jiangqi...@gmail.com]
Sent: 17 November 2016 15:00
To: gechangwei 12382 (CCPL); a...@linux-foundation.org
Cc: mfas...@versity.com; ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] [PATCH] ocfs2/dlm: fix umount hang

Hi Changwei,

Why are the dead nodes still in the live map, according to your dlm_state
file?

Thanks,

Joseph

On 16/11/17 14:03, Gechangwei wrote:
> Hi
>
> During my recent test on OCFS2, an umount hang issue was found.
> The clues below can help us analyze this issue.
>
> From the debug information, we can see some abnormal state: only node 1
> is in the DLM domain map; however, nodes 3-9 are still in the MLE's node
> map and vote map.
> I think the root cause of the unchanging vote map is that HB events are
> detached too early!
> That leaves no chance of transforming the BLOCK MLE into a MASTER MLE.
> Thus node 1 can't master the lock resource even though all other nodes
> are dead.
>
> To fix this, I propose a patch.
>
> From 3163fa7024d96f8d6e6ec2b37ad44e2cc969abd9 Mon Sep 17 00:00:00 2001
> From: gechangwei <ge.chang...@h3c.com>
> Date: Thu, 17 Nov 2016 14:00:45 +0800
> Subject: [PATCH] fix umount hang
>
> Signed-off-by: gechangwei <ge.chang...@h3c.com>
> ---
>  fs/ocfs2/dlm/dlmmaster.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
> index 6ea06f8..3c46882 100644
> --- a/fs/ocfs2/dlm/dlmmaster.c
> +++ b/fs/ocfs2/dlm/dlmmaster.c
> @@ -3354,8 +3354,6 @@ static void dlm_clean_block_mle(struct dlm_ctxt *dlm,
>  		spin_unlock(&mle->spinlock);
>  		wake_up(&mle->wq);
>
> -		/* Do not need events any longer, so detach from heartbeat */
> -		__dlm_mle_detach_hb_events(dlm, mle);
>  		__dlm_put_mle(mle);
>  	}
>  }
> --
> 2.5.1.windows.1
>
>
> root@HXY-CVK110:~# grep P00 bbb
> Lockres: P00  Owner: 255  State: 0x10 InProgress
>
> root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# cat dlm_state
> Domain: 7DA412FEB1374366B0F3C70025EB1437  Key: 0x8ff804a1  Protocol: 1.2
> Thread Pid: 21679  Node: 1  State: JOINED
> Number of Joins: 1  Joining Node: 255
> Domain Map: 1
> Exit Domain Map:
> Live Map: 1 2 3 4 5 6 7 8 9
> Lock Resources: 29 (116)
> MLEs: 1 (119)
>   Blocking: 1 (4)
>   Mastery: 0 (115)
>   Migration: 0 (0)
> Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
> Purge Count: 0  Refs: 1
> Dead Node: 255
> Recovery Pid: 21680  Master: 255  State: INACTIVE
> Recovery Map:
> Recovery Node State:
>
>
> root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# ls
> dlm_state  locking_state  mle_state  purge_list
> root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# cat mle_state
> Dumping MLEs for Domain: 7DA412FEB1374366B0F3C70025EB1437
> P00 BLK mas=255 new=255 evt=0 use=1 ref=2
> Maybe=
> Vote=3 4 5 6 7 8 9
> Response=
> Node=3 4 5 6 7 8 9
>
> This e-mail and its attachments contain confidential information from
> H3C, which is intended only for the person or entity whose address is
> listed above. Any use of the information contained herein in any way
> (including, but not limited to, total or partial disclosure,
> reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error,
> please notify the sender by phone or email immediately and delete it!

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
[Ocfs2-devel] [PATCH] ocfs2/dlm: fix umount hang
Hi

During my recent test on OCFS2, an umount hang issue was found.
The clues below can help us analyze this issue.

From the debug information, we can see some abnormal state: only node 1 is
in the DLM domain map; however, nodes 3-9 are still in the MLE's node map
and vote map.
I think the root cause of the unchanging vote map is that HB events are
detached too early!
That leaves no chance of transforming the BLOCK MLE into a MASTER MLE.
Thus node 1 can't master the lock resource even though all other nodes are
dead.

To fix this, I propose a patch.

From 3163fa7024d96f8d6e6ec2b37ad44e2cc969abd9 Mon Sep 17 00:00:00 2001
From: gechangwei <ge.chang...@h3c.com>
Date: Thu, 17 Nov 2016 14:00:45 +0800
Subject: [PATCH] fix umount hang

Signed-off-by: gechangwei <ge.chang...@h3c.com>
---
 fs/ocfs2/dlm/dlmmaster.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs/ocfs2/dlm/dlmmaster.c
index 6ea06f8..3c46882 100644
--- a/fs/ocfs2/dlm/dlmmaster.c
+++ b/fs/ocfs2/dlm/dlmmaster.c
@@ -3354,8 +3354,6 @@ static void dlm_clean_block_mle(struct dlm_ctxt *dlm,
 		spin_unlock(&mle->spinlock);
 		wake_up(&mle->wq);

-		/* Do not need events any longer, so detach from heartbeat */
-		__dlm_mle_detach_hb_events(dlm, mle);
 		__dlm_put_mle(mle);
 	}
 }
--
2.5.1.windows.1


root@HXY-CVK110:~# grep P00 bbb
Lockres: P00  Owner: 255  State: 0x10 InProgress

root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# cat dlm_state
Domain: 7DA412FEB1374366B0F3C70025EB1437  Key: 0x8ff804a1  Protocol: 1.2
Thread Pid: 21679  Node: 1  State: JOINED
Number of Joins: 1  Joining Node: 255
Domain Map: 1
Exit Domain Map:
Live Map: 1 2 3 4 5 6 7 8 9
Lock Resources: 29 (116)
MLEs: 1 (119)
  Blocking: 1 (4)
  Mastery: 0 (115)
  Migration: 0 (0)
Lists: Dirty=Empty  Purge=Empty  PendingASTs=Empty  PendingBASTs=Empty
Purge Count: 0  Refs: 1
Dead Node: 255
Recovery Pid: 21680  Master: 255  State: INACTIVE
Recovery Map:
Recovery Node State:


root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# ls
dlm_state  locking_state  mle_state  purge_list
root@HXY-CVK110:/sys/kernel/debug/o2dlm/7DA412FEB1374366B0F3C70025EB1437# cat mle_state
Dumping MLEs for Domain: 7DA412FEB1374366B0F3C70025EB1437
P00 BLK mas=255 new=255 evt=0 use=1 ref=2
Maybe=
Vote=3 4 5 6 7 8 9
Response=
Node=3 4 5 6 7 8 9
[Ocfs2-devel] [PATCH] MLE release issue
Hi,

During a test in which OCFS2 suffered a storage failure, a crash issue was
found. Below is the call trace from the crash. From the call trace, we can
see that an MLE's reference count is about to go negative, which triggered
a BUG_ON().

[143355.593258] Call Trace:
[143355.593268] [] dlm_put_mle_inuse+0x47/0x70 [ocfs2_dlm]
[143355.593276] [] dlm_get_lock_resource+0xac5/0x10d0 [ocfs2_dlm]
[143355.593286] [] ? ip_queue_xmit+0x14a/0x3d0
[143355.593292] [] ? kmem_cache_alloc+0x1e4/0x220
[143355.593300] [] ? dlm_wait_for_recovery+0x6c/0x190 [ocfs2_dlm]
[143355.593311] [] dlmlock+0x62d/0x16e0 [ocfs2_dlm]
[143355.593316] [] ? __alloc_skb+0x9b/0x2b0
[143355.593323] [] ? 0xc01f6000

I think I have probably found the root cause of this issue. Please see the
scenario below.

**Node 1**                            **Node 2**
                                      Storage failure
                                      An assert master message is sent to Node 1
Treat Node 2 as down
Assert master handler
decreases MLE reference count
Clean blocked MLE
decreases MLE reference count

In the above scenario, both dlm_assert_master_handler and
dlm_clean_block_mle decrease the MLE reference count; thus, in the following
dlm_get_lock_resource procedure, the reference count goes negative.

I propose a patch to solve this; please review it if you have time.

Signed-off-by: gechangwei <ge.chang...@h3c.com>
---
 dlm/dlmmaster.c | 8 +++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/dlm/dlmmaster.c b/dlm/dlmmaster.c
index b747854..0540414 100644
--- a/dlm/dlmmaster.c
+++ b/dlm/dlmmaster.c
@@ -2020,7 +2020,7 @@ ok:
 	spin_lock(&mle->spinlock);
 	if (mle->type == DLM_MLE_BLOCK || mle->type == DLM_MLE_MIGRATION)
-		extra_ref = 1;
+		extra_ref = test_bit(assert->node_idx, mle->maybe_map) ? 1 : 0;
 	else {
 		/* MASTER mle: if any bits set in the response map
 		 * then the calling node needs to re-assert to clear
@@ -3465,12 +3465,18 @@ static void dlm_clean_block_mle(struct dlm_ctxt *dlm,
 		mlog(0, "mle found, but dead node %u would not have been "
 		     "master\n", dead_node);
 		spin_unlock(&mle->spinlock);
+	} else if (mle->master != O2NM_MAX_NODES) {
+		mlog(ML_NOTICE, "mle found, master assert received, master has "
+		     "already been set to %d.\n", mle->master);
+		spin_unlock(&mle->spinlock);
 	} else {
 		/* Must drop the refcount by one since the assert_master will
 		 * never arrive. This may result in the mle being unlinked and
 		 * freed, but there may still be a process waiting in the
 		 * dlmlock path which is fine. */
 		mlog(0, "node %u was expected master\n", dead_node);
+		clear_bit(bit, mle->maybe_map);
 		atomic_set(&mle->woken, 1);
 		spin_unlock(&mle->spinlock);
 		wake_up(&mle->wq);
--

BR.
Changwei
[Ocfs2-devel] [PATCH] A bug in the end of DLM recovery
Hi,

I found an issue at the end of DLM recovery. When DLM recovery comes to the
end of the recovery procedure, it re-masters all locks on other nodes. Right
after a request message is sent to a node A (say), the new master node will
wait for node A's response forever. But node A may die just after receiving
the remaster request, before responding to the new master node. That leaves
the new master node waiting forever.

I think the patch below can solve this problem. Please have a review!

Subject: [PATCH] interrupt waiting for node's response if node dies

Signed-off-by: gechangwei <ge.chang...@h3c.com>
---
 dlm/dlmrecovery.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/dlm/dlmrecovery.c b/dlm/dlmrecovery.c
index 3d90ad7..5e455cb 100644
--- a/dlm/dlmrecovery.c
+++ b/dlm/dlmrecovery.c
@@ -679,6 +679,10 @@ static int dlm_remaster_locks(struct dlm_ctxt *dlm, u8 dead_node)
 				     dlm->name, ndata->node_num,
 				     ndata->state==DLM_RECO_NODE_DATA_RECEIVING ?
 				     "receiving" : "requested");
+				if (dlm_is_node_dead(dlm, ndata->node_num)) {
+					mlog(0, "%s: node %u died after requesting all locks.\n", dlm->name, ndata->node_num);
+					ndata->state = DLM_RECO_NODE_DATA_DONE;
+				}
 				all_nodes_done = 0;
 				break;
 			case DLM_RECO_NODE_DATA_DONE:
--

BR.
Chauncey
[Ocfs2-devel] remove a useless assignment before fence
Hi OCFS2 experts,

I found a strange segment of code in the FENCE procedure, in function
o2hb_stop_all_regions, cited below:

void o2hb_stop_all_regions(void)
{
	struct o2hb_region *reg;

	mlog(ML_ERROR, "stopping heartbeat on all active regions.\n");

	spin_lock(&o2hb_live_lock);

	list_for_each_entry(reg, &o2hb_all_regions, hr_all_item)
		reg->hr_unclean_stop = 1;

	spin_unlock(&o2hb_live_lock);
}

In the preceding code segment, every o2hb region's hr_unclean_stop is set
to 1 just before an emergency reboot, so it's hard to see whether any other
thread still gets a chance to act on its value. Can we just remove the
assignment?

BR.
Chauncey Ge

H3C Technologies Co., Limited
[Ocfs2-devel] what MLE wants to do?
Hi OCFS2 experts,

Recently, I have been working hard on the OCFS2 DLM related code. I think it
is an essential part of OCFS2 which must be mastered. But I am confused by
the MLE (master list entry) related procedures.

Would you please point me to some study material covering MLEs, or just
briefly give me a clue, via email, about what an MLE is meant to do?

Thanks a lot.

Br.
Gechangwei

H3C Technologies Co., Limited
[Ocfs2-devel] OCFS2 cache coherent quesiton
Hi OCFS2 experts,

According to the OCFS2 user guide, OCFS2 supports cache consistency, which
means that while one node writes to the shared disk, another node can get
the newest data just written.

I am working hard on the OCFS2 kernel code. I am confused because OCFS2 also
uses the page cache provided by the Linux kernel; however, the page cache
does not know anything about other nodes in the cluster.

My question is: if data is cached by the page cache on one node, how can
another node read the newest data? In this scenario, the page cache is dirty
and the on-disk data is stale. How can OCFS2 maintain cache consistency?

Thanks, best regards.
Gechangwei

H3C Technologies Co., Limited
[Ocfs2-devel] An issue on OCFS2 fsck tool
Hi OCFS2 experts,

I encountered an issue while checking an OCFS2 volume with the OCFS2 fsck
tool, aka fsck.ocfs2. I am not sure if I can get some help from you.

I found that as long as there was some dirty data held by the journal,
fsck.ocfs2 would start an O2HB thread and try to replay the journal. What
bothers me is that even after the file system checking and recovery
procedure is done, the O2HB thread is still active. I think this is not
reasonable, and I suspect this is a BUG in fsck.ocfs2.

Do you have any comments on this issue? I am looking forward to getting a
little help from you.

Many thanks.

Best regards.
Gechangwei

H3C Technologies Co., Limited