Hi Jiang,
I think there is another scenario that leads to the slot overwrite issue.
There are three nodes in the ocfs2 cluster. Node 3 has already mounted the
volume with slot 1. Node 1 and node 2 then mount the volume at the same time.
N1                                    N2
mount ocfs2 volume                    mount ocfs2 volume
ocfs2_fill_super()                    ocfs2_fill_super()
ocfs2_initialize_super                ocfs2_initialize_super
... ...                               ... ...
ocfs2_init_slot_info(osb);            ocfs2_init_slot_info(osb);
ocfs2_mount_volume                    ocfs2_mount_volume
ocfs2_super_lock                      ocfs2_super_lock
Gets the super lock                   Waits for the super lock
Finds slot 0 unused
from memory,
updates slot 0 with node num 1
... ...
locks journal 0
mount finished.
                                      Gets the super lock and
                                      also finds slot 0 unused
                                      from memory,
                                      updates slot 0 with node num 2.
                                      But journal 0 is locked by N1,
                                      so the mount hangs.
... ...                               ... ...
umount volume                         ... ...
clears slot 0                         ... ...
                                      Gets the journal 0 lock,
                                      mount finished.
                                      But here, slot 0 has been
                                      cleared by N1.
If N1 mounts again, it hits
the same condition as N2
and will hang.
In ocfs2_mount_volume(), I think the slot info should be refreshed from
disk after ocfs2_super_lock() has been taken.
static int ocfs2_mount_volume(struct super_block *sb)
{
	status = ocfs2_super_lock(osb, 1);
	......
+	status = ocfs2_refresh_slot_info(osb);
+	if (status < 0) {
+		mlog_errno(status);
+		goto leave;
+	}
	... ...
}
Another way is to move the ocfs2_init_slot_info() call out of
ocfs2_initialize_super and into ocfs2_mount_volume, in place of the
ocfs2_refresh_slot_info() call shown above.
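
To make the race easier to see, here is a minimal user-space sketch of the
scenario. It is not kernel code: struct disk, struct node, node_find_slot()
and replay_race() are hypothetical names invented for this illustration, and
the "super lock" is modeled only by the order of the calls. It compares
picking a slot from the stale in-memory copy with re-reading the slot map
after the lock has been taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_SLOTS    4
#define INVALID_NODE (-1)
#define NO_SLOT      (-1)

/* Stand-in for the on-disk slot map: slots[i] is the node owning slot i. */
struct disk { int slots[MAX_SLOTS]; };

/* Stand-in for one node's in-memory copy of the slot map. */
struct node { int node_num; int cached[MAX_SLOTS]; };

/* Models reading the slot map from disk (at init time or on refresh). */
static void node_read_slot_info(struct node *n, const struct disk *d)
{
    memcpy(n->cached, d->slots, sizeof(n->cached));
}

/*
 * Models slot selection while holding the super lock. If 'refresh' is
 * true, the cached map is re-read from disk first; otherwise the copy
 * taken before the lock was acquired is used as-is.
 */
static int node_find_slot(struct node *n, struct disk *d, bool refresh)
{
    if (refresh)
        node_read_slot_info(n, d);

    for (int i = 0; i < MAX_SLOTS; i++) {
        if (n->cached[i] == INVALID_NODE) {
            n->cached[i] = n->node_num;
            d->slots[i] = n->node_num;   /* write the slot to "disk" */
            return i;
        }
    }
    return NO_SLOT;
}

/* Replays the N1/N2 timeline and returns the slot N2 ends up with. */
static int replay_race(bool refresh)
{
    struct disk d;
    struct node n1 = { .node_num = 1 };
    struct node n2 = { .node_num = 2 };

    for (int i = 0; i < MAX_SLOTS; i++)
        d.slots[i] = INVALID_NODE;
    d.slots[1] = 3;                      /* node 3 already holds slot 1 */

    /* Both nodes snapshot the slot map before taking the super lock. */
    node_read_slot_info(&n1, &d);
    node_read_slot_info(&n2, &d);

    /* N1 gets the super lock first and takes slot 0. */
    int s1 = node_find_slot(&n1, &d, refresh);
    assert(s1 == 0);
    (void)s1;

    /* N2 gets the super lock afterwards. */
    return node_find_slot(&n2, &d, refresh);
}
```

In this model, replay_race(false) returns 0, so N2 collides with N1 on slot 0,
while replay_race(true) returns 2, the first slot that is really free.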
Message: 5
Date: Wed, 23 Dec 2015 18:23:36 +0800
From: jiangyiwen <[email protected]>
Subject: [Ocfs2-devel] [PATCH] ocfs2: fix slot overwritten if storage
 link down during mount
To: Andrew Morton <[email protected]>
Cc: Mark Fasheh <[email protected]>,
 [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="utf-8"
The following case will lead to a slot being overwritten.
N1                                    N2
mount ocfs2 volume, find and
allocate slot 0, then set
osb->slot_num to 0, begin to
write slot info to disk
                                      mount ocfs2 volume, wait for super lock
write block fail because of
storage link down, unlock
super lock
                                      got super lock and also allocate slot 0
                                      then unlock super lock
mount fail and then dismount,
since osb->slot_num is 0, try to
put invalid slot to disk. And it
will succeed if storage link
restores.
                                      N2 slot info is now overwritten
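
The timeline above can be replayed with a small user-space sketch. It is not
kernel code: struct mnt, mount_node(), dismount_node() and
replay_link_down() are hypothetical names, and the storage link is modeled
by a boolean. The reset_on_failure flag shows one possible guard (an
assumption for illustration, not taken from the text above): forgetting the
slot number when the slot write fails, so that a later dismount does not
clear a slot that another node has taken in the meantime.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS    4
#define INVALID_NODE (-1)
#define INVALID_SLOT (-1)

/* Stand-in for the per-mount state (the slot number kept by a node). */
struct mnt { int node_num; int slot_num; };

/*
 * Models slot allocation under the super lock. The slot number is
 * recorded before the write to disk, as in the scenario. Returns 0 on
 * success and -1 on failure.
 */
static int mount_node(struct mnt *m, int *disk, bool link_up,
                      bool reset_on_failure)
{
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (disk[i] != INVALID_NODE)
            continue;
        m->slot_num = i;
        if (link_up) {
            disk[i] = m->node_num;       /* slot info reaches the disk */
            return 0;
        }
        /* Write failed: optionally forget the slot we never owned. */
        if (reset_on_failure)
            m->slot_num = INVALID_SLOT;
        return -1;
    }
    return -1;
}

/* Models dismount: clear our slot on disk if we think we hold one. */
static void dismount_node(struct mnt *m, int *disk)
{
    if (m->slot_num == INVALID_SLOT)
        return;
    disk[m->slot_num] = INVALID_NODE;
    m->slot_num = INVALID_SLOT;
}

/* Replays the timeline and returns the on-disk owner of slot 0. */
static int replay_link_down(bool reset_on_failure)
{
    int disk[MAX_SLOTS];
    struct mnt n1 = { 1, INVALID_SLOT };
    struct mnt n2 = { 2, INVALID_SLOT };

    for (int i = 0; i < MAX_SLOTS; i++)
        disk[i] = INVALID_NODE;

    /* N1 allocates slot 0, but the write fails (storage link down). */
    int rc = mount_node(&n1, disk, false, reset_on_failure);
    assert(rc != 0);

    /* N2 then gets the super lock and also allocates slot 0. */
    rc = mount_node(&n2, disk, true, reset_on_failure);
    assert(rc == 0 && n2.slot_num == 0);
    (void)rc;

    /* N1's mount failed; it dismounts after the link is restored. */
    dismount_node(&n1, disk);
    return disk[0];
}
```

In this model, replay_link_down(false) returns INVALID_NODE, meaning N2's slot
was wiped by N1's dismount, while replay_link_down(true) returns 2.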
_______________________________________________
Ocfs2-devel mailing list
[email protected]
https://oss.oracle.com/mailman/listinfo/ocfs2-devel