Hi Yauhen,

 Thanks! More below.

On 03/19/2016 03:39 AM, Yauhen Kharuzhy wrote:
Hi all,

I am trying to get Anand's patchset for global hotspare functionality working.

It is working for me now, but I ran into a number of issues while applying
and testing the patches.

I took the latest versions of the patchset and its dependencies (latest as of
two weeks ago):

1) Anand's hotspare patchset:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/49985
2) Device delete by id series:
http://thread.gmane.org/gmane.comp.file-systems.btrfs/53208

 V2 has been sent out, based on the cleanups-4.6 branch, plus the
 preparatory patch found on the ML.

fc72066b27f3 btrfs: refactor btrfs_dev_replace_start for reuse
0b2322126a95 btrfs: keep sysfs target add in the last
3ecbc05149e0 btrfs: use fs_info directly


3) Two of Anand's patches about sysfs attributes (the hotspare series seems
to depend on them):
http://thread.gmane.org/gmane.comp.file-systems.btrfs/48943

 The sysfs patches are needed only to see the device state, or some
 enterprise scripts may need them. But the hot spare auto replace as such
 doesn't depend on them.

My kernel is the 4.4.5 stable version (I first tried the integration-4.6
branch of btrfs-next and had the same troubles as with 4.4.5).

 The wiki needs an update; please don't use btrfs-next.

So, good result: hotspare functionality works!

 Thanks for testing.

Bad result: it works for me only after some patching :)

 Thanks for working on it. Let me review.

General note: we definitely need FS-specific hotspares, because a common
case is to have a few RAIDs with different drive sizes (system root and
data RAIDs, for instance).

 Yep.

I have published my git tree with a working set of patches here:
https://bitbucket.org/jekhor/linux-btrfs/branch/4.4.5%2Bhotspare-without_degradable_check
And the corresponding btrfs-progs tree:
https://bitbucket.org/jekhor/btrfs-progs/commits/branch/devel-hotspare

 Hm, generally posting the independent patches to the ML will help.

These trees contain some changes related to RAID state monitoring; just
ignore them (I am going to start another discussion about RAID status
monitoring soon).


Issue 1.

First, the kernel oopsed when mounting the FS after unmounting it.
Unfortunately, I didn't save the logs for this. I found that fsid_kobj was
corrupted (it had a NULL ktype field) before the invocation of
btrfs_sysfs_add_fsid(). I could not find the source of the corruption: there
were no 'kobject release' events beforehand, the state_initialized field
remained true, and ktype was simply cleared (btrfs_ktype.release() wasn't
called before this either).

My printk-based trace looks like this, but the exact place where the value
changed was not constant, so this may be some kind of race condition:

Mar 11 01:07:31 grack12 kernel: [   33.694074] btrfs_commit_transaction:2133: fsid_kobj=ffff88001f020cd8, ktype=ffffffffa0219840
Mar 11 01:07:31 grack12 kernel: [   33.697967] btrfs_commit_transaction:2142: fsid_kobj=ffff88001f020cd8, ktype=ffffffffa0219840
Mar 11 01:07:31 grack12 kernel: [   33.697972] write_all_supers:3672: fsid_kobj=ffff88001f020cd8, ktype=ffffffffa0219840
Mar 11 01:07:31 grack12 kernel: [   33.697973] write_all_supers:3677: fsid_kobj=ffff88001f020cd8, ktype=ffffffffa0219840
Mar 11 01:07:31 grack12 kernel: [   33.697974] write_all_supers:3679: fsid_kobj=ffff88001f020cd8, ktype=ffffffffa0219840
Mar 11 01:07:31 grack12 kernel: [   33.702881] write_all_supers:3690: fsid_kobj=ffff88001f020cd8, ktype=          (null)
Mar 11 01:07:31 grack12 kernel: [   33.702884] write_all_supers:3699: fsid_kobj=ffff88001f020cd8, ktype=          (null)
Mar 11 01:07:31 grack12 kernel: [   33.702885] write_all_supers:3701: fsid_kobj=ffff88001f020cd8, ktype=          (null)
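
For anyone who wants to produce a similar function:line trace, here is a
minimal user-space sketch of such a helper. It is not the code that produced
the log above: struct kobject_sketch, TRACE_KOBJ and trace_example are
hypothetical names, and snprintf() stands in for printk() so the snippet
builds outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct kobject (only ktype kept). */
struct kobject_sketch {
	void *ktype;
};

/*
 * Format "<function>:<line>: fsid_kobj=<ptr>, ktype=<ptr>" into buf.
 * __func__ and __LINE__ expand at the call site, so each trace point
 * identifies itself without extra arguments.
 */
#define TRACE_KOBJ(buf, len, kobj) \
	snprintf((buf), (len), "%s:%d: fsid_kobj=%p, ktype=%p", \
		 __func__, __LINE__, (void *)(kobj), (kobj)->ktype)

/* Example trace point; returns the number of characters snprintf reports. */
static int trace_example(char *buf, size_t len, struct kobject_sketch *kobj)
{
	assert(buf != NULL && len > 0);
	return TRACE_KOBJ(buf, len, kobj);
}
```

Placing one such trace point before and after each suspicious call narrows
down where ktype changes, although with a race the reported location can
move between runs, as noted above.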

Bisecting pointed me to the simple commit 'b0f398c btrfs: optimize
btrfs_check_degradable() for calls outside of barrier', but I have no idea
how it could cause or trigger this issue...

 dev is stale here, my bad, that was a crap patch. Also we don't need
 this patch as part of hot spare / auto replace code. I have removed it.

So, after spending some time debugging, I decided to remove the second
patchset entirely, except for the 'btrfs: create a helper function to read
the disk super' commit, and the problem went away.


Issue 2.
When autoreplacing a drive with the hotspare starts, the kernel crashes in
the transaction handling code (inside btrfs_commit_transaction(), called by
the routines that initiate autoreplace). I 'fixed' this by removing the
closing of the bdev in btrfs_close_one_device_dont_free(), see
https://bitbucket.org/jekhor/linux-btrfs/commits/dfa441c9ec7b3833f6a5e4d0b6f8c678faea29bb?at=master
(the oops text is also attached). The bdev is closed after the replace by
btrfs_dev_replace_finishing(), so this is safe, but it doesn't seem
to be the right way.

 I have sent out V2. I don't see that issue with it;
 could you please try?

Issue 3.
btrfs_auto_replace_start() neither checks nor sets the
fs_info->mutually_exclusive_operation_running flag, as the ioctl handler for
DEV_REPLACE_START does. This causes race conditions in some cases, see
https://bitbucket.org/jekhor/linux-btrfs/commits/834bebb96a2f6b5ef5856836839e5ce7830ec745?at=master
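
To illustrate the kind of guard meant here, below is a minimal user-space
sketch of a check-and-set flag built on C11 atomics. It is not the kernel
code and not the linked commit: struct fs_info_sketch, excl_op_try_start(),
excl_op_end() and auto_replace_start_sketch() are hypothetical names, and
the -1 return value is only a placeholder for "another exclusive operation
is already running".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for fs_info; only the exclusive-operation flag. */
struct fs_info_sketch {
	atomic_int mutually_exclusive_operation_running;
};

/*
 * Check and set the flag in one atomic step. atomic_exchange() returns the
 * previous value, so a nonzero result means someone else already holds it.
 */
static bool excl_op_try_start(struct fs_info_sketch *fs)
{
	return atomic_exchange(&fs->mutually_exclusive_operation_running, 1) == 0;
}

/* Release the flag; it must have been held by the caller. */
static void excl_op_end(struct fs_info_sketch *fs)
{
	int prev = atomic_exchange(&fs->mutually_exclusive_operation_running, 0);

	assert(prev == 1);
	(void)prev;
}

/* Hypothetical auto-replace entry point that takes the same guard. */
static int auto_replace_start_sketch(struct fs_info_sketch *fs)
{
	if (!excl_op_try_start(fs))
		return -1;	/* placeholder: an exclusive operation is running */

	/* ... start the replace here ... */

	excl_op_end(fs);
	return 0;
}
```

Because the check and the set happen in a single exchange, two callers
cannot both see the flag as free, which is the property the auto-replace
path lacks when it skips the flag.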

 There were some fixes to the main btrfs_auto_replace_start() before
 (not the v2). So, to avoid such a disconnect, I have sent out a patch
 set which does not v2 the function; instead, it refactors the original:

    btrfs: refactor btrfs_dev_replace_start for reuse

 With this, the hot spare V2 will apply nicely, and I have found it
 to be stable.

Issue 4.
The autoreplacement code doesn't start replacing when mounting in degraded
mode, even if a hotspare exists. We need this feature, so I added a check for
missing drives as well, not only failed ones, when checking whether a
replacement is needed.

 No. No. No, please don't do that; it would lead to trouble in handling
 slow devices. I purposely didn't do it.

 Also kindly note that, in a volume manager / storage context, things
 should continue to work in degraded mode automatically, and they
 shouldn't wait for the user's opinion. If they don't do that, then
 there is no point in having a volume manager. But as of now btrfs has
 already made degraded a non-default choice. Something else new is
 needed, and it can be a separate RFC, not part of this
 patch set.


 Please try. V2 sent out.

Thanks, Anand

See
https://bitbucket.org/jekhor/linux-btrfs/commits/4c9ddb58d979ae5a232aeaa1fbe3d26373210768?at=master
and
https://bitbucket.org/jekhor/linux-btrfs/commits/be5e2524c10f2b4047da80f9f85b54c6006d4273?at=master




--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html