On 04/14/2016 05:22 PM, Yauhen Kharuzhy wrote:
> On Thu, Apr 14, 2016 at 04:45:11PM +0800, Anand Jain wrote:
>> Thanks for the report! More below..
>> You may use the simpler devmgt tool, https://github.com/asj/devmgt
> Thanks, will try.
>> You are failing the replace-target, presumably while the replace is
>> still running. However, note that this patch-set does not fail the
>> replace-target on errors (as of now I have no idea how to do that
>> without leading to a messy situation), so it follows the original
>> code path, the same as without this patch.
>> Next, originally, without this patch-set, we don't close any device
>> on errors. So when you delete the device at the block layer and
>> re-attach it (scan), you most probably get a new device path to the
>> block device (which rather defeats the idea of testing an
>> intermittently disappearing device). So I doubt whether the test
>> case is reliable, whether the above panic is btrfs related, and
>> whether it is related to this patch-set.
> No, it is fixed by my latest patch (the one about the s_bdev field in
> the superblock). The actual sequence which leads to the oops is:
> 1) FS is mounted, s_bdev is NULL
> 2) the failed device is closed, s_bdev is untouched
> 3) the missing device is replaced, s_bdev is set to non-NULL (the
>    bdev of the replacement device)
> 4) at the second device close, s_bdev is "changed" to the first
>    device from the device list, but that is... just some device,
>    because the closed device has not been deleted from the list yet!
> 5) after the device close, s_bdev points to an invalid bdev.
> 6) umount -> sync_filesystem() -> sync_blockdev(s_bdev) -> OOPS.
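
For illustration only, here is a tiny userspace sketch of the ordering
problem in steps 4)-5) above. The names (struct device, dev_list,
s_bdev) are simplified stand-ins, not the real btrfs structures; the
point is only that picking a new s_bdev from the device list before
the device being closed has been unlinked leaves s_bdev dangling.

/*
 * Illustration only: simplified stand-ins, not the real btrfs code.
 * The close path repoints s_bdev at the head of the device list
 * *before* the device being closed has been unlinked, so s_bdev can
 * end up pointing at the very bdev that is about to be freed.
 */
#include <stdio.h>
#include <stdlib.h>

struct bdev { const char *name; };

struct device {
	struct device *next;
	struct bdev *bdev;
};

static struct device *dev_list;	/* stand-in for the fs device list */
static struct bdev *s_bdev;	/* stand-in for sb->s_bdev */

/* Buggy order: pick a replacement s_bdev from the list first ... */
static void close_device_buggy(struct device *dev)
{
	s_bdev = dev_list ? dev_list->bdev : NULL;  /* may still be dev->bdev! */

	/* ... and only then unlink and free the device being closed. */
	struct device **p = &dev_list;
	while (*p && *p != dev)
		p = &(*p)->next;
	if (*p)
		*p = dev->next;
	free(dev->bdev);
	free(dev);
}

int main(void)
{
	struct device *d = calloc(1, sizeof(*d));
	d->bdev = calloc(1, sizeof(*d->bdev));
	d->bdev->name = "replace target";
	dev_list = d;

	s_bdev = d->bdev;	/* step 3): replace sets s_bdev */
	close_device_buggy(d);	/* steps 4)-5): s_bdev now dangles */

	/* step 6): any dereference of s_bdev from here on (as a sync
	 * on umount would do) is a use-after-free. */
	printf("s_bdev still points at freed memory: %p\n", (void *)s_bdev);
	return 0;
}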
This is wrong. It should be the other way around. That is, s_bdev
should continue to be NULL, and if s_bdev stays NULL the sync thread
will fail safely.
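
A minimal sketch of why NULL is the safe state: the umount-time sync
can simply skip a NULL block device, much like the NULL check the
generic sync_blockdev() path already has. The names below are
simplified stand-ins, not the actual kernel functions.

/*
 * Sketch only: simplified stand-ins, not the real kernel code. If
 * s_bdev is left NULL, the umount-time sync has nothing to do and
 * returns early instead of dereferencing a stale pointer.
 */
#include <stdio.h>

struct bdev { const char *name; };

static int sync_bdev(struct bdev *bdev)
{
	if (!bdev)
		return 0;	/* fail-safe: nothing to sync */
	printf("syncing %s\n", bdev->name);
	return 0;
}

int main(void)
{
	struct bdev *s_bdev = NULL;	/* s_bdev kept NULL, as argued above */
	return sync_bdev(s_bdev);	/* returns cleanly, no oops */
}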
The diff sent in the other thread will fix this.
Thanks, Anand