On 2025/7/8 08:32, Dave Chinner wrote:
On Fri, Jul 04, 2025 at 10:12:29AM +0930, Qu Wenruo wrote:
Currently all the filesystems implementing the
super_operations::shutdown() callback cannot afford to lose a device.

Thus fs_bdev_mark_dead() will just call the shutdown() callback for the
involved filesystem.

But this will no longer be the case: multi-device filesystems like
btrfs and bcachefs can handle certain device losses without
shutting down the whole filesystem.

To allow those multi-device filesystems to be integrated to use
fs_holder_ops:

- Replace super_operations::shutdown() with
   super_operations::remove_bdev()
   To better describe when the callback is called.

This conflates cause with action.

The shutdown callout is an action that the filesystem must execute,
whilst "remove bdev" is a cause notification that might require an
action to be taken.

Yes, the cause could be someone doing hot-unplug of the block
device, but it could also be something going wrong in software
layers below the filesystem. e.g. dm-thinp having an unrecoverable
corruption or ENOSPC errors.

We already have a "cause" notification: blk_holder_ops->mark_dead().

The generic fs action that is taken by this notification is
fs_bdev_mark_dead().  That action is to invalidate caches and shut
down the filesystem.

btrfs needs to do something different to a blk_holder_ops->mark_dead
notification. i.e. it needs an action that is different to
fs_bdev_mark_dead().

Indeed, this is how bcachefs already handles "single device
died" events for multi-device filesystems - see
bch2_fs_bdev_mark_dead().
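
To make the distinction concrete, here is a minimal userspace sketch of the two policies being discussed. It is a model only: the model_* types and functions are hypothetical stand-ins that mimic the shape of a mark_dead(bdev, surprise) callback, not the kernel definitions and not code from bcachefs, btrfs, or this patch. The redundancy check (a simple count of tolerated missing devices) is also an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; NOT the kernel's block_device/blk_holder_ops. */
struct model_fs {
	int missing_devs;           /* devices reported dead so far */
	int max_tolerated_missing;  /* assumed redundancy, e.g. 1 for a 2-copy mirror */
	bool shut_down;
};

struct model_bdev {
	const char *name;
	struct model_fs *holder;    /* filesystem that owns this device */
};

struct model_holder_ops {
	/* "cause" notification: this device is gone */
	void (*mark_dead)(struct model_bdev *bdev, bool surprise);
};

/* "action": stop the whole filesystem */
static void model_fs_shutdown(struct model_fs *fs)
{
	fs->shut_down = true;
}

/* Generic policy: any dead device shuts the filesystem down. */
static void generic_mark_dead(struct model_bdev *bdev, bool surprise)
{
	(void)surprise;
	assert(bdev != NULL && bdev->holder != NULL);
	model_fs_shutdown(bdev->holder);
}

/* Multi-device policy: shut down only once redundancy is exhausted. */
static void multidev_mark_dead(struct model_bdev *bdev, bool surprise)
{
	struct model_fs *fs;

	(void)surprise;
	assert(bdev != NULL && bdev->holder != NULL);
	fs = bdev->holder;
	fs->missing_devs++;
	if (fs->missing_devs > fs->max_tolerated_missing)
		model_fs_shutdown(fs);
}

static const struct model_holder_ops generic_holder_ops = {
	.mark_dead = generic_mark_dead,
};

static const struct model_holder_ops multidev_holder_ops = {
	.mark_dead = multidev_mark_dead,
};
```

In this model the same notification arrives in both cases; only the ops table the filesystem registers decides whether the resulting action is a full shutdown or a degraded continue.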

I do not think it's the correct way to go, especially when there is already fs_holder_ops.

We're always moving towards a more generic solution, rather than letting each individual fs do the same thing slightly differently.

Yes, the naming is not perfect and mixes cause and action, but the end result is still a more generic code base with less duplication.


Hence Btrfs should be doing the same thing as bcachefs. The
bdev_handle_ops structure exists precisely because it allows the
filesystem to handle block device events in the exact manner they
require....

- Add a new @bdev parameter to remove_bdev() callback
   To allow the fs to determine which device is missing, and do the
   proper handling when needed.

For the existing shutdown callback users, the change is minimal.

Except for the change in API semantics. ->shutdown is an external
shutdown trigger for the filesystem, not a generic "block device
removed" notification.

The problem is, nothing calls ->shutdown() outside of fs_bdev_mark_dead().

If the shutdown ioctl were handled through super_operations::shutdown, it would be more meaningful to split shutdown and device removal.

But that's not the case, and different filesystems even handle the shutdown flags slightly differently (not all of them even use a journal to protect their metadata).

Thanks,
Qu



Hooking blk_holder_ops->mark_dead means that btrfs can also provide
a ->shutdown implementation for when something external other than a
block device removal needs to shut down the filesystem....

-Dave.


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
