On 4.10.19 at 10:50, Anand Jain wrote:
> In open_fs_devices() we identify an alien device but we don't reset
> its device::name. So the progs device list does not show the device as
> missing, as shown in the script below.
> 
> mkfs.btrfs -fq /dev/sdd && mount /dev/sdd /btrfs
> mkfs.btrfs -fq -draid1 -mraid1 /dev/sdc /dev/sdb
> sleep 3 # avoid racing with udev's useless scans if needed
> btrfs dev add -f /dev/sdb /btrfs
> mount -o degraded /dev/sdc /btrfs1
> 
> No missing device:
> btrfs fi show -m /btrfs1
> Label: none  uuid: 3eb7cd50-4594-458f-9d68-c243cc49954d
>       Total devices 2 FS bytes used 128.00KiB
>       devid    1 size 12.00GiB used 1.26GiB path /dev/sdc
>       devid    2 size 12.00GiB used 1.26GiB path /dev/sdb
> 
> Signed-off-by: Anand Jain <anand.j...@oracle.com>
> ---
> PS: Fundamentally, it's the wrong approach for btrfs-progs to deduce the
> device missing state in userland instead of obtaining it from the kernel.
> I objected to the patch, but those patches still got merged; this bug is
> one of the side effects. Ironically, I wrote patches to read device_state
> from the kernel using ioctl, procfs and sysfs, but they didn't get enough
> attention to reach a merge.
> 
>  fs/btrfs/volumes.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 06ec3577c6b4..05ade8c7342b 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -803,10 +803,10 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>       disk_super = (struct btrfs_super_block *)bh->b_data;
>       devid = btrfs_stack_device_id(&disk_super->dev_item);
>       if (devid != device->devid)
> -             goto error_brelse;
> +             goto free_alien;
>  
>       if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
> -             goto error_brelse;
> +             goto free_alien;
>  

IMO a better approach is to return a specific error code from
btrfs_open_one_device() and do the deletion in open_fs_devices(). Otherwise
it's not apparent why list_for_each_entry_safe is used in one function when
the deletion happens in a different one (whose name, by the way, doesn't
suggest that a deletion is going on). Looking at the error, I think
-ENODEV or -ENXIO is appropriate.
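
To illustrate the split, here is a rough user-space sketch, not kernel
code. struct device, struct fs_devices, add_device() and the on_disk_devid
field are hypothetical stand-ins, and a plain singly linked list replaces
list_head. The open helper only reports -ENODEV for an alien device; the
caller, which owns the iteration, unlinks and frees it:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures (assumption). */
struct device {
	struct device *next;
	unsigned long long devid;         /* devid recorded in the in-memory list */
	unsigned long long on_disk_devid; /* devid read back from the superblock */
};

struct fs_devices {
	struct device *head;
	int num_devices;
	int open_devices;
};

/* Test/demo helper: prepend a device to the list. */
static int add_device(struct fs_devices *fs_devices,
		      unsigned long long devid,
		      unsigned long long on_disk_devid)
{
	struct device *device = malloc(sizeof(*device));

	if (!device)
		return -ENOMEM;
	device->devid = devid;
	device->on_disk_devid = on_disk_devid;
	device->next = fs_devices->head;
	fs_devices->head = device;
	fs_devices->num_devices++;
	return 0;
}

/* Only classifies the device; it never modifies the list. */
static int open_one_device(struct fs_devices *fs_devices,
			   struct device *device)
{
	if (device->on_disk_devid != device->devid)
		return -ENODEV; /* alien: the disk no longer matches this entry */

	fs_devices->open_devices++;
	return 0;
}

static int open_fs_devices(struct fs_devices *fs_devices)
{
	struct device **link = &fs_devices->head;

	while (*link) {
		struct device *device = *link;
		int ret = open_one_device(fs_devices, device);

		if (ret == -ENODEV) {
			/* Deletion happens where the iteration is visible. */
			*link = device->next;
			fs_devices->num_devices--;
			free(device);
			continue;
		}
		/* Any other failure is ignored, as in the existing loop. */
		link = &device->next;
	}

	/* Simplified result for this sketch: succeed if anything opened. */
	return fs_devices->open_devices ? 0 : -EINVAL;
}
```

With this shape the reason for the deletion-safe walk sits next to the
deletion itself, and the open helper keeps doing only what its name says.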

>       device->generation = btrfs_super_generation(disk_super);
>  
> @@ -845,6 +845,11 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
>  
>       return 0;
>  
> +free_alien:
> +     fs_devices->num_devices--;
> +     list_del(&device->dev_list);
> +     btrfs_free_device(device);
> +
>  error_brelse:
>       brelse(bh);
>       blkdev_put(bdev, flags);
> @@ -1329,11 +1334,13 @@ static int open_fs_devices(struct btrfs_fs_devices *fs_devices,
>                               fmode_t flags, void *holder)
>  {
>       struct btrfs_device *device;
> +     struct btrfs_device *tmp_device;
>       struct btrfs_device *latest_dev = NULL;
>  
>       flags |= FMODE_EXCL;
>  
> -     list_for_each_entry(device, &fs_devices->devices, dev_list) {
> +     list_for_each_entry_safe(device, tmp_device, &fs_devices->devices,
> +                              dev_list) {
>               /* Just open everything we can; ignore failures here */
>               if (btrfs_open_one_device(fs_devices, device, flags, holder))
>                       continue;
> 
