On Mon, Feb 05, 2024 at 11:41:04AM +0900, Damien Le Moal wrote:
> On 2/5/24 11:19, Ming Lei wrote:
> >>>> +static inline void blk_zone_wplug_add_bio(struct blk_zone_wplug *zwplug,
> >>>> +                                          struct bio *bio, unsigned int 
> >>>> nr_segs)
> >>>> +{
> >>>> +        /*
> >>>> +         * Keep a reference on the BIO request queue usage. This 
> >>>> reference will
> >>>> +         * be dropped either if the BIO is failed or after it is issued 
> >>>> and
> >>>> +         * completes.
> >>>> +         */
> >>>> +        percpu_ref_get(&bio->bi_bdev->bd_disk->queue->q_usage_counter);
> >>>
> >>> It is fragile to get nested usage_counter, and same with 
> >>> grabbing/releasing it
> >>> from different contexts or even functions, and it could be much better to 
> >>> just
> >>> let block layer maintain it.
> >>>
> >>> From patch 23's change:
> >>>
> >>> +  * Zoned block device information. Reads of this information must be
> >>> +  * protected with blk_queue_enter() / blk_queue_exit(). Modifying this
> >>>
> >>> Anytime if there is in-flight bio, the block device is opened, so both 
> >>> gendisk and
> >>> request_queue are live, so not sure if this .q_usage_counter protection
> >>> is needed.
> >>
> >> Hannes also commented about this. Let me revisit this.
> > 
> > I think only queue re-configuration(blk_revalidate_zone) requires the
> > queue usage counter. Otherwise, bdev open()/close() should work just
> > fine.
> 
> I want to check FS case though. No clear if mounting FS that supports zone
> (btrfs) also uses bdev open ?

btrfs '-O zoned' shouldn't be one exception:

mount -O zoned /dev/ublkb0 /mnt

  b'blkdev_get_whole'
  b'bdev_open_by_dev'
  b'bdev_open_by_path'
  b'btrfs_scan_one_device'
  b'btrfs_get_tree'
  b'vfs_get_tree'
  b'fc_mount'
  b'btrfs_get_tree'
  b'vfs_get_tree'
  b'path_mount'
  b'__x64_sys_mount'
  b'do_syscall_64'
  b'entry_SYSCALL_64_after_hwframe'
  b'[unknown]'
    1

> 
> >>>> +        /*
> >>>> +         * blk-mq devices will reuse the reference on the request queue 
> >>>> usage
> >>>> +         * we took when the BIO was plugged, but the submission path for
> >>>> +         * BIO-based devices will not do that. So drop this reference 
> >>>> here.
> >>>> +         */
> >>>> +        if (bio->bi_bdev->bd_has_submit_bio)
> >>>> +                blk_queue_exit(bio->bi_bdev->bd_disk->queue);
> >>>
> >>> But I don't see where this reference is reused for blk-mq in this patch,
> >>> care to point it out?
> >>
> >> This patch modifies blk_mq_submit_bio() to add a "goto new_request" at the 
> >> top
> >> for any BIO flagged with BIO_FLAG_ZONE_WRITE_PLUGGING. So when a plugged 
> >> BIO is
> >> unplugged and submitted again, the reference that was taken in
> >> blk_zone_wplug_add_bio() is reused for the new request for that BIO.
> > 
> > OK, this reference reuse may be worse, because queue freeze can't prevent 
> > new
> > write zoned bio from being submitted any more given only percpu_ref_get() is
> > called for all write zoned bios.
> 
> New BIOs (BIOS that have never been plugged yet) will go through the normal
> blk_queue_enter() in blk_mq_submit_bio(), so they will be stopped there if
> another context asked for a queue freeze. I do not think there is any issue 
> with
> how things are currently done (we tested that *a lot* with many different 
> drives
> and drive configs with DM etc). Reference counting as it is is OK, even though
> it most likely be simplified. I am looking at that now.

Indeed, new zoned write bio is still covered by blk-mq's queue reference, and 
the
trick is just played on old plugged bio.

Thanks,
Ming


Reply via email to