Damien Le Moal <damien.lem...@opensource.wdc.com> wrote on Sun, Sep 11, 2022 at 14:48:
>
> On 2022/09/11 15:33, Sam Li wrote:
> > Damien Le Moal <damien.lem...@opensource.wdc.com> wrote on Sun, Sep 11, 2022 at 13:31:
> [...]
> >>> +/*
> >>> + * zone management operations - Execute an operation on a zone
> >>> + */
> >>> +static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
> >>> +        int64_t offset, int64_t len) {
> >>> +#if defined(CONFIG_BLKZONED)
> >>> +    BDRVRawState *s = bs->opaque;
> >>> +    RawPosixAIOData acb;
> >>> +    int64_t zone_sector, zone_sector_mask;
> >>> +    const char *zone_op_name;
> >>> +    unsigned long zone_op;
> >>> +    bool is_all = false;
> >>> +
> >>> +    zone_sector = bs->bl.zone_sectors;
> >>> +    zone_sector_mask = zone_sector - 1;
> >>> +    if (offset & zone_sector_mask) {
> >>> +        error_report("sector offset %" PRId64 " is not aligned to zone size "
> >>> +                     "%" PRId64 "", offset, zone_sector);
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +    if (len & zone_sector_mask) {
> >>
> >> Linux allows SMR drives to have a smaller last zone. So this needs to be
> >> accounted for here. Otherwise, a zone operation that includes the last
> >> smaller zone would always fail. Something like this would work:
> >>
> >> if (((offset + len) < capacity &&
> >>     len & zone_sector_mask) ||
> >>     offset + len > capacity) {
> >>
> >
> > I see. I think the offset can be removed, like:
> > if ((len < capacity && len & zone_sector_mask) || len > capacity) {
> > Then if we use the previous zone's len for the last smaller zone, it
> > will be greater than its capacity.
>
> Nope, you cannot remove the offset since the zone operation may be for that
> last zone only, that is, offset == last zone start and len == last zone
> smaller size. In that case, len is always smaller than capacity.

Ok, I was mixing opening one zone with opening several zones.

>
> >
> > I will also include "opening the last zone" as a test case later.
>
> Note that you can create such a smaller last zone on the host with null_blk by
> specifying a device capacity that is *not* a multiple of the zone size.
>
> >
> >>> +        error_report("number of sectors %" PRId64 " is not aligned to zone size"
> >>> +                     " %" PRId64 "", len, zone_sector);
> >>> +        return -EINVAL;
> >>> +    }
> >>> +
> >>> +    switch (op) {
> >>> +    case BLK_ZO_OPEN:
> >>> +        zone_op_name = "BLKOPENZONE";
> >>> +        zone_op = BLKOPENZONE;
> >>> +        break;
> >>> +    case BLK_ZO_CLOSE:
> >>> +        zone_op_name = "BLKCLOSEZONE";
> >>> +        zone_op = BLKCLOSEZONE;
> >>> +        break;
> >>> +    case BLK_ZO_FINISH:
> >>> +        zone_op_name = "BLKFINISHZONE";
> >>> +        zone_op = BLKFINISHZONE;
> >>> +        break;
> >>> +    case BLK_ZO_RESET:
> >>> +        zone_op_name = "BLKRESETZONE";
> >>> +        zone_op = BLKRESETZONE;
> >>> +        break;
> >>> +    default:
> >>> +        g_assert_not_reached();
> >>> +    }
> >>> +
> >>> +    acb = (RawPosixAIOData) {
> >>> +        .bs             = bs,
> >>> +        .aio_fildes     = s->fd,
> >>> +        .aio_type       = QEMU_AIO_ZONE_MGMT,
> >>> +        .aio_offset     = offset,
> >>> +        .aio_nbytes     = len,
> >>> +        .zone_mgmt  = {
> >>> +            .zone_op = zone_op,
> >>> +            .zone_op_name = zone_op_name,
> >>> +            .all = is_all,
> >>> +        },
> >>> +    };
> >>> +
> >>> +    return raw_thread_pool_submit(bs, handle_aiocb_zone_mgmt, &acb);
> >>> +#else
> >>> +    return -ENOTSUP;
> >>> +#endif
> >>> +}
>
> --
> Damien Le Moal
> Western Digital Research
>
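
For reference, the range check discussed in this thread can be sketched as a
standalone helper. This is only a sketch under assumptions:
zone_range_is_valid() and its capacity parameter are hypothetical names that
do not appear in the patch, all values are assumed to be in the same unit, and
the zone size is assumed to be a power of two (as the mask arithmetic in the
patch already requires).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns true when the range
 * [offset, offset + len) is acceptable for a zone management operation on
 * a device whose last zone may be smaller than zone_size.
 */
static bool zone_range_is_valid(int64_t offset, int64_t len,
                                int64_t zone_size, int64_t capacity)
{
    /* Assumption: zone_size is a positive power of two. */
    assert(zone_size > 0 && (zone_size & (zone_size - 1)) == 0);

    int64_t zone_mask = zone_size - 1;

    /* Reject empty, negative, or out-of-bounds ranges without overflowing. */
    if (offset < 0 || len <= 0 || capacity < 0 || offset > capacity - len) {
        return false;
    }

    /* The start of the range must always be the start of a zone. */
    if (offset & zone_mask) {
        return false;
    }

    /*
     * The length must be a multiple of the zone size, unless the range
     * ends exactly at the device capacity (smaller last zone).
     */
    if (offset + len < capacity && (len & zone_mask)) {
        return false;
    }

    return true;
}
```

With a zone size of 8 and a capacity of 20, the last zone starts at 16 and
spans 4 units, so zone_range_is_valid(16, 4, 8, 20) is accepted even though
len is smaller than both the zone size and the capacity. This is the case
that is missed if offset is dropped from the condition.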