Re: [PATCH v7 2/4] qcow2: add configurations for zoned format extension

2024-02-19 Thread Damien Le Moal
ly is. So I would be tempted to say that defaulting to 0 to force the user to specify a zone size would be safer. But if you really want a non-0 default, then maybe 256MB or so may be OK as that is the most commonly used value out there for zoned storage (there are literally millions of SMR drives running in production systems out there and NVMe ZNS devices are not widely used yet). -- Damien Le Moal Western Digital Research

Re: [PATCH v6 3/4] qcow2: add zoned emulation capability

2023-11-26 Thread Damien Le Moal
+uint32_t index; > +int ret; > +int64_t zone_size_mask = bs->bl.zone_size - 1; > +int64_t iov_len = 0; > +int64_t len = 0; > + > +if (*offset >= capacity) { > +error_report("*offset %" PRId64 " is equal to or greater than the" > + "device capacity %" PRId64 "", *offset, capacity); > +return -EINVAL; > +} > + > +/* offset + len should not pass the end of that zone starting from > offset */ > +if (*offset & zone_size_mask) { > +error_report("sector offset %" PRId64 " is not aligned to zone size " > + "%" PRId32 "", *offset / 512, bs->bl.zone_size / 512); > +return -EINVAL; > +} > + > +int64_t wg = bs->bl.write_granularity; > +int64_t wg_mask = wg - 1; > +for (int i = 0; i < qiov->niov; i++) { > +iov_len = qiov->iov[i].iov_len; > +if (iov_len & wg_mask) { > +error_report("len of IOVector[%d] %" PRId64 " is not aligned to " > + "block size %" PRId64 "", i, iov_len, wg); > +return -EINVAL; > +} > +} > +len = qiov->size; > +index = *offset / bs->bl.zone_size; > + > +if ((len >> BDRV_SECTOR_BITS) > bs->bl.max_append_sectors) { > +return -ENOTSUP; > +} > + > +qemu_co_mutex_lock(&bs->wps->colock); > +uint64_t wp_i = bs->wps->wp[index]; > +ret = qcow2_co_pwritev_part(bs, wp_i, len, qiov, 0, 0); > +if (ret == 0) { > +*offset = wp_i; > +} else { > +error_report("qcow2: zap failed"); > +} > + > +qemu_co_mutex_unlock(&bs->wps->colock); > +return ret; > +} > + > static int coroutine_fn GRAPH_RDLOCK > qcow2_co_copy_range_from(BlockDriverState *bs, > BdrvChild *src, int64_t src_offset, > @@ -6383,6 +7116,10 @@ BlockDriver bdrv_qcow2 = { > .bdrv_co_pwritev_compressed_part= qcow2_co_pwritev_compressed_part, > .bdrv_make_empty= qcow2_make_empty, > > +.bdrv_co_zone_report= qcow2_co_zone_report, > +.bdrv_co_zone_mgmt= qcow2_co_zone_mgmt, > +.bdrv_co_zone_append= qcow2_co_zone_append, > + > .bdrv_snapshot_create = qcow2_snapshot_create, > .bdrv_snapshot_goto = qcow2_snapshot_goto, > .bdrv_snapshot_delete = qcow2_snapshot_delete, > diff --git a/block/trace-events b/block/trace-events > index 8e789e1f12..e35222e079 100644 > --- a/block/trace-events > +++ b/block/trace-events > @@ -82,6 +82,8 @@ qcow2_writev_data(void *co, uint64_t offset) "co %p offset > 0x%" PRIx64 > qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int64_t bytes) "co > %p offset 0x%" PRIx64 " bytes %" PRId64 > qcow2_pwrite_zeroes(void *co, int64_t offset, int64_t bytes) "co %p offset > 0x%" PRIx64 " bytes %" PRId64 > qcow2_skip_cow(void *co, uint64_t offset, int nb_clusters) "co %p offset > 0x%" PRIx64 " nb_clusters %d" > +qcow2_wp_tracking(int index, uint64_t wp) "wps[%d]: 0x%" PRIx64 > +qcow2_imp_open_zones(uint8_t op, int nrz) "nr_imp_open_zones after op 0x%x: > %d" > > # qcow2-cluster.c > qcow2_alloc_clusters_offset(void *co, uint64_t offset, int bytes) "co %p > offset 0x%" PRIx64 " bytes %d" > diff --git a/include/qemu/queue.h b/include/qemu/queue.h > index e029e7bf66..3f0a48740e 100644 > --- a/include/qemu/queue.h > +++ b/include/qemu/queue.h > @@ -179,6 +179,7 @@ struct { > \ > #define QLIST_EMPTY(head)((head)->lh_first == NULL) > #define QLIST_FIRST(head)((head)->lh_first) > #define QLIST_NEXT(elm, field) ((elm)->field.le_next) > +#define QLIST_LAST(head, field) (*(head)->lh_first->field.le_prev) > > > /* -- Damien Le Moal Western Digital Research

Re: [PATCH v5 1/4] docs/qcow2: add the zoned format feature

2023-10-30 Thread Damien Le Moal
rivers may be written with sectors in mind, but any time we mix units > in the public interface, it gets awkward. I'd lean towards having > bytes here, with a requirement that it be a multiple of 512. Agreed. Let's use bytes to avoid confusion. -- Damien Le Moal Western Digital Research

Re: [PATCH v2 3/4] qcow2: add zoned emulation capability

2023-08-29 Thread Damien Le Moal
On 8/29/23 15:27, Sam Li wrote: > Damien Le Moal 于2023年8月29日周二 14:06写道: >> >> On 8/28/23 20:55, Sam Li wrote: >>>>> +/* close one implicitly open zones to make it available */ >>>>> +for (int i = s->zoned_header.zone_nr_c

Re: [PATCH v2 3/4] qcow2: add zoned emulation capability

2023-08-28 Thread Damien Le Moal
> FULL zone wp > zone start -> CLOSED And make sure that all closed zones are counted as the initial number of active zones. The initial number of open zones will always be 0. So it is easy :) -- Damien Le Moal Western Digital Research

Re: [PATCH v2 2/4] qcow2: add configurations for zoned format extension

2023-08-28 Thread Damien Le Moal
On 8/28/23 19:18, Sam Li wrote: > Damien Le Moal 于2023年8月28日周一 18:13写道: >> >> On 8/28/23 18:22, Sam Li wrote: >>> Stefan Hajnoczi 于2023年8月21日周一 21:31写道: >>>> >>>> On Mon, Aug 14, 2023 at 04:58:00PM +0800, Sam Li wrote: >>>>> di

Re: [PATCH v2 2/4] qcow2: add configurations for zoned format extension

2023-08-28 Thread Damien Le Moal
ones. For this particular case though, given that this is QCow2 emulation, limiting ourselves to the same zone capacity for all zones is I think fine. But that should be clearly stated somewhere may be... > >> >>> +uint32_t nr_zones; >> >> Is this field necessary since it can be derived from other image >> options: nr_zones = DIV_ROUND_UP(total_length, zone_capacity)? > > It can be dropped. I added this for reducing duplication. Thanks! -- Damien Le Moal Western Digital Research

Re: [PATCH] block/file-posix: fix update_zones_wp() caller

2023-08-24 Thread Damien Le Moal
On 8/25/23 12:05, Sam Li wrote: > Damien Le Moal 于2023年8月25日周五 07:49写道: >> >> On 8/25/23 02:39, Sam Li wrote: >>> When the zoned requests that may change wp fail, it needs to >>> update only wps of the zones within the range of the requests >>> for not di

Re: [PATCH] block/file-posix: fix update_zones_wp() caller

2023-08-24 Thread Damien Le Moal
y change there correct in-memory wp to a wrong value. I think this also should be: update_zones_wp(bs, s->fd, offset, 1); > error_report("ioctl %s failed %d", op_name, ret); > return ret; > } -- Damien Le Moal Western Digital Research

Re: [PATCH v2] block/file-posix: fix wps checking in raw_co_prw

2023-06-07 Thread Damien Le Moal
complete(bs, *s->offset >> BDRV_SECTOR_BITS); } /* Advance the wp if needed */ if (offset + bytes > *wp) { *wp = offset + bytes; } unlock: qemu_co_mutex_unlock(&wps->colock); } #endif And making this entire b

Re: [PATCH v6 1/4] file-posix: add tracking of the zone write pointers

2023-03-15 Thread Damien Le Moal
On 3/15/23 21:59, Sam Li wrote: > Damien Le Moal 于2023年3月14日周二 11:49写道: >> >> On 3/14/23 11:23, Dmitry Fomichev wrote: >>>> @@ -3339,10 +3473,27 @@ static int coroutine_fn >>>> raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op, >>>&g

Re: [PATCH v6 1/4] file-posix: add tracking of the zone write pointers

2023-03-13 Thread Damien Le Moal
zones that may exist on the > device and make the data stored in those zones unreadable. R/O zones need to > be > skipped in this loop. And offline zones need to be skipped as well. -- Damien Le Moal Western Digital Research

Re: Zoned storage support in libvirt

2023-01-30 Thread Damien Le Moal
asis. > > We could still have it "do what I mean" by default though. eg the > virtio-blk setting defaults could imply "match the host", so we get > effectively a tri-state (zoned=on/off/auto) What would zoned=on mean ? If the backend is not zoned, virtio will expose a regular block device to the guest as it should. For zoned=auto, same, I am not sure what that would achieve. If the backend is zoned, it will be seen as zoned by the guest. If the backend is a regular disk, it will be exposed as a regular disk. So what would this option achieve ? And for zoned=off, I guess you would want to ignore a backend drive if it is zoned ? > > With regards, > Daniel -- Damien Le Moal Western Digital Research

Re: [PATCH v4 2/3] block: introduce zone append write for zoned devices

2022-10-16 Thread Damien Le Moal
\ > QEMU_AIO_ZONE_REPORT | \ > - QEMU_AIO_ZONE_MGMT) > + QEMU_AIO_ZONE_MGMT | \ > + QEMU_AIO_ZONE_APPEND) > > /* AIO flags */ > #define QEMU_AIO_MISALIGNED 0x1000 > diff --git a/include/sysemu/block-backend-io.h > b/include/sysemu/block-backend-io.h > index 1b5fc7db6b..ff9f777f52 100644 > --- a/include/sysemu/block-backend-io.h > +++ b/include/sysemu/block-backend-io.h > @@ -52,6 +52,9 @@ BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t > offset, > BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op, >int64_t offset, int64_t len, >BlockCompletionFunc *cb, void *opaque); > +BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset, > +QEMUIOVector *qiov, BdrvRequestFlags flags, > +BlockCompletionFunc *cb, void *opaque); > BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int64_t > bytes, > BlockCompletionFunc *cb, void *opaque); > void blk_aio_cancel_async(BlockAIOCB *acb); > @@ -173,6 +176,12 @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, > BlockZoneOp op, >int64_t offset, int64_t len); > int generated_co_wrapper blk_zone_mgmt(BlockBackend *blk, BlockZoneOp op, > int64_t offset, int64_t len); > +int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset, > +QEMUIOVector *qiov, > +BdrvRequestFlags flags); > +int generated_co_wrapper blk_zone_append(BlockBackend *blk, int64_t *offset, > + QEMUIOVector *qiov, > + BdrvRequestFlags flags); > > int generated_co_wrapper blk_pdiscard(BlockBackend *blk, int64_t offset, >int64_t bytes); -- Damien Le Moal Western Digital Research

Re: [PATCH v4 1/3] file-posix: add the tracking of the zones write pointers

2022-10-16 Thread Damien Le Moal
mmon.h > @@ -92,6 +92,14 @@ typedef struct BlockZoneDescriptor { > BlockZoneCondition cond; > } BlockZoneDescriptor; > > +/* > + * Track write pointers of a zone in bytes. > + */ > +typedef struct BlockZoneWps { > +QemuMutex lock; > +uint64_t wp[]; > +} BlockZoneWps; > + > typedef struct BlockDriverInfo { > /* in bytes, 0 if irrelevant */ > int cluster_size; > @@ -205,6 +213,12 @@ typedef enum { > #define BDRV_SECTOR_BITS 9 > #define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS) > > +/* > + * Get the first most significant bit of wp. If it is zero, then > + * the zone type is SWR. > + */ > +#define BDRV_ZT_IS_CONV(wp)(wp & (1ULL << 63)) > + > #define BDRV_REQUEST_MAX_SECTORS MIN_CONST(SIZE_MAX >> BDRV_SECTOR_BITS, \ > INT_MAX >> BDRV_SECTOR_BITS) > #define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS) > diff --git a/include/block/block_int-common.h > b/include/block/block_int-common.h > index 37dddc603c..e3136146aa 100644 > --- a/include/block/block_int-common.h > +++ b/include/block/block_int-common.h > @@ -857,6 +857,9 @@ typedef struct BlockLimits { > > /* device capacity expressed in bytes */ > int64_t capacity; > + > +/* array of write pointers' location of each zone in the zoned device. */ > +BlockZoneWps *wps; > } BlockLimits; > > typedef struct BdrvOpBlocker BdrvOpBlocker; -- Damien Le Moal Western Digital Research

Re: [PATCH v12 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-10-16 Thread Damien Le Moal
L; > +zones = g_new(BlockZoneDescriptor, nr_zones); > +ret = blk_zone_report(blk, offset, &nr_zones, zones); > +if (ret < 0) { > +printf("zone report failed: %s\n", strerror(-ret)); > +} else { > +for (int i = 0; i < nr_zones; ++i) { &g

Re: [RFC v3 1/2] include: update virtio_blk headers from Linux 5.19-rc2+

2022-10-16 Thread Damien Le Moal
y Open */ >> +#define VIRTIO_BLK_ZS_IOPEN    2 >> +/* Explicitly Open */ >> +#define VIRTIO_BLK_ZS_EOPEN    3 >> +/* Closed */ >> +#define VIRTIO_BLK_ZS_CLOSED   4 >> +/* Read-Only */ >> +#define VIRTIO_BLK_ZS_RDONLY   13 >> +/* Full */ >> +#define VIRTIO_BLK_ZS_FULL 14 >> +/* Offline */ >> +#define VIRTIO_BLK_ZS_OFFLINE  15 >> + >>  /* Unmap this range (only valid for write zeroes command) */ >>  #define VIRTIO_BLK_WRITE_ZEROES_FLAG_UNMAP 0x0001 >>   >> @@ -198,4 +300,11 @@ struct virtio_scsi_inhdr { >>  #define VIRTIO_BLK_S_OK0 >>  #define VIRTIO_BLK_S_IOERR 1 >>  #define VIRTIO_BLK_S_UNSUPP2 >> + >> +/* Error codes that are specific to zoned block devices */ >> +#define VIRTIO_BLK_S_ZONE_INVALID_CMD 3 >> +#define VIRTIO_BLK_S_ZONE_UNALIGNED_WP    4 >> +#define VIRTIO_BLK_S_ZONE_OPEN_RESOURCE   5 >> +#define VIRTIO_BLK_S_ZONE_ACTIVE_RESOURCE 6 >> + >>  #endif /* _LINUX_VIRTIO_BLK_H */ > -- Damien Le Moal Western Digital Research

Re: [PATCH v3 1/3] file-posix:add the tracking of the zones write pointers

2022-10-13 Thread Damien Le Moal
On 2022/10/13 16:08, Sam Li wrote: > Damien Le Moal 于2022年10月13日周四 13:13写道: >> >> On 10/10/22 11:33, Sam Li wrote: >>> Since Linux doesn't have a user API to issue zone append operations to >>> zoned devices from user space, the file-posix driver is modified

Re: [PATCH v3 2/3] block: introduce zone append write for zoned devices

2022-10-13 Thread Damien Le Moal
On 2022/10/13 16:27, Sam Li wrote: > Damien Le Moal 于2022年10月13日周四 13:55写道: >> >> On 10/10/22 11:33, Sam Li wrote: >>> A zone append command is a write operation that specifies the first >>> logical block of a zone as the write position. When writing to a zoned &

Re: [PATCH v3 2/3] block: introduce zone append write for zoned devices

2022-10-12 Thread Damien Le Moal
ackend-io.h > +++ b/include/sysemu/block-backend-io.h > @@ -51,6 +51,9 @@ BlockAIOCB *blk_aio_zone_report(BlockBackend *blk, int64_t > offset, > BlockAIOCB *blk_aio_zone_mgmt(BlockBackend *blk, BlockZoneOp op, >int64_t offset, int64_t len, >BlockCompletionFunc *cb, void *opaque); > +BlockAIOCB *blk_aio_zone_append(BlockBackend *blk, int64_t *offset, > +QEMUIOVector *qiov, BdrvRequestFlags flags, > +BlockCompletionFunc *cb, void *opaque); > BlockAIOCB *blk_aio_pdiscard(BlockBackend *blk, int64_t offset, int64_t > bytes, > BlockCompletionFunc *cb, void *opaque); > void blk_aio_cancel_async(BlockAIOCB *acb); > @@ -172,6 +175,12 @@ int coroutine_fn blk_co_zone_mgmt(BlockBackend *blk, > BlockZoneOp op, >int64_t offset, int64_t len); > int generated_co_wrapper blk_zone_mgmt(BlockBackend *blk, BlockZoneOp op, > int64_t offset, int64_t len); > +int coroutine_fn blk_co_zone_append(BlockBackend *blk, int64_t *offset, > +QEMUIOVector *qiov, > +BdrvRequestFlags flags); > +int generated_co_wrapper blk_zone_append(BlockBackend *blk, int64_t *offset, > + QEMUIOVector *qiov, > + BdrvRequestFlags flags); > > int generated_co_wrapper blk_pdiscard(BlockBackend *blk, int64_t offset, >int64_t bytes); -- Damien Le Moal Western Digital Research

Re: [PATCH v11 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-10-12 Thread Damien Le Moal
On 10/13/22 14:33, Sam Li wrote: > Damien Le Moal 于2022年10月13日周四 12:41写道: >> >> On 10/10/22 11:21, Sam Li wrote: >>> Add a new zoned_host_device BlockDriver. The zoned_host_device option >>> accepts only zoned host block devices. By adding zone management >&

Re: [PATCH v3 1/3] file-posix:add the tracking of the zones write pointers

2022-10-12 Thread Damien Le Moal
, > }; > > diff --git a/include/block/block-common.h b/include/block/block-common.h > index 882de6825e..b8b2dba64a 100644 > --- a/include/block/block-common.h > +++ b/include/block/block-common.h > @@ -93,6 +93,14 @@ typedef struct BlockZoneDescriptor { > BlockZoneCondition cond; > } BlockZoneDescriptor; > > +/* > + * Track write pointers of a zone in bytes. > + */ > +typedef struct BlockZoneWps { > +QemuMutex lock; > +uint64_t wp[]; > +} BlockZoneWps; > + > typedef struct BlockDriverInfo { > /* in bytes, 0 if irrelevant */ > int cluster_size; > @@ -206,6 +214,12 @@ typedef enum { > #define BDRV_SECTOR_BITS 9 > #define BDRV_SECTOR_SIZE (1ULL << BDRV_SECTOR_BITS) > > +/* > + * Get the first most significant bit of wp. If it is zero, then > + * the zone type is SWR. > + */ > +#define BDRV_ZT_IS_CONV(wp)(wp & (1ULL << 63)) > + > #define BDRV_REQUEST_MAX_SECTORS MIN_CONST(SIZE_MAX >> BDRV_SECTOR_BITS, \ > INT_MAX >> BDRV_SECTOR_BITS) > #define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS) > diff --git a/include/block/block_int-common.h > b/include/block/block_int-common.h > index 37dddc603c..59c2d1316d 100644 > --- a/include/block/block_int-common.h > +++ b/include/block/block_int-common.h > @@ -857,6 +857,11 @@ typedef struct BlockLimits { > > /* device capacity expressed in bytes */ > int64_t capacity; > + > +/* array of write pointers' location of each zone in the zoned device. */ > +BlockZoneWps *wps; > + > +int64_t write_granularity; > } BlockLimits; > > typedef struct BdrvOpBlocker BdrvOpBlocker; -- Damien Le Moal Western Digital Research

Re: [PATCH v11 7/7] docs/zoned-storage: add zoned device documentation

2022-10-12 Thread Damien Le Moal
n or modify the device permissions accordingly). > > +Zoned block devices > + Zoned block devices can be passed through to the guest if the emulated > storage > + controller supports zoned storage. Use ``--blockdev zoned_host_device, > + node-name=drive0,filename=/dev/nullb0`` to pass through ``/dev/nullb0`` > + as ``drive0``. > + > Windows > ^^^ > With the above nits fixed, feel free to add: Reviewed-by: Damien Le Moal -- Damien Le Moal Western Digital Research

Re: [PATCH v11 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-10-12 Thread Damien Le Moal
et < 0) { > +printf("zone report failed: %s\n", strerror(-ret)); > +} else { > +for (int i = 0; i < nr_zones; ++i) { > +printf("start: 0x%" PRIx64 ", len 0x%" PRIx64 ", " > + "cap"" 0x%" PRIx64 ", wptr 0x%" PRIx64 ", " > + "zcond:%u, [type: %u]\n", > +tosector(zones[i].start), tosector(zones[i].length), > +tosector(zones[i].cap), tosector(zones[i].wp), > +zones[i].cond, zones[i].type); > +} > +} > +return ret; > +} > + > +static const cmdinfo_t zone_report_cmd = { > +.name = "zone_report", > +.altname = "zrp", > +.cfunc = zone_report_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset number", > +.oneline = "report zone information", > +}; > + > +static int zone_open_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len); > +if (ret < 0) { > +printf("zone open failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_open_cmd = { > +.name = "zone_open", > +.altname = "zo", > +.cfunc = zone_open_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "explicit open a range of zones in zone block device", > +}; > + > +static int zone_close_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len); > +if (ret < 0) { > +printf("zone close failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_close_cmd = { > +.name = "zone_close", > +.altname = "zc", > +.cfunc = zone_close_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "close a range of zones in zone block device", > +}; > + > +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len); > +if (ret < 0) { > +printf("zone finish failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_finish_cmd = { > +.name = "zone_finish", > +.altname = "zf", > +.cfunc = zone_finish_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "finish a range of zones in zone block device", > +}; > + > +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len); > +if (ret < 0) { > +printf("zone reset failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_reset_cmd = { > +.name = "zone_reset", > +.altname = "zrs", > +.cfunc = zone_reset_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "reset a zone write pointer in zone block device", > +}; > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2504,6 +2647,11 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&aio_write_cmd); > qemuio_add_command(&aio_flush_cmd); > qemuio_add_command(&flush_cmd); > +qemuio_add_command(&zone_report_cmd); > +qemuio_add_command(&zone_open_cmd); > +qemuio_add_command(&zone_close_cmd); > +qemuio_add_command(&zone_finish_cmd); > +qemuio_add_command(&zone_reset_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); -- Damien Le Moal Western Digital Research

Re: [PATCH v10 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-10-06 Thread Damien Le Moal
On 10/7/22 00:17, Stefan Hajnoczi wrote: > On Tue, Oct 04, 2022 at 08:23:15AM +0900, Damien Le Moal wrote: >> On 2022/10/04 2:47, Stefan Hajnoczi wrote: >>> On Thu, Sep 29, 2022 at 04:36:27PM +0800, Sam Li wrote: >>>> Add a new zoned_host_device BlockDriver

Re: [PATCH v2 1/2] file-posix: add the tracking of the zones wp

2022-10-05 Thread Damien Le Moal
On 10/5/22 17:30, Sam Li wrote: > Damien Le Moal 于2022年10月5日周三 15:34写道: >> >> On 10/5/22 10:44, Damien Le Moal wrote: >>> On 9/29/22 18:31, Sam Li wrote: >>>> Since Linux doesn't have a user API to issue zone append operations to >>>> zo

Re: [PATCH v2 1/2] file-posix: add the tracking of the zones wp

2022-10-05 Thread Damien Le Moal
On 10/5/22 10:44, Damien Le Moal wrote: > On 9/29/22 18:31, Sam Li wrote: >> Since Linux doesn't have a user API to issue zone append operations to >> zoned devices from user space, the file-posix driver is modified to add >> zone append emulation using regular writes.

Re: [PATCH v2 1/2] file-posix: add the tracking of the zones wp

2022-10-04 Thread Damien Le Moal
On 10/5/22 10:44, Damien Le Moal wrote: > On 9/29/22 18:31, Sam Li wrote: >> Since Linux doesn't have a user API to issue zone append operations to >> zoned devices from user space, the file-posix driver is modified to add >> zone append emulation using regular writes.

Re: [PATCH v2 2/2] block: introduce zone append write for zoned devices

2022-10-04 Thread Damien Le Moal
zone_reset_cmd); > +qemuio_add_command(&zone_append_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); > diff --git a/tests/qemu-iotests/tests/zoned.out > b/tests/qemu-iotests/tests/zoned.out > index 0c8f96deb9..b3b139b4ec 100644 > --- a/tests/qemu-iotests/tests/zoned.out > +++ b/tests/qemu-iotests/tests/zoned.out > @@ -50,4 +50,11 @@ start: 0x8, len 0x8, cap 0x8, wptr 0x10, > zcond:14, [type: 2] > (5) resetting the second zone > After resetting a zone: > start: 0x8, len 0x8, cap 0x8, wptr 0x8, zcond:1, [type: 2] > + > + > +(6) append write > +After appending the first zone: > +start: 0x0, len 0x8, cap 0x8, wptr 0x18, zcond:2, [type: 2] > +After appending the second zone: > +start: 0x8, len 0x8, cap 0x8, wptr 0x80018, zcond:2, [type: 2] > *** done > diff --git a/tests/qemu-iotests/tests/zoned.sh > b/tests/qemu-iotests/tests/zoned.sh > index fced0194c5..888711eef2 100755 > --- a/tests/qemu-iotests/tests/zoned.sh > +++ b/tests/qemu-iotests/tests/zoned.sh > @@ -79,6 +79,15 @@ echo "(5) resetting the second zone" > sudo $QEMU_IO $IMG -c "zrs 268435456 268435456" > echo "After resetting a zone:" > sudo $QEMU_IO $IMG -c "zrp 268435456 1" > +echo > +echo > +echo "(6) append write" # physical block size of the device is 4096 > +sudo $QEMU_IO $IMG -c "zap 0 0x1000 0x2000" > +echo "After appending the first zone:" > +sudo $QEMU_IO $IMG -c "zrp 0 1" > +sudo $QEMU_IO $IMG -c "zap 268435456 0x1000 0x2000" > +echo "After appending the second zone:" > +sudo $QEMU_IO $IMG -c "zrp 268435456 1" > > # success, all done > echo "*** done" -- Damien Le Moal Western Digital Research

Re: [PATCH v2 1/2] file-posix: add the tracking of the zones wp

2022-10-04 Thread Damien Le Moal
; 63)) But since this must be used for both SWR and SWP zones, I would reverse this into: #define BDRV_ZONE_IS_CONV(wp) ((wp) & (1ULL << 63)) Which is a lot simpler. > + > #define BDRV_REQUEST_MAX_SECTORS MIN_CONST(SIZE_MAX >> BDRV_SECTOR_BITS, \ > INT_MAX >> BDRV_SECTOR_BITS) > #define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS) > diff --git a/include/block/block_int-common.h > b/include/block/block_int-common.h > index 37dddc603c..59c2d1316d 100644 > --- a/include/block/block_int-common.h > +++ b/include/block/block_int-common.h > @@ -857,6 +857,11 @@ typedef struct BlockLimits { > > /* device capacity expressed in bytes */ > int64_t capacity; > + > +/* array of write pointers' location of each zone in the zoned device. */ > +BlockZoneWps *wps; > + > +int64_t write_granularity; > } BlockLimits; > > typedef struct BdrvOpBlocker BdrvOpBlocker; > diff --git a/include/block/raw-aio.h b/include/block/raw-aio.h > index 3d26929cdd..f13cc1887b 100644 > --- a/include/block/raw-aio.h > +++ b/include/block/raw-aio.h > @@ -31,6 +31,7 @@ > #define QEMU_AIO_TRUNCATE 0x0080 > #define QEMU_AIO_ZONE_REPORT 0x0100 > #define QEMU_AIO_ZONE_MGMT0x0200 > +#define QEMU_AIO_ZONE_APPEND 0x0400 > #define QEMU_AIO_TYPE_MASK \ > (QEMU_AIO_READ | \ > QEMU_AIO_WRITE | \ > @@ -41,7 +42,8 @@ > QEMU_AIO_COPY_RANGE | \ > QEMU_AIO_TRUNCATE | \ > QEMU_AIO_ZONE_REPORT | \ > - QEMU_AIO_ZONE_MGMT) > + QEMU_AIO_ZONE_MGMT | \ > + QEMU_AIO_ZONE_APPEND) This should be introduced in patch 2. This patch should be only about zone wp tracking with regular writes and zone management ops. The second patch can implement zone append emulation thanks to this patch. So separate. > > /* AIO flags */ > #define QEMU_AIO_MISALIGNED 0x1000 -- Damien Le Moal Western Digital Research

Re: [PATCH v10 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-10-03 Thread Damien Le Moal
-write: Since 6.2 >> # @snapshot-access: Since 7.0 >> +# @zoned_host_device: Since 7.2 >> # >> # Since: 2.9 >> ## >> @@ -2955,7 +2956,8 @@ >> 'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', >> 'parallels', >> 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd', >> { 'name': 'replication', 'if': 'CONFIG_REPLICATION' }, >> -'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] } >> +'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', >> +{ 'name': 'zoned_host_device', 'if': 'CONFIG_BLKZONED' } ] } >> >> ## >> # @BlockdevOptionsFile: >> @@ -4329,7 +4331,9 @@ >>'vhdx': 'BlockdevOptionsGenericFormat', >>'vmdk': 'BlockdevOptionsGenericCOWFormat', >>'vpc':'BlockdevOptionsGenericFormat', >> - 'vvfat': 'BlockdevOptionsVVFAT' >> + 'vvfat': 'BlockdevOptionsVVFAT', >> + 'zoned_host_device': { 'type': 'BlockdevOptionsFile', >> + 'if': 'CONFIG_BLKZONED' } >>} } >> >> ## >> diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c >> index 952dc940f1..e56c8d1c30 100644 >> --- a/qemu-io-cmds.c >> +++ b/qemu-io-cmds.c >> @@ -1712,6 +1712,149 @@ static const cmdinfo_t flush_cmd = { >> .oneline= "flush all in-core file state to disk", >> }; >> >> +static inline int64_t tosector(int64_t bytes) { >> +return bytes >> BDRV_SECTOR_BITS; >> +} >> + >> +static int zone_report_f(BlockBackend *blk, int argc, char **argv) >> +{ >> +int ret; >> +int64_t offset; >> +unsigned int nr_zones; >> + >> +++optind; >> +offset = cvtnum(argv[optind]); >> +++optind; >> +nr_zones = cvtnum(argv[optind]); >> + >> +g_autofree BlockZoneDescriptor *zones = NULL; >> +zones = g_new(BlockZoneDescriptor, nr_zones); >> +ret = blk_zone_report(blk, offset, &nr_zones, zones); >> +if (ret < 0) { >> +printf("zone report failed: %s\n", strerror(-ret)); >> +} else { >> +for (int i = 0; i < nr_zones; ++i) { >> +printf("start: 0x%" PRIx64 ", len 0x%" PRIx64 ", " >> + "cap"" 0x%" PRIx64 ", wptr 0x%" PRIx64 ", " >> + "zcond:%u, [type: %u]\n", >> +tosector(zones[i].start), tosector(zones[i].length), >> +tosector(zones[i].cap), tosector(zones[i].wp), >> +zones[i].cond, zones[i].type); >> +} >> +} >> +return ret; >> +} >> + >> +static const cmdinfo_t zone_report_cmd = { >> +.name = "zone_report", >> +.altname = "zrp", >> +.cfunc = zone_report_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset number", >> +.oneline = "report zone information", >> +}; >> + >> +static int zone_open_f(BlockBackend *blk, int argc, char **argv) >> +{ >> +int ret; >> +int64_t offset, len; >> +++optind; >> +offset = cvtnum(argv[optind]); >> +++optind; >> +len = cvtnum(argv[optind]); >> +ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len); >> +if (ret < 0) { >> +printf("zone open failed: %s\n", strerror(-ret)); >> +} >> +return ret; >> +} >> + >> +static const cmdinfo_t zone_open_cmd = { >> +.name = "zone_open", >> +.altname = "zo", >> +.cfunc = zone_open_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset len", >> +.oneline = "explicit open a range of zones in zone block device", >> +}; >> + >> +static int zone_close_f(BlockBackend *blk, int argc, char **argv) >> +{ >> +int ret; >> +int64_t offset, len; >> +++optind; >> +offset = cvtnum(argv[optind]); >> +++optind; >> +len = cvtnum(argv[optind]); >> +ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len); >> +if (ret < 0) { >> +printf("zone close failed: %s\n", strerror(-ret)); >> +} >> +return ret; >> +} >> + >> +static const cmdinfo_t zone_close_cmd = { >> +.name = "zone_close", >> +.altname = "zc", >> +.cfunc = zone_close_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset len", >> +.oneline = "close a range of zones in zone block device", >> +}; >> + >> +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) >> +{ >> +int ret; >> +int64_t offset, len; >> +++optind; >> +offset = cvtnum(argv[optind]); >> +++optind; >> +len = cvtnum(argv[optind]); >> +ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len); >> +if (ret < 0) { >> +printf("zone finish failed: %s\n", strerror(-ret)); >> +} >> +return ret; >> +} >> + >> +static const cmdinfo_t zone_finish_cmd = { >> +.name = "zone_finish", >> +.altname = "zf", >> +.cfunc = zone_finish_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset len", >> +.oneline = "finish a range of zones in zone block device", >> +}; >> + >> +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) >> +{ >> +int ret; >> +int64_t offset, len; >> +++optind; >> +offset = cvtnum(argv[optind]); >> +++optind; >> +len = cvtnum(argv[optind]); >> +ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len); >> +if (ret < 0) { >> +printf("zone reset failed: %s\n", strerror(-ret)); >> +} >> +return ret; >> +} >> + >> +static const cmdinfo_t zone_reset_cmd = { >> +.name = "zone_reset", >> +.altname = "zrs", >> +.cfunc = zone_reset_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset len", >> +.oneline = "reset a zone write pointer in zone block device", >> +}; >> + >> static int truncate_f(BlockBackend *blk, int argc, char **argv); >> static const cmdinfo_t truncate_cmd = { >> .name = "truncate", >> @@ -2504,6 +2647,11 @@ static void __attribute((constructor)) >> init_qemuio_commands(void) >> qemuio_add_command(&aio_write_cmd); >> qemuio_add_command(&aio_flush_cmd); >> qemuio_add_command(&flush_cmd); >> +qemuio_add_command(&zone_report_cmd); >> +qemuio_add_command(&zone_open_cmd); >> +qemuio_add_command(&zone_close_cmd); >> +qemuio_add_command(&zone_finish_cmd); >> +qemuio_add_command(&zone_reset_cmd); >> qemuio_add_command(&truncate_cmd); >> qemuio_add_command(&length_cmd); >> qemuio_add_command(&info_cmd); >> -- >> 2.37.3 >> -- Damien Le Moal Western Digital Research

Re: [PATCH 2/2] virtio-blk: add zoned storage emulation for zoned devices

2022-09-28 Thread Damien Le Moal
>> virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY); >>> virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY); >>> virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE); >>> +if (state->bl.zoned != BLK_Z_NONE) { >>> +virtio_add_feature(&s->host_features, VIRTIO_BLK_F_ZONED); >>> +if (state->bl.zoned == BLK_Z_HM) { >>> +virtio_clear_feature(&features, VIRTIO_BLK_F_DISCARD); >> >> Why is features modified here but s->host_features is modified in the >> line above? > > Because F_DISCARD should not be offered by the driver when the device > offers F_ZONED with the HM model. > >> >>> +} >>> +} >> >> This detects VIRTIO_BLK_F_ZONED based on the BlockDriverState... > (This part can be removed.) > >> >>> if (virtio_has_feature(features, VIRTIO_F_VERSION_1)) { >>> if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_SCSI)) { >>> error_setg(errp, "Please set scsi=off for virtio-blk devices >>> in order to use virtio 1.0"); >>> @@ -1286,6 +1609,9 @@ static Property virtio_blk_properties[] = { >>> #ifdef __linux__ >>> DEFINE_PROP_BIT64("scsi", VirtIOBlock, host_features, >>>VIRTIO_BLK_F_SCSI, false), >>> +#endif >>> +#ifdef CONFIG_BLKZONED >>> +DEFINE_PROP_BIT64("zoned", VirtIOBlock, host_features, >>> VIRTIO_BLK_F_ZONED, true), >>> #endif >> >> ...but this allows users to enable/disable the flag from the >> command-line. >> >> I think DEFINE_PROP_BIT64() can be removed, but please check that the >> config space size is still correct. It may be necessary to move the >> bs->bl.zoned check into virtio_blk_device_realize() and set >> the VIRTIO_BLK_F_ZONED s->host_features bit there instead. That will >> allow the s->config_size calculation to work correctly. > > Ok, it works. Thanks! > > Additionally, the DEFINE_PROP_BIT here is to declare that the supports > zones. Then the virtio-blk driver will not need to look at that > feature. So the former part detecting the F_ZONED feature based on > BlockDriverState can be removed. If removing this macro, then the > virio-blk driver must set the F_ZONED feature by itself. -- Damien Le Moal Western Digital Research

Re: [PATCH v9 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-09-20 Thread Damien Le Moal
ath. But given that zone management operations are not performance critical, generalizing the code should be fine. However, once we start adding the code for full zone emulation on top of a regular file or qcow image, sector-to-zone conversions requiring divisions will hurt. So I really would prefer the code be left as-is for now. -- Damien Le Moal Western Digital Research

Re: [PATCH v9 1/7] include: add zoned device structs

2022-09-19 Thread Damien Le Moal
t;>>> Signed-off-by: Sam Li >>>>> Reviewed-by: Stefan Hajnoczi >>>>> Reviewed-by: Damien Le Moal >>>>> --- >>>>> include/block/block-common.h | 43 >>>>> 1 file changed, 43 in

Re: [PATCH] block: use the request length for iov alignment

2022-09-13 Thread Damien Le Moal
On 2022/09/13 15:51, Keith Busch wrote: > On Tue, Sep 13, 2022 at 03:20:23PM +0100, Damien Le Moal wrote: >> On 2022/09/13 15:12, Keith Busch wrote: >>> On Thu, Sep 08, 2022 at 09:45:26AM -0700, Keith Busch wrote: >>>> From: Keith Busch >>>> >>>

Re: [PATCH] block: use the request length for iov alignment

2022-09-13 Thread Damien Le Moal
On 2022/09/13 15:36, Keith Busch wrote: > On Tue, Sep 13, 2022 at 03:20:23PM +0100, Damien Le Moal wrote: >> On 2022/09/13 15:12, Keith Busch wrote: >>> On Thu, Sep 08, 2022 at 09:45:26AM -0700, Keith Busch wrote: >>>> From: Keith Busch >>>> >>>

Re: [PATCH] block: use the request length for iov alignment

2022-09-13 Thread Damien Le Moal
t; +size_t len = bs->bl.request_alignment; >> IO_CODE(); >> >> for (i = 0; i < qiov->niov; i++) { >> if ((uintptr_t) qiov->iov[i].iov_base % alignment) { >> return false; >> } >> -if (qiov->iov[i].iov_len % alignment) { >> +if (qiov->iov[i].iov_len % len) { >> return false; >> } >> } >> -- >> 2.30.2 >> -- Damien Le Moal Western Digital Research

Re: [PATCH] block: introduce zone append write for zoned devices

2022-09-11 Thread Damien Le Moal
removable device specific */ >>> bool (*bdrv_is_inserted)(BlockDriverState *bs); >>> @@ -854,6 +857,12 @@ typedef struct BlockLimits { >>> >>> /* maximum number of active zones */ >>> int64_t max_active_zones; >>> + >>> +/* array of zones in the zoned block device. Only tracks write >>> pointer's >>> + * location of each zone as a helper for zone_append API */ >>> +BlockZoneDescriptor *zones; >> >> This is a lot of memory for only tracking the wp... Why not reduce this to an >> array of int64 values, for the wp only ? As you may need the zone type too >> (conventional vs sequential), you can use the most significant bit (a zone wp >> value will never use all 64 bits !). >> >> Or device another zone structure with zone type, zone wp and mutex only, so >> smaller than the regular zone report structure. > > I was just trying to reuse do_zone_report. It is better to only track > the wp only. Then a new struct and smaller zone_report should be > introduced for it. Yes, this will use less memory, which is always good. -- Damien Le Moal Western Digital Research

Re: [PATCH v9 5/7] config: add check to block layer

2022-09-11 Thread Damien Le Moal
On 2022/09/11 15:54, Sam Li wrote: > Damien Le Moal 于2022年9月11日周日 13:34写道: >> >> On 2022/09/10 14:27, Sam Li wrote: >>> Putting zoned/non-zoned BlockDrivers on top of each other is not >>> allowed. >>> >>> Signed-off-by: Sam Li >

Re: [PATCH v9 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-09-11 Thread Damien Le Moal
} > +} > +return ret; > +} > + > +static const cmdinfo_t zone_report_cmd = { > +.name = "zone_report", > +.altname = "zrp", > +.cfunc = zone_report_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset number", > +.oneline = "report zone information", > +}; > + > +static int zone_open_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len); > +if (ret < 0) { > +printf("zone open failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_open_cmd = { > +.name = "zone_open", > +.altname = "zo", > +.cfunc = zone_open_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "explicit open a range of zones in zone block device", > +}; > + > +static int zone_close_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len); > +if (ret < 0) { > +printf("zone close failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_close_cmd = { > +.name = "zone_close", > +.altname = "zc", > +.cfunc = zone_close_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "close a range of zones in zone block device", > +}; > + > +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len); > +if (ret < 0) { > +printf("zone finish failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_finish_cmd = { > +.name = "zone_finish", > +.altname = "zf", > +.cfunc = zone_finish_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "finish a range of zones in zone block device", > +}; > + > +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len); > +if (ret < 0) { > +printf("zone reset failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_reset_cmd = { > +.name = "zone_reset", > +.altname = "zrs", > +.cfunc = zone_reset_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "reset a zone write pointer in zone block device", > +}; > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2504,6 +2642,11 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&aio_write_cmd); > qemuio_add_command(&aio_flush_cmd); > qemuio_add_command(&flush_cmd); > +qemuio_add_command(&zone_report_cmd); > +qemuio_add_command(&zone_open_cmd); > +qemuio_add_command(&zone_close_cmd); > +qemuio_add_command(&zone_finish_cmd); > +qemuio_add_command(&zone_reset_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); -- Damien Le Moal Western Digital Research

Re: [PATCH] block: introduce zone append write for zoned devices

2022-09-10 Thread Damien Le Moal
On 2022/09/11 15:06, Damien Le Moal wrote: > On 2022/09/10 15:38, Sam Li wrote: >> A zone append command is a write operation that specifies the first >> logical block of a zone as the write position. When writing to a zoned >> block device using zone append, the byte o

Re: [PATCH v9 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-09-10 Thread Damien Le Moal
On 2022/09/11 15:33, Sam Li wrote: > Damien Le Moal 于2022年9月11日周日 13:31写道: [...] >>> +/* >>> + * zone management operations - Execute an operation on a zone >>> + */ >>> +static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp >>&

Re: [PATCH] block: introduce zone append write for zoned devices

2022-09-10 Thread Damien Le Moal
+optind++; > +nr_iov = argc - optind; > +buf = create_iovec(blk, &qiov, &argv[optind], nr_iov, pattern); > +if (buf == NULL) { > +return -EINVAL; > +} > +ret = do_aio_zone_append(blk, &qiov, &offset, flags, &total); > +if (ret < 0) { > +printf("zone append failed: %s\n", strerror(-ret)); > +goto out; > +} > + > +out: > +qemu_iovec_destroy(&qiov); > +qemu_io_free(buf); > +return ret; > +} > + > +static const cmdinfo_t zone_append_cmd = { > +.name = "zone_append", > +.altname = "zap", > +.cfunc = zone_append_f, > +.argmin = 3, > +.argmax = 3, > +.args = "offset len [len..]", > +.oneline = "append write a number of bytes at a specified offset", > +}; > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2647,6 +2708,7 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&zone_close_cmd); > qemuio_add_command(&zone_finish_cmd); > qemuio_add_command(&zone_reset_cmd); > +qemuio_add_command(&zone_append_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); > diff --git a/tests/qemu-iotests/tests/zoned.out > b/tests/qemu-iotests/tests/zoned.out > index 0c8f96deb9..b3b139b4ec 100644 > --- a/tests/qemu-iotests/tests/zoned.out > +++ b/tests/qemu-iotests/tests/zoned.out > @@ -50,4 +50,11 @@ start: 0x8, len 0x8, cap 0x8, wptr 0x10, > zcond:14, [type: 2] > (5) resetting the second zone > After resetting a zone: > start: 0x8, len 0x8, cap 0x8, wptr 0x8, zcond:1, [type: 2] > + > + > +(6) append write > +After appending the first zone: > +start: 0x0, len 0x8, cap 0x8, wptr 0x18, zcond:2, [type: 2] > +After appending the second zone: > +start: 0x8, len 0x8, cap 0x8, wptr 0x80018, zcond:2, [type: 2] > *** done > diff --git a/tests/qemu-iotests/tests/zoned.sh > b/tests/qemu-iotests/tests/zoned.sh > index 871f47efed..b4dc89aaac 100755 > --- a/tests/qemu-iotests/tests/zoned.sh > +++ b/tests/qemu-iotests/tests/zoned.sh > @@ -79,6 +79,15 @@ echo "(5) resetting the second zone" > sudo $QEMU_IO $IMG -c "zrs 268435456 268435456" > echo "After resetting a zone:" > sudo $QEMU_IO $IMG -c "zrp 268435456 1" > +echo > +echo > +echo "(6) append write" # assuming block size of device is 4096 > +sudo $QEMU_IO $IMG -c "zap 0 0x1000 0x2000" > +echo "After appending the first zone:" > +sudo $QEMU_IO $IMG -c "zrp 0 1" > +sudo $QEMU_IO $IMG -c "zap 268435456 0x1000 0x2000" > +echo "After appending the second zone:" > +sudo $QEMU_IO $IMG -c "zrp 268435456 1" > # success, all done > echo "*** done" > rm -f $seq.full -- Damien Le Moal Western Digital Research

Re: [PATCH v9 7/7] docs/zoned-storage: add zoned device documentation

2022-09-10 Thread Damien Le Moal
tion or modify the device permissions accordingly). > > +Zoned block devices > + Zoned block devices can be passed through to the guest if the emulated > storage > + controller supports zoned storage. Use ``--blockdev zoned_host_device, > + node-name=drive0,filename=/dev/nullb0`` to pass through ``/dev/nullb0`` > + as ``drive0``. > + > Windows > ^^^ > -- Damien Le Moal Western Digital Research

Re: [PATCH v9 4/7] raw-format: add zone operations to pass through requests

2022-09-10 Thread Damien Le Moal
On 2022/09/10 14:27, Sam Li wrote: > raw-format driver usually sits on top of file-posix driver. It needs to > pass through requests of zone commands. > > Signed-off-by: Sam Li > Reviewed-by: Stefan Hajnoczi Reviewed-by: Damien Le Moal > --- > block/raw-format.c | 13 ++

Re: [PATCH v9 5/7] config: add check to block layer

2022-09-10 Thread Damien Le Moal
078ddd7e67..043aa161a0 100644 > --- a/include/block/block_int-common.h > +++ b/include/block/block_int-common.h > @@ -127,6 +127,11 @@ struct BlockDriver { > */ > bool is_format; > > +/* > + * Set to true if the BlockDriver supports zoned children. > + */ > +bool supports_zoned_children; > + > /* > * Drivers not implementing bdrv_parse_filename nor bdrv_open should have > * this field set to true, except ones that are defined only by their -- Damien Le Moal Western Digital Research

Re: [PATCH v9 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-09-10 Thread Damien Le Moal
zone_report_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset number", > +.oneline = "report zone information", > +}; > + > +static int zone_open_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len); > +if (ret < 0) { > +printf("zone open failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_open_cmd = { > +.name = "zone_open", > +.altname = "zo", > +.cfunc = zone_open_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "explicit open a range of zones in zone block device", > +}; > + > +static int zone_close_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len); > +if (ret < 0) { > +printf("zone close failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_close_cmd = { > +.name = "zone_close", > +.altname = "zc", > +.cfunc = zone_close_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "close a range of zones in zone block device", > +}; > + > +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len); > +if (ret < 0) { > +printf("zone finish failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_finish_cmd = { > +.name = "zone_finish", > +.altname = "zf", > +.cfunc = zone_finish_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "finish a range of zones in zone block device", > +}; > + > +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len); > +if (ret < 0) { > +printf("zone reset failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_reset_cmd = { > +.name = "zone_reset", > +.altname = "zrs", > +.cfunc = zone_reset_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "reset a zone write pointer in zone block device", > +}; > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2504,6 +2642,11 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&aio_write_cmd); > qemuio_add_command(&aio_flush_cmd); > qemuio_add_command(&flush_cmd); > +qemuio_add_command(&zone_report_cmd); > +qemuio_add_command(&zone_open_cmd); > +qemuio_add_command(&zone_close_cmd); > +qemuio_add_command(&zone_finish_cmd); > +qemuio_add_command(&zone_reset_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); -- Damien Le Moal Western Digital Research

Re: [PATCH v9 2/7] file-posix: introduce helper functions for sysfs attributes

2022-09-10 Thread Damien Le Moal
t; Signed-off-by: Sam Li > Reviewed-by: Hannes Reinecke > Reviewed-by: Stefan Hajnoczi Looks good to me. Reviewed-by: Damien Le Moal > --- > block/file-posix.c | 121 ++- > include/block/block_int-common.h | 3 + > 2 fil

Re: [PATCH v8 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-09-01 Thread Damien Le Moal
uest can boot >>>> but has no emulated block device. In some cases, QEMU just terminates >>>> because the block device has not met the alignment requirements. >>> >>> I'm not sure I understand all of this. I'm also not sure I have to :) >> >> Maybe I didn't explain it very well. Which part would you like to know >> more about? > > Let's try more specific questions. Say I configure a zoned_host_device > backed by a host device that isn't zoned. > > 1. Is this configuration accepted? If we assume we have the special zoned_host_device driver, with the associated command line zoned_host_device option explicitly calling for it, then no, I do not think this should be allowed at all and an error should be returned on startup. That would be consistent with the fact that the options zoned_host_device and host_device are different to make sure we can check that the user knows what he/she/them is doing. If we have only host_device as a setup option and driver, the driver methods can be trivially adjusted to do the right thing based on the device type (i.e. zoned vs regular/not zoned). However, that would prevent an interesting future extension of this work to implement a full zone emulation on top of a regular (not zoned) host block device. With this in mind, we currently have the following: 1) host_device option -> accept only regular non-zoned host block devices 2) zoned_host_device option -> accept only zoned host block devices And in the future, we can have: 1) host_device option -> accept only regular non-zoned host block devices 2) zoned_host_device option -> accept any host block device type a) Use native zone kernel API for zoned host block devices b) Use full zone emulation for regular host block devices But sure, internally, we could have a single driver structure with methods adjusted to do the correct thing based on the device type and option specified. Having a 1:1 mapping between the driver name and driver structure does clarify things I think (even though there are indeed a lot of methods that are identical). > > 2. Would a guest work as long as it doesn't touch this device? > > 3. Would a guest using this device work as long as it uses no zoned >features? > > 4. What happens when a guest tries to use zoned features? > > [...] > -- Damien Le Moal Western Digital Research

Re: [PATCH v7 4/8] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-08-24 Thread Damien Le Moal
On 2022/08/24 16:46, Damien Le Moal wrote: > On 2022/08/22 21:12, Sam Li wrote: >> Stefan Hajnoczi 于2022年8月23日周二 08:49写道: >>> >>> On Tue, Aug 16, 2022 at 02:25:18PM +0800, Sam Li wrote: [...]>>>> +blkz = (struct blk_zone *)(rep + 1); >>>>

Re: [PATCH v7 4/8] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-08-24 Thread Damien Le Moal
_fn blk_co_pdiscard(BlockBackend *blk, int64_t offset, >>> diff --git a/meson.build b/meson.build >>> index 294e9a8f32..c3219b0e87 100644 >>> --- a/meson.build >>> +++ b/meson.build >>> @@ -1883,6 +1883,7 @@ config_host_data.set('CONFIG_REPLICATION', >>> get_option('live_block_migration').al >>> # has_header >>> config_host_data.set('CONFIG_EPOLL', cc.has_header('sys/epoll.h')) >>> config_host_data.set('CONFIG_LINUX_MAGIC_H', >>> cc.has_header('linux/magic.h')) >>> +config_host_data.set('CONFIG_BLKZONED', cc.has_header('linux/blkzoned.h')) >>> config_host_data.set('CONFIG_VALGRIND_H', >>> cc.has_header('valgrind/valgrind.h')) >>> config_host_data.set('HAVE_BTRFS_H', cc.has_header('linux/btrfs.h')) >>> config_host_data.set('HAVE_DRM_H', cc.has_header('libdrm/drm.h')) >>> diff --git a/qapi/block-core.json b/qapi/block-core.json >>> index 2173e7734a..c6bbb7a037 100644 >>> --- a/qapi/block-core.json >>> +++ b/qapi/block-core.json >>> @@ -2942,6 +2942,7 @@ >>> # @compress: Since 5.0 >>> # @copy-before-write: Since 6.2 >>> # @snapshot-access: Since 7.0 >>> +# @zoned_host_device: Since 7.2 >>> # >>> # Since: 2.9 >>> ## >>> @@ -2955,7 +2956,8 @@ >>> 'luks', 'nbd', 'nfs', 'null-aio', 'null-co', 'nvme', >>> 'parallels', >>> 'preallocate', 'qcow', 'qcow2', 'qed', 'quorum', 'raw', 'rbd', >>> { 'name': 'replication', 'if': 'CONFIG_REPLICATION' }, >>> -'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat' ] } >>> +'ssh', 'throttle', 'vdi', 'vhdx', 'vmdk', 'vpc', 'vvfat', >>> +{ 'name': 'zoned_host_device', 'if': 'CONFIG_BLKZONED' } ] } >>> >>> ## >>> # @BlockdevOptionsFile: >>> @@ -4329,7 +4331,9 @@ >>>'vhdx': 'BlockdevOptionsGenericFormat', >>>'vmdk': 'BlockdevOptionsGenericCOWFormat', >>>'vpc':'BlockdevOptionsGenericFormat', >>> - 'vvfat': 'BlockdevOptionsVVFAT' >>> + 'vvfat': 'BlockdevOptionsVVFAT', >>> + 'zoned_host_device': { 'type': 'BlockdevOptionsFile', >>> + 'if': 'CONFIG_BLKZONED' } >>>} } >>> >>> ## >>> diff --git a/qemu-io-cmds.c b/qemu-io-cmds.c >>> index 952dc940f1..687c3a624c 100644 >>> --- a/qemu-io-cmds.c >>> +++ b/qemu-io-cmds.c >>> @@ -1712,6 +1712,144 @@ static const cmdinfo_t flush_cmd = { >>> .oneline= "flush all in-core file state to disk", >>> }; >>> >>> +static int zone_report_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +int ret; >>> +int64_t offset; >>> +unsigned int nr_zones; >>> + >>> +++optind; >>> +offset = cvtnum(argv[optind]); >>> +++optind; >>> +nr_zones = cvtnum(argv[optind]); >>> + >>> +g_autofree BlockZoneDescriptor *zones = NULL; >>> +zones = g_new(BlockZoneDescriptor, nr_zones); >>> +ret = blk_zone_report(blk, offset, &nr_zones, zones); >>> +if (ret < 0) { >>> +printf("zone report failed: %s\n", strerror(-ret)); >>> +} else { >>> +for (int i = 0; i < nr_zones; ++i) { >>> +printf("start: 0x%" PRIx64 ", len 0x%" PRIx64 ", " >>> + "cap"" 0x%" PRIx64 ",wptr 0x%" PRIx64 ", " >>> + "zcond:%u, [type: %u]\n", >>> + zones[i].start, zones[i].length, zones[i].cap, >>> zones[i].wp, >>> + zones[i].cond, zones[i].type); >>> +} >>> +} >>> +return ret; >>> +} >>> + >>> +static const cmdinfo_t zone_report_cmd = { >>> +.name = "zone_report", >>> +.altname = "zrp", >>> +.cfunc = zone_report_f, >>> +.argmin = 2, >>> +.argmax = 2, >>> +.args = "offset number", >>> +.oneline = "report zone information", >>> +}; >>> + >>> +static int zone_open_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +int ret; >>> +int64_t offset, len; >>> +++optind; >>> +offset = cvtnum(argv[optind]); >>> +++optind; >>> +len = cvtnum(argv[optind]); >>> +ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len); >>> +if (ret < 0) { >>> +printf("zone open failed: %s\n", strerror(-ret)); >>> +} >>> +return ret; >>> +} >>> + >>> +static const cmdinfo_t zone_open_cmd = { >>> +.name = "zone_open", >>> +.altname = "zo", >>> +.cfunc = zone_open_f, >>> +.argmin = 2, >>> +.argmax = 2, >>> +.args = "offset len", >>> +.oneline = "explicit open a range of zones in zone block device", >>> +}; >>> + >>> +static int zone_close_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +int ret; >>> +int64_t offset, len; >>> +++optind; >>> +offset = cvtnum(argv[optind]); >>> +++optind; >>> +len = cvtnum(argv[optind]); >>> +ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len); >>> +if (ret < 0) { >>> +printf("zone close failed: %s\n", strerror(-ret)); >>> +} >>> +return ret; >>> +} >>> + >>> +static const cmdinfo_t zone_close_cmd = { >>> +.name = "zone_close", >>> +.altname = "zc", >>> +.cfunc = zone_close_f, >>> +.argmin = 2, >>> +.argmax = 2, >>> +.args = "offset len", >>> +.oneline = "close a range of zones in zone block device", >>> +}; >>> + >>> +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +int ret; >>> +int64_t offset, len; >>> +++optind; >>> +offset = cvtnum(argv[optind]); >>> +++optind; >>> +len = cvtnum(argv[optind]); >>> +ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len); >>> +if (ret < 0) { >>> +printf("zone finish failed: %s\n", strerror(-ret)); >>> +} >>> +return ret; >>> +} >>> + >>> +static const cmdinfo_t zone_finish_cmd = { >>> +.name = "zone_finish", >>> +.altname = "zf", >>> +.cfunc = zone_finish_f, >>> +.argmin = 2, >>> +.argmax = 2, >>> +.args = "offset len", >>> +.oneline = "finish a range of zones in zone block device", >>> +}; >>> + >>> +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +int ret; >>> +int64_t offset, len; >>> +++optind; >>> +offset = cvtnum(argv[optind]); >>> +++optind; >>> +len = cvtnum(argv[optind]); >>> +ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len); >>> +if (ret < 0) { >>> +printf("zone reset failed: %s\n", strerror(-ret)); >>> +} >>> +return ret; >>> +} >>> + >>> +static const cmdinfo_t zone_reset_cmd = { >>> +.name = "zone_reset", >>> +.altname = "zrs", >>> +.cfunc = zone_reset_f, >>> +.argmin = 2, >>> +.argmax = 2, >>> +.args = "offset len", >>> +.oneline = "reset a zone write pointer in zone block device", >>> +}; >>> + >>> static int truncate_f(BlockBackend *blk, int argc, char **argv); >>> static const cmdinfo_t truncate_cmd = { >>> .name = "truncate", >>> @@ -2504,6 +2642,11 @@ static void __attribute((constructor)) >>> init_qemuio_commands(void) >>> qemuio_add_command(&aio_write_cmd); >>> qemuio_add_command(&aio_flush_cmd); >>> qemuio_add_command(&flush_cmd); >>> +qemuio_add_command(&zone_report_cmd); >>> +qemuio_add_command(&zone_open_cmd); >>> +qemuio_add_command(&zone_close_cmd); >>> +qemuio_add_command(&zone_finish_cmd); >>> +qemuio_add_command(&zone_reset_cmd); >>> qemuio_add_command(&truncate_cmd); >>> qemuio_add_command(&length_cmd); >>> qemuio_add_command(&info_cmd); >>> -- >>> 2.37.1 >>> -- Damien Le Moal Western Digital Research

Re: [PATCH v7 3/8] file-posix: introduce get_sysfs_long_val for the long sysfs attribute

2022-08-16 Thread Damien Le Moal
On 2022/08/16 10:53, Sam Li wrote: > Damien Le Moal 于2022年8月17日周三 01:35写道: >> >> On 2022/08/15 23:25, Sam Li wrote: >>> Use sysfs attribute files to get the long value of zoned device >>> information. >>> >>> Signed-off-by: Sam Li >&g

Re: [PATCH v7 3/8] file-posix: introduce get_sysfs_long_val for the long sysfs attribute

2022-08-16 Thread Damien Le Moal
with '\n', pass 'end' to accept that. */ > +ret = qemu_strtol(str, &end, 10, &val); > +if (ret == 0 && end && *end == '\n') { > +ret = val; > +} > +return ret; > +#else > +return -ENOTSUP; > +#endif > +} > + > static int hdev_get_max_segments(int fd, struct stat *st) { > int ret; > if (S_ISCHR(st->st_mode)) { -- Damien Le Moal Western Digital Research

Re: [PATCH v7 4/8] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-08-16 Thread Damien Le Moal
+} > +} > +return ret; > +} > + > +static const cmdinfo_t zone_report_cmd = { > +.name = "zone_report", > +.altname = "zrp", > +.cfunc = zone_report_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset number", > +.oneline = "report zone information", > +}; > + > +static int zone_open_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_OPEN, offset, len); > +if (ret < 0) { > +printf("zone open failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_open_cmd = { > +.name = "zone_open", > +.altname = "zo", > +.cfunc = zone_open_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "explicit open a range of zones in zone block device", > +}; > + > +static int zone_close_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_CLOSE, offset, len); > +if (ret < 0) { > +printf("zone close failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_close_cmd = { > +.name = "zone_close", > +.altname = "zc", > +.cfunc = zone_close_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "close a range of zones in zone block device", > +}; > + > +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_FINISH, offset, len); > +if (ret < 0) { > +printf("zone finish failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_finish_cmd = { > +.name = "zone_finish", > +.altname = "zf", > +.cfunc = zone_finish_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "finish a range of zones in zone block device", > +}; > + > +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) > +{ > +int ret; > +int64_t offset, len; > +++optind; > +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, BLK_ZO_RESET, offset, len); > +if (ret < 0) { > +printf("zone reset failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_reset_cmd = { > +.name = "zone_reset", > +.altname = "zrs", > +.cfunc = zone_reset_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "reset a zone write pointer in zone block device", > +}; > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2504,6 +2642,11 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&aio_write_cmd); > qemuio_add_command(&aio_flush_cmd); > qemuio_add_command(&flush_cmd); > +qemuio_add_command(&zone_report_cmd); > +qemuio_add_command(&zone_open_cmd); > +qemuio_add_command(&zone_close_cmd); > +qemuio_add_command(&zone_finish_cmd); > +qemuio_add_command(&zone_reset_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); -- Damien Le Moal Western Digital Research

Re: [PATCH v7 2/8] file-posix: introduce get_sysfs_str_val for device zoned model

2022-08-16 Thread Damien Le Moal
raw_refresh_limits(BlockDriverState *bs, > Error **errp) > bs->bl.max_hw_iov = ret; > } > } > + > +ret = get_sysfs_zoned_model(s->fd, &st, &zoned); > + if (ret < 0) { > +zoned = BLK_Z_NONE; > +} > +bs->bl.zoned = zoned; > } > > static int check_for_dasd(int fd) > diff --git a/include/block/block_int-common.h > b/include/block/block_int-common.h > index 8947abab76..7f7863cc9e 100644 > --- a/include/block/block_int-common.h > +++ b/include/block/block_int-common.h > @@ -825,6 +825,9 @@ typedef struct BlockLimits { > > /* maximum number of iovec elements */ > int max_iov; > + > +/* device zone model */ > +BlockZoneModel zoned; > } BlockLimits; > > typedef struct BdrvOpBlocker BdrvOpBlocker; -- Damien Le Moal Western Digital Research

Re: [PATCH v7 1/8] include: add zoned device structs

2022-08-16 Thread Damien Le Moal
On 2022/08/15 23:25, Sam Li wrote: > Signed-off-by: Sam Li > Reviewed-by: Stefan Hajnoczi Looks good. Reviewed-by: Damien Le Moal > --- > include/block/block-common.h | 43 > 1 file changed, 43 insertions(+) > > diff --git a/include

Re: [RFC v4 2/9] qemu-io: add zoned block device operations.

2022-07-27 Thread Damien Le Moal
printf("start: 0x%" PRIx64 ", len 0x%" PRIx64 ", " >> + "cap"" 0x%" PRIx64 ",wptr 0x%" PRIx64 ", " >> + "zcond:%u, [type: %u]\n", >> + zones[i].start, zones[i].length, zones[i].cap, >> zones[i].wp, >> + zones[i].cond, zones[i].type); >> +} >> +} >> +return ret; >> +} >> + >> +static const cmdinfo_t zone_report_cmd = { >> +.name = "zone_report", >> +.altname = "zp", >> +.cfunc = zone_report_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset number", >> +.oneline = "report zone information", >> +}; >> + >> +static int zone_open_f(BlockBackend *blk, int argc, char **argv) >> +{ >> +int ret; >> +int64_t offset, len; >> +++optind; >> +offset = cvtnum(argv[optind]); >> +++optind; >> +len = cvtnum(argv[optind]); >> +ret = blk_zone_mgmt(blk, zone_open, offset, len); > > Please name enum constants in all caps: BDRV_ZONE_OPEN or BDRV_ZO_OPEN. -- Damien Le Moal Western Digital Research

Re: [RFC v4 1/9] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls.

2022-07-12 Thread Damien Le Moal
> >> @@ -49,6 +48,48 @@ typedef struct BlockDriver BlockDriver; >> typedef struct BdrvChild BdrvChild; >> typedef struct BdrvChildClass BdrvChildClass; >> >> +typedef enum zone_op { >> +zone_open, >> +zone_close, >> +zone_finish, >> +zone_reset, >> +} zone_op; > > QEMU coding style: > > typedef enum { > BLK_ZO_OPEN, > BLK_ZO_CLOSE, > BLK_ZO_FINISH, > BLK_ZO_RESET, > } BlockZoneOp; > > Please also reformat the enums below. > >> + >> +typedef enum zone_model { >> +BLK_Z_HM, >> +BLK_Z_HA, >> +} zone_model; >> + >> +typedef enum BlkZoneCondition { >> +BLK_ZS_NOT_WP = 0x0, >> +BLK_ZS_EMPTY = 0x1, >> +BLK_ZS_IOPEN = 0x2, >> +BLK_ZS_EOPEN = 0x3, >> +BLK_ZS_CLOSED = 0x4, >> +BLK_ZS_RDONLY = 0xD, >> +BLK_ZS_FULL = 0xE, >> +BLK_ZS_OFFLINE = 0xF, >> +} BlkZoneCondition; >> + >> +typedef enum BlkZoneType { >> +BLK_ZT_CONV = 0x1, >> +BLK_ZT_SWR = 0x2, >> +BLK_ZT_SWP = 0x3, >> +} BlkZoneType; >> + >> +/* >> + * Zone descriptor data structure. >> + * Provide information on a zone with all position and size values in bytes. >> + */ >> +typedef struct BlockZoneDescriptor { >> +uint64_t start; >> +uint64_t length; >> +uint64_t cap; >> +uint64_t wp; >> +BlkZoneType type; >> +BlkZoneCondition cond; >> +} BlockZoneDescriptor; >> + >> typedef struct BlockDriverInfo { >> /* in bytes, 0 if irrelevant */ >> int cluster_size; >> diff --git a/include/block/block_int-common.h >> b/include/block/block_int-common.h >> index 8947abab76..6037871089 100644 >> --- a/include/block/block_int-common.h >> +++ b/include/block/block_int-common.h >> @@ -94,6 +94,20 @@ typedef struct BdrvTrackedRequest { >> struct BdrvTrackedRequest *waiting_for; >> } BdrvTrackedRequest; >> >> +/** >> + * Zone device information data structure. >> + * Provide information on a device. >> + */ >> +typedef struct zbd_dev { >> +uint32_t zone_size; >> +zone_model model; >> +uint32_t block_size; >> +uint32_t write_granularity; >> +uint32_t nr_zones; >> +struct BlockZoneDescriptor *zones; /* array of zones */ >> +uint32_t max_nr_open_zones; /* maximum number of explicitly open zones >> */ >> +uint32_t max_nr_active_zones; >> +} zbd_dev; > > This struct isn't use by this patch. Please move this change into the > patch that uses struct zbd_dev. > > QEMU coding style naming would be BlockZoneDev instead of zbd_dev. > > I think this struct contains the block limits fields that could be added > to QEMU's BlockLimits? A new struct may not be necessary. > >> >> struct BlockDriver { >> /* >> @@ -691,6 +705,12 @@ struct BlockDriver { >>QEMUIOVector *qiov, >>int64_t pos); >> >> +int coroutine_fn (*bdrv_co_zone_report)(BlockDriverState *bs, >> +int64_t offset, int64_t *nr_zones, >> +BlockZoneDescriptor *zones); >> +int coroutine_fn (*bdrv_co_zone_mgmt)(BlockDriverState *bs, enum >> zone_op op, >> +int64_t offset, int64_t len); >> + >> /* removable device specific */ >> bool (*bdrv_is_inserted)(BlockDriverState *bs); >> void (*bdrv_eject)(BlockDriverState *bs, bool eject_flag); >> -- >> 2.36.1 >> -- Damien Le Moal Western Digital Research

Re: [RFC v4 9/9] qapi: add support for zoned host device

2022-07-12 Thread Damien Le Moal
_DEVICE' } > ] } This needs to be something like: { 'name': 'zoned_host_device', 'if': 'CONFIG_BLKZONED' } ] } And we need to make sure CONFIG_BLKZONED is defined if and only if we also have HAVE_HOST_BLOCK_DEVICE. > > ## > # @BlockdevOptionsFile: > @@ -4329,7 +4330,9 @@ >'vhdx': 'BlockdevOptionsGenericFormat', >'vmdk': 'BlockdevOptionsGenericCOWFormat', >'vpc':'BlockdevOptionsGenericFormat', > - 'vvfat': 'BlockdevOptionsVVFAT' > + 'vvfat': 'BlockdevOptionsVVFAT', > + 'zoned_host_device': { 'type': 'BlockdevOptionsFile', > + 'if': 'HAVE_HOST_BLOCK_DEVICE' } Same here I think. >} } > > ## -- Damien Le Moal Western Digital Research

Re: [RFC v4 4/9] file-posix: introduce get_sysfs_str_val for device zoned model.

2022-07-12 Thread Damien Le Moal
unsigned long ioctl_op; > diff --git a/include/block/block-common.h b/include/block/block-common.h > index 78cddeeda5..35e00afe8e 100644 > --- a/include/block/block-common.h > +++ b/include/block/block-common.h > @@ -56,8 +56,8 @@ typedef enum zone_op { > } zone_op; > > typedef enum zone_model { > -BLK_Z_HM, > -BLK_Z_HA, > +BLK_Z_HM = 0x1, > +BLK_Z_HA = 0x2, > } zone_model; > > typedef enum BlkZoneCondition { -- Damien Le Moal Western Digital Research

Re: [RFC v4 3/9] file-posix: introduce get_sysfs_long_val for a block queue of sysfs attribute

2022-07-12 Thread Damien Le Moal
d, &file); > +if (mod != BLK_Z_HM) { > +ret = -ENOTSUP; > +return ret; > +} > + > +zone_size = get_sysfs_long_val(fd, &file, "chunk_sectors"); > zone_size_mask = zone_size - 1; > if (offset & zone_size_mask) { > error_report("offset is not the start of a zone"); This hunk that modifies raw_refresh_limits() should go into another patch. Likely, you want that as part of patch 1, so this patch to introduce the get_sysfs_long_val() helper function should come before patch 1. -- Damien Le Moal Western Digital Research

Re: [RFC v4 1/9] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls.

2022-07-12 Thread Damien Le Moal
K_ZT_SWR = 0x2, > +BLK_ZT_SWP = 0x3, > +} BlkZoneType; > + > +/* > + * Zone descriptor data structure. > + * Provide information on a zone with all position and size values in bytes. > + */ > +typedef struct BlockZoneDescriptor { > + uint64_t start; > +uint64_t length; > +uint64_t cap; > +uint64_t wp; > +BlkZoneType type; > +BlkZoneCondition cond; > +} BlockZoneDescriptor; > + > typedef struct BlockDriverInfo { > /* in bytes, 0 if irrelevant */ > int cluster_size; > diff --git a/include/block/block_int-common.h > b/include/block/block_int-common.h > index 8947abab76..6037871089 100644 > --- a/include/block/block_int-common.h > +++ b/include/block/block_int-common.h > @@ -94,6 +94,20 @@ typedef struct BdrvTrackedRequest { > struct BdrvTrackedRequest *waiting_for; > } BdrvTrackedRequest; > > +/** > + * Zone device information data structure. > + * Provide information on a device. > + */ > +typedef struct zbd_dev { > +uint32_t zone_size; > +zone_model model; > +uint32_t block_size; > +uint32_t write_granularity; > +uint32_t nr_zones; > +struct BlockZoneDescriptor *zones; /* array of zones */ > +uint32_t max_nr_open_zones; /* maximum number of explicitly open zones */ > +uint32_t max_nr_active_zones; > +} zbd_dev; > > struct BlockDriver { > /* > @@ -691,6 +705,12 @@ struct BlockDriver { >QEMUIOVector *qiov, >int64_t pos); > > +int coroutine_fn (*bdrv_co_zone_report)(BlockDriverState *bs, > +int64_t offset, int64_t *nr_zones, > +BlockZoneDescriptor *zones); > +int coroutine_fn (*bdrv_co_zone_mgmt)(BlockDriverState *bs, enum zone_op > op, > +int64_t offset, int64_t len); > + > /* removable device specific */ > bool (*bdrv_is_inserted)(BlockDriverState *bs); > void (*bdrv_eject)(BlockDriverState *bs, bool eject_flag); -- Damien Le Moal Western Digital Research

Re: [RFC v4 2/9] qemu-io: add zoned block device operations.

2022-07-12 Thread Damien Le Moal
gt; +offset = cvtnum(argv[optind]); > +++optind; > +len = cvtnum(argv[optind]); > +ret = blk_zone_mgmt(blk, zone_reset, offset, len); > +if (ret < 0) { > +printf("zone reset failed: %s\n", strerror(-ret)); > +} > +return ret; > +} > + > +static const cmdinfo_t zone_reset_cmd = { > +.name = "zone_reset", > +.altname = "zrs", > +.cfunc = zone_reset_f, > +.argmin = 2, > +.argmax = 2, > +.args = "offset len", > +.oneline = "reset a zone write pointer in zone block device", > +}; > + > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2498,6 +2636,11 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&aio_write_cmd); > qemuio_add_command(&aio_flush_cmd); > qemuio_add_command(&flush_cmd); > +qemuio_add_command(&zone_report_cmd); > +qemuio_add_command(&zone_open_cmd); > +qemuio_add_command(&zone_close_cmd); > +qemuio_add_command(&zone_finish_cmd); > +qemuio_add_command(&zone_reset_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); -- Damien Le Moal Western Digital Research

Re: [RFC v4 4/9] file-posix: introduce get_sysfs_str_val for device zoned model.

2022-07-12 Thread Damien Le Moal
t;> const char *ioctl_name; >> unsigned long ioctl_op; >> diff --git a/include/block/block-common.h b/include/block/block-common.h >> index 78cddeeda5..35e00afe8e 100644 >> --- a/include/block/block-common.h >> +++ b/include/block/block-common.h >> @@ -56,8 +56,8 @@ typedef enum zone_op { >> } zone_op; >> >> typedef enum zone_model { >> -BLK_Z_HM, >> -BLK_Z_HA, >> +BLK_Z_HM = 0x1, >> +BLK_Z_HA = 0x2, >> } zone_model; >> >> typedef enum BlkZoneCondition { > > This hunk is unrelated, please move it into a different patch. This actually belong to patch 1 where this enum is introduced. No need for a different patch. > > Cheers, > > Hannes -- Damien Le Moal Western Digital Research

Re: [RFC v3 1/5] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls.

2022-06-28 Thread Damien Le Moal
gt;> >> Declare at the beginning of the function: >> g_autofree struct blk_zone_report *rep = NULL; >> >> And then when needed do: >> >> rep_size = sizeof(struct blk_zone_report) + nrz * sizeof(struct blk_zone); >> rep = g_malloc(rep_size); > > Actua

Re: [RFC v3 1/5] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls.

2022-06-28 Thread Damien Le Moal
On 6/28/22 19:23, Sam Li wrote: > Damien Le Moal 于2022年6月28日周二 17:05写道: >> >> On 6/28/22 17:05, Sam Li wrote: >>> Stefan Hajnoczi 于2022年6月28日周二 14:52写道: >>>> >>>> On Mon, Jun 27, 2022 at 08:19:13AM +0800, Sam Li wrote: >>>>> diff --

Re: [RFC v3 2/5] qemu-io: add zoned block device operations.

2022-06-28 Thread Damien Le Moal
t;> +len = cvtnum(argv[optind]); >> +return blk_zone_mgmt(blk, zone_open, offset, len); > > Where is the error reported? When I look at read_f() I see: > > if (ret < 0) { > printf("read failed: %s\n", strerror(-ret)); > > I think something similar is needed because qemu-io.c does not print an > error message for us. The same is true for the other commands defined in > this patch. > >> +} >> + >> +static const cmdinfo_t zone_open_cmd = { >> +.name = "zone_open", >> +.altname = "f", >> +.cfunc = zone_open_f, >> +.argmin = 2, >> +.argmax = 2, >> +.args = "offset [offset..] len [len..]", > > There are no optional offset/len args. The same is true for the other > commands below. -- Damien Le Moal Western Digital Research

Re: [RFC v3 1/5] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls.

2022-06-28 Thread Damien Le Moal
if (ret) { >>> +return -1; >> >> The return value should be a negative errno. -1 is -EPERM but it's >> probably not that error code you wanted. This should be: >> >> return -errno; >> >>> +} >>> + >>> +zone_size_mask = zone_size - 1; >>> +if (offset & zone_size_mask) { >>> +error_report("offset is not the start of a zone"); >>> +return -1; >> >> return -EINVAL; >> >>> +} >>> + >>> +if (len & zone_size_mask) { >>> +error_report("len is not aligned to zones"); >>> +return -1; >> >> return -EINVAL; >> >>> +static int coroutine_fn raw_co_zone_report(BlockDriverState *bs, int64_t >>> offset, >>> +int64_t len, int64_t *nr_zones, >>> +BlockZoneDescriptor *zones) { >>> +BDRVRawState *s = bs->opaque; >>> +RawPosixAIOData acb; >>> + >>> +acb = (RawPosixAIOData) { >>> +.bs = bs, >>> +.aio_fildes = s->fd, >>> +.aio_type = QEMU_AIO_IOCTL, >>> +.aio_offset = offset, >>> +.aio_nbytes = len, >>> +.zone_report= { >>> +.nr_zones = nr_zones, >>> +.zones = zones, >>> +}, >> >> Indentation is off here. Please use 4-space indentation. > Noted! > > Thanks for reviewing! > > Sam -- Damien Le Moal Western Digital Research

Re: [RFC v2] Adding block layer APIs resembling Linux ZoneBlockDevice ioctls.

2022-06-27 Thread Damien Le Moal
t;> sub-commands. If we use #ifdef __linux__ in block-common.h, it can be >> responsible for converting Linux constants instead. >> >> Thanks for reviewing! If there is any problem, please let me know. > > I suggest defining the constants in block-common.h. #ifdef __linux__ is > not necessary in block-common.h because the constants should just be an > enum with BLK_ZS_NOT_WP = 0 and so on (no need for Linux headers). > > In block/file-posix.c you can define a helper function inside #ifdef > __linux__ that does something like: > > BlkZoneCond zone_cond_from_linux(enum blk_zone_cond val) > { > switch (val) { > case BLK_ZONE_COND_NOT_WP: > return BLK_ZS_NOT_WP; > ... > } > > The code in block/file-posix.c should call this helper to convert from > Linux values to QEMU values. Given that the entire zone API is Linux specific (as far as I know), we do not need to have these helpers: the entire code for zones needs to be under a #ifdef __linux__. But the conversion from Linux struct blk_zone to qemu zone descriptor still needs to be done. And the perfect place to do this is the parse_zone() function. There, we can add: switch (blkz->cond) { case BLK_ZONE_COND_NOT_WP: zone->cond = BLK_ZS_NOT_WP; break; ... } And same for zone type. That will also allow checking the values returned by Linux. ZBC-2 defines more zone types and zone conditions than currently defined in /usr/include/linux/blkzoned.h. If these new zone types/conditions ever get supported by Linux, qemu can catch the values it does not support and reject the drive. > > This way the QEMU block layer does not use Linux constants and compiles > on non-Linux machines. > > Stefan -- Damien Le Moal Western Digital Research

Re: [RFC v2] Adding block layer APIs resembling Linux ZoneBlockDevice ioctls.

2022-06-20 Thread Damien Le Moal
BLK_Z_HA, >> +}; >> >> typedef struct BlockDriverInfo { >> /* in bytes, 0 if irrelevant */ >> diff --git a/include/block/block-io.h b/include/block/block-io.h >> index 62c84f0519..deb8902684 100644 >> --- a/include/block/block-io.h >> +++ b/include/block/block-io.h >> @@ -80,6 +80,13 @@ int bdrv_co_ioctl(BlockDriverState *bs, int req, void >> *buf); >> /* Ensure contents are flushed to disk. */ >> int coroutine_fn bdrv_co_flush(BlockDriverState *bs); >> >> +/* Report zone information of zone block device. */ >> +int coroutine_fn bdrv_co_zone_report(BlockDriverState *bs, int64_t offset, >> + int64_t len, int64_t *nr_zones, >> + struct BlockZoneDescriptor *zones); >> +int coroutine_fn bdrv_co_zone_mgmt(BlockDriverState *bs, enum zone_op op, >> +int64_t offset, int64_t len); >> + >> int bdrv_co_pdiscard(BdrvChild *child, int64_t offset, int64_t bytes); >> bool bdrv_can_write_zeroes_with_unmap(BlockDriverState *bs); >> int bdrv_block_status(BlockDriverState *bs, int64_t offset, >> @@ -290,6 +297,12 @@ bdrv_readv_vmstate(BlockDriverState *bs, QEMUIOVector >> *qiov, int64_t pos); >> int generated_co_wrapper >> bdrv_writev_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos); >> >> +int generated_co_wrapper blk_zone_report(BlockBackend *blk, int64_t offset, >> + int64_t len, int64_t *nr_zones, >> + struct BlockZoneDescriptor *zones); >> +int generated_co_wrapper blk_zone_mgmt(BlockBackend *blk, enum zone_op op, >> +int64_t offset, int64_t len); >> + >> /** >> * bdrv_parent_drained_begin_single: >> * >> diff --git a/include/block/block_int-common.h >> b/include/block/block_int-common.h >> index 8947abab76..4d0180a7da 100644 >> --- a/include/block/block_int-common.h >> +++ b/include/block/block_int-common.h >> @@ -63,6 +63,7 @@ >> #define BLOCK_OPT_EXTL2 "extended_l2" >> >> #define BLOCK_PROBE_BUF_SIZE512 >> +#define zone_start_sector 512 >> >> enum BdrvTrackedRequestType { >> BDRV_TRACKED_READ, >> @@ -94,6 +95,19 @@ typedef struct BdrvTrackedRequest { >> struct BdrvTrackedRequest *waiting_for; >> } BdrvTrackedRequest; >> >> +/** >> + * Zone device information data structure. >> + * Provide information on a device. >> + */ >> +typedef struct zbd_dev { >> +enum zone_model model; >> +uint32_t block_size; > > How does this relate to QEMU's BlockLimits? > >> +uint32_t write_granularity; > > Same. Maybe this belongs in BlockLimits? > >> +uint32_t nr_zones; > > Should this really be limited to 32-bit? For example, take 256 MB zones, > then the max nr_zones * 256 MB is much smaller than a uint64_t capacity > value. It seems safer to make this 64-bit, but maybe Dmitry, Hannes, or > Damien can tell us what to do here. u32 is fine. We are nowhere near 4G zones :) The max out there today is 20TB SMR drive with 128MB zones. About 150,000 zones. Nowhere near 4G limit. Linux kernel also uses unsigned int for number of zones everywhere. > >> +struct BlockZoneDescriptor *zones; /* array of zones */ > > When is this used? This will be needed when zone append implementation comes. But not there yet :) A simpler form of the zone descriptor will likely be better anyway to save memory. -- Damien Le Moal Western Digital Research

Re: [RFC v1] Adding block layer APIs resembling Linux ZoneBlockDevice ioctls.

2022-06-13 Thread Damien Le Moal
tic const cmdinfo_t zone_report_cmd = { >>> + .name = "zone_report", >>> + .altname = "f", >>> +.cfunc = zone_report_f, >>> +.oneline = "report zone information in zone block device", >>> +}; >>> + >>> +static int zone_open_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +return blk_zone_mgmt(blk, zone_open); >>> +} >>> + >>> +static const cmdinfo_t zone_open_cmd = { >>> +.name = "zone_open", >>> +.altname = "f", >>> +.cfunc = zone_open_f, >>> +.oneline = "explicit open a range of zones in zone block device", >>> +}; >>> + >>> +static int zone_close_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +return blk_zone_mgmt(blk, zone_close); >>> +} >>> + >>> +static const cmdinfo_t zone_close_cmd = { >>> +.name = "zone_close", >>> +.altname = "f", >>> +.cfunc = zone_close_f, >>> +.oneline = "close a range of zones in zone block device", >>> +}; >>> + >>> +static int zone_finish_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +return blk_zone_mgmt(blk, zone_finish); >>> +} >>> + >>> +static const cmdinfo_t zone_finish_cmd = { >>> +.name = "zone_finish", >>> +.altname = "f", >>> +.cfunc = zone_finish_f, >>> +.oneline = "finish a range of zones in zone block device", >>> +}; >>> + >>> +static int zone_reset_f(BlockBackend *blk, int argc, char **argv) >>> +{ >>> +return blk_zone_mgmt(blk, zone_reset); >>> +} >>> + >>> +static const cmdinfo_t zone_reset_cmd = { >>> +.name = "zone_reset", >>> +.altname = "f", >>> +.cfunc = zone_reset_f, >>> +.oneline = "reset a zone write pointer in zone block device", >>> +}; >>> + >>> + >>> static int truncate_f(BlockBackend *blk, int argc, char **argv); >>> static const cmdinfo_t truncate_cmd = { >>> .name = "truncate", >>> @@ -2498,6 +2559,11 @@ static void __attribute((constructor)) >>> init_qemuio_commands(void) >>> qemuio_add_command(&aio_write_cmd); >>> qemuio_add_command(&aio_flush_cmd); >>> qemuio_add_command(&flush_cmd); >>> +qemuio_add_command(&zone_report_cmd); >>> +qemuio_add_command(&zone_open_cmd); >>> +qemuio_add_command(&zone_close_cmd); >>> +qemuio_add_command(&zone_finish_cmd); >>> +qemuio_add_command(&zone_reset_cmd); >>> qemuio_add_command(&truncate_cmd); >>> qemuio_add_command(&length_cmd); >>> qemuio_add_command(&info_cmd); >>> diff --git a/tests/qemu-iotests/tests/zoned.sh >>> b/tests/qemu-iotests/tests/zoned.sh >>> new file mode 100644 >>> index 00..a1596e8340 >>> --- /dev/null >>> +++ b/tests/qemu-iotests/tests/zoned.sh >>> @@ -0,0 +1,47 @@ >>> +#!/usr/bin/env bash >>> +# >>> +# Test zone management operations. >>> +# >>> + >>> + >>> + >>> +seq=`basename $0` >>> +echo "QA output created by $seq" >>> + >>> +status=1 # failure is the default! >>> + >>> +_cleanup() >>> +{ >>> + _cleanup_test_img >>> +} >>> +trap "_cleanup; exit \$status" 0 1 2 3 15 >>> + >>> +# get standard environment, filters and checks >>> +. ./common.rc >>> +. ./common.filter >>> +. ./common.pattern >>> + >>> +# much of this could be generic for any format supporting compression. >>> +_supported_fmt qcow qcow2 >>> +_supported_proto file >>> +_supported_os Linux >>> + >>> +TEST_OFFSETS="0 1000 4000" >>> +TEST_LENS="1000" >>> +TEST_OPS="zone_report zone_open zone_close zone_finish zone_reset >>> zone_append" >>> + >>> + >>> +echo "Testing a null_blk device" >>> +echo >>> + >>> +for offset in $TEST_OFFSETS; do >>> +echo "At offset $offset:" >>> +io_test "write -b" $offset $CLUSTER_SIZE 8 >>> +io_test "read -b" $offset $CLUSTER_SIZE 8 >>> +_check_test_img >>> +done >>> + >>> +# success, all done >>> +echo "*** done" >>> +rm -f $seq.full >>> +status=0 >> >> >> -- >> Damien Le Moal >> Western Digital Research > -- Damien Le Moal Western Digital Research

Re: [RFC v1] Adding block layer APIs resembling Linux ZoneBlockDevice ioctls.

2022-06-12 Thread Damien Le Moal
ot;f", > +.cfunc = zone_reset_f, > +.oneline = "reset a zone write pointer in zone block device", > +}; > + > + > static int truncate_f(BlockBackend *blk, int argc, char **argv); > static const cmdinfo_t truncate_cmd = { > .name = "truncate", > @@ -2498,6 +2559,11 @@ static void __attribute((constructor)) > init_qemuio_commands(void) > qemuio_add_command(&aio_write_cmd); > qemuio_add_command(&aio_flush_cmd); > qemuio_add_command(&flush_cmd); > +qemuio_add_command(&zone_report_cmd); > +qemuio_add_command(&zone_open_cmd); > +qemuio_add_command(&zone_close_cmd); > +qemuio_add_command(&zone_finish_cmd); > +qemuio_add_command(&zone_reset_cmd); > qemuio_add_command(&truncate_cmd); > qemuio_add_command(&length_cmd); > qemuio_add_command(&info_cmd); > diff --git a/tests/qemu-iotests/tests/zoned.sh > b/tests/qemu-iotests/tests/zoned.sh > new file mode 100644 > index 00..a1596e8340 > --- /dev/null > +++ b/tests/qemu-iotests/tests/zoned.sh > @@ -0,0 +1,47 @@ > +#!/usr/bin/env bash > +# > +# Test zone management operations. > +# > + > + > + > +seq=`basename $0` > +echo "QA output created by $seq" > + > +status=1 # failure is the default! > + > +_cleanup() > +{ > + _cleanup_test_img > +} > +trap "_cleanup; exit \$status" 0 1 2 3 15 > + > +# get standard environment, filters and checks > +. ./common.rc > +. ./common.filter > +. ./common.pattern > + > +# much of this could be generic for any format supporting compression. > +_supported_fmt qcow qcow2 > +_supported_proto file > +_supported_os Linux > + > +TEST_OFFSETS="0 1000 4000" > +TEST_LENS="1000" > +TEST_OPS="zone_report zone_open zone_close zone_finish zone_reset > zone_append" > + > + > +echo "Testing a null_blk device" > +echo > + > +for offset in $TEST_OFFSETS; do > +echo "At offset $offset:" > +io_test "write -b" $offset $CLUSTER_SIZE 8 > +io_test "read -b" $offset $CLUSTER_SIZE 8 > +_check_test_img > +done > + > +# success, all done > +echo "*** done" > +rm -f $seq.full > +status=0 -- Damien Le Moal Western Digital Research

Re: Outreachy project task: Adding QEMU block layer APIs resembling Linux ZBD ioctls.

2022-06-01 Thread Damien Le Moal
qemu/block/file-posix.c) using the BlockDriver data type (defined as a struct in qemu/include/block/block_int-common.h), you will see that the driver callback functions do not use a file descriptor (fd) but a pointer to some data structure (most of the time a BlockDriverState pointer). Another thing: you will need one more callback to get the device information related to zones: maximum number of open and active zones at least (the number of zones can be discovered with a report zones call). Cheers. -- Damien Le Moal Western Digital Research

Re: Outreachy project task: Adding QEMU block layer APIs resembling Linux ZBD ioctls.

2022-05-30 Thread Damien Le Moal
On 2022/05/30 20:22, Stefan Hajnoczi wrote: > On Mon, 30 May 2022 at 06:38, Damien Le Moal wrote: >> On 2022/05/30 14:09, Sam Li wrote: >> Once you have an API working and the ability to execute all zone operations >> from >> a guest, you can then work on add

Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set

2020-09-29 Thread Damien Le Moal
On 2020/09/29 19:46, Klaus Jensen wrote: > On Sep 28 22:54, Damien Le Moal wrote: >> On 2020/09/29 6:25, Keith Busch wrote: >>> On Mon, Sep 28, 2020 at 08:36:48AM +0200, Klaus Jensen wrote: >>>> On Sep 28 02:33, Dmitry Fomichev wrote: >>>>> You are maki

Re: [PATCH v4 00/14] hw/block/nvme: Support Namespace Types and Zoned Namespace Command Set

2020-09-28 Thread Damien Le Moal
to get developers started with testing ZNS software with QEMU. That is the most important goal here. 5.9 is around the corner, we need something for people to get started with ZNS quickly. -- Damien Le Moal Western Digital Research