On 4/30/20 6:10 AM, Vladimir Sementsov-Ogievskiy wrote:
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, prepare bdrv_aligned_pwritev() now and convert the dependencies:
bdrv_co_write_req_prepare() and bdrv_co_write_req_finish() to signed
type bytes.

Series: 64bit-block-status
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
---
  block/io.c | 17 ++++++++++-------
  1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/block/io.c b/block/io.c
index b83749cc50..8bb4ea6285 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1686,12 +1686,11 @@ fail:
  }
static inline int coroutine_fn
-bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
+bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, int64_t bytes,
                            BdrvTrackedRequest *req, int flags)

Changes from unsigned to signed.  Audit of callers:

bdrv_aligned_pwritev() - adjusted in this patch, safe
bdrv_do_pdiscard() - passes int64_t, safe
bdrv_co_copy_range_internal() - passes int64_t, safe
bdrv_do_truncate() - passes int64_t, safe

Internal usage:

  {
      BlockDriverState *bs = child->bs;
      bool waited;
-    int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);

Drops an old sector calculation, and replaces it with:

if (bs->read_only) {
          return -EPERM;
@@ -1716,8 +1715,10 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
      }
assert(req->overlap_offset <= offset);
+    assert(offset <= INT64_MAX - bytes);
      assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
-    assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
+    assert(offset + bytes <= bs->total_sectors * BDRV_SECTOR_SIZE ||
+           child->perm & BLK_PERM_RESIZE);

These assertions prove that the values fit within 63 bits.  Safe.

[The req->overlap_offset + req->overlap_bytes calculation used to be unsigned, but was changed to be signed earlier in this series.]
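
To illustrate why the new assertion is written as 'offset <= INT64_MAX - bytes' instead of testing the sum directly, here is a minimal standalone sketch. request_range_ok() and its 'limit' parameter are hypothetical names for illustration, not anything in block/io.c; the only point is that the subtraction form cannot overflow once bytes is known to be non-negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): returns true when
 * [offset, offset + bytes) is a valid range that ends at or before limit.
 */
static bool request_range_ok(int64_t offset, int64_t bytes, int64_t limit)
{
    if (offset < 0 || bytes < 0) {
        return false;
    }
    /*
     * Check before adding.  INT64_MAX - bytes cannot overflow because
     * bytes >= 0, whereas offset + bytes could (signed overflow is
     * undefined behavior in C).
     */
    if (offset > INT64_MAX - bytes) {
        return false;
    }
    return offset + bytes <= limit;
}
```

With this ordering, the later 'offset + bytes' comparisons are only evaluated after the sum is known to be representable in int64_t.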

switch (req->type) {
      case BDRV_TRACKED_WRITE:
@@ -1738,7 +1739,7 @@ bdrv_co_write_req_prepare(BdrvChild *child, int64_t offset, uint64_t bytes,
  }
static inline void coroutine_fn
-bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
+bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, int64_t bytes,
                           BdrvTrackedRequest *req, int ret)
  {

Similar to the above; same four callers, all pass int64_t.


      int64_t end_sector = DIV_ROUND_UP(offset + bytes, BDRV_SECTOR_SIZE);

This computation needs analysis.  Previously, we had:

DIV_ROUND_UP(int64_t + uint64_t, unsigned long long)
which expands to:
(((uint64_t) + (ull) - int) / (ull))
which simplifies to uint64_t.

Now we have:
DIV_ROUND_UP(int64_t + int64_t, ull)
Okay, in spite of our argument changing type, the macro still results in a 64-bit unsigned answer. Either way, that answer fits within 63 bits, so it is safe when assigned to int64_t.
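
The type analysis above can be checked with a small standalone sketch. The macro definitions here are assumptions chosen to match the expansion shown above (they are not copied from the QEMU headers), SECTOR_SIZE is a stand-in for BDRV_SECTOR_SIZE, and IS_ULL() and end_sector() are hypothetical helpers. It relies on C11 _Generic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed shape of the macro, matching the expansion discussed above. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Stand-in for BDRV_SECTOR_SIZE: an unsigned long long constant. */
#define SECTOR_SIZE (1ULL << 9)

/* Hypothetical probe: 1 if the expression has type unsigned long long. */
#define IS_ULL(expr) _Generic((expr), unsigned long long: 1, default: 0)

/*
 * Both operands are signed, but adding the unsigned long long divisor
 * converts the whole expression to unsigned long long.  The result is
 * then converted back to int64_t on return, which is value-preserving
 * as long as the quotient fits in 63 bits.
 */
static int64_t end_sector(int64_t offset, int64_t bytes)
{
    return DIV_ROUND_UP(offset + bytes, SECTOR_SIZE);
}
```

IS_ULL(DIV_ROUND_UP((int64_t)0 + (int64_t)0, SECTOR_SIZE)) evaluates to 1, as does the variant with a uint64_t second operand, so the signedness of 'bytes' does not change the type of the macro's result.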

Also in this function:
            stat64_max(&bs->wr_highest_offset, offset + bytes);
in include/qemu/stats64.h, takes uint64_t parameter, but we're passing a positive 63-bit number - safe
            bdrv_set_dirty(bs, offset, bytes);
in block/dirty-bitmap.c, takes int64_t parameter - safe

@@ -1780,14 +1781,14 @@ bdrv_co_write_req_finish(BdrvChild *child, int64_t offset, uint64_t bytes,
   * after possibly fragmenting it.
   */
  static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
-    BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
+    BdrvTrackedRequest *req, int64_t offset, int64_t bytes,
      int64_t align, QEMUIOVector *qiov, size_t qiov_offset, int flags)
  {

Changes signature from unsigned 32-bit to signed 64-bit.  Callers:

bdrv_co_do_zero_pwritev() - passes int64_t, but that value was clamped either to pad.buf_len [BdrvRequestPadding uses 'size_t buf_len', but bdrv_init_padding() initializes it to at most 2*align] or to align, which is set from BlockLimits.request_alignment (naturally uint32_t, but documented as 'a power of 2 less than INT_MAX', which is at most 1G). So the old code never overflowed, and the new code introduces no change.

Perhaps we should separately fix BdrvRequestPadding to use a saner type than size_t for continuity between 32- and 64-bit platforms (perhaps uint32_t rather than int64_t, since we know our padding is bounded by request_alignment), but that doesn't impact this patch.

bdrv_do_pwritev_part() - still passes unsigned int at this point in the series, safe

Usage within the function:

      BlockDriverState *bs = child->bs;
      BlockDriver *drv = bs->drv;
      int ret;
-    uint64_t bytes_remaining = bytes;
+    int64_t bytes_remaining = bytes;

Previously we widened unsigned 32-bit into unsigned 64-bit; now we use signed 64-bit unchanged.

      int max_transfer;
if (!drv) {
@@ -1799,6 +1800,8 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
      }
assert(is_power_of_2(align));
+    assert(offset >= 0);
+    assert(bytes >= 0);
      assert((offset & (align - 1)) == 0);
      assert((bytes & (align - 1)) == 0);
      assert(!qiov || qiov_offset + bytes <= qiov->size);

qiov->size is only size_t, while 'qiov_offset + bytes' changed from 'size_t + unsigned int' to 'size_t + int64_t'. The resulting type of the computation changes for some platforms, but the assertion is proving that things still fit (including in 32 bits, when size_t is constrained).

    ret = bdrv_co_write_req_prepare(child, offset, bytes, req, flags);
also touched in this patch, safe

        qemu_iovec_is_zero(qiov, qiov_offset, bytes)) {
Passes an 'int64_t' to a 'size_t' parameter, which is possibly narrowing. Fortunately, the assertions just above prove that by this point, we are constrained by qiov->size, which is also size_t. Safe.
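
The reasoning can be sketched in isolation as follows. checked_len() is a hypothetical helper, not a QEMU function, and it restates the bound in subtraction form so the check itself cannot wrap; the patch keeps the original 'qiov_offset + bytes <= qiov->size' assertion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): narrow a signed 64-bit byte
 * count to size_t only after proving it fits inside a buffer of
 * qiov_size bytes, starting at qiov_offset.
 */
static size_t checked_len(int64_t bytes, size_t qiov_offset, size_t qiov_size)
{
    assert(bytes >= 0);
    assert(qiov_offset <= qiov_size);
    /*
     * qiov_size - qiov_offset cannot wrap thanks to the assertion above,
     * and a value no larger than a size_t quantity is representable
     * in size_t, so the cast below loses nothing.
     */
    assert((uint64_t)bytes <= qiov_size - qiov_offset);
    return (size_t)bytes;
}
```

On a 32-bit host, where size_t is narrower than int64_t, an oversized 'bytes' trips the last assertion instead of being silently truncated.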

        ret = bdrv_co_do_pwrite_zeroes(bs, offset, bytes, flags);
Passes to int64_t, safe

        ret = bdrv_driver_pwritev_compressed(bs, offset, bytes,
Passes to int64_t, safe

ret = bdrv_driver_pwritev(bs, offset, bytes, qiov, qiov_offset, flags);
Passes to int64_t, safe

            ret = bdrv_driver_pwritev(bs, offset + bytes - bytes_remaining,
                                      num, qiov, bytes - bytes_remaining,
Passes int64_t to size_t parameter, but the previous assertion proved we did not overflow qiov->size which is size_t. Safe

    bdrv_co_write_req_finish(child, offset, bytes, req, ret);
also touched in this patch, safe

@@ -1899,7 +1902,7 @@ static int coroutine_fn bdrv_co_do_zero_pwritev(BdrvChild *child,
      assert(!bytes || (offset & (align - 1)) == 0);
      if (bytes >= align) {
          /* Write the aligned part in the middle. */
-        uint64_t aligned_bytes = bytes & ~(align - 1);
+        int64_t aligned_bytes = bytes & ~(align - 1);
          ret = bdrv_aligned_pwritev(child, req, offset, aligned_bytes, align,
                                     NULL, 0, flags);
          if (ret < 0) {
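
For reference, the 'bytes & ~(align - 1)' idiom in this last hunk rounds down to a multiple of align. A minimal sketch follows; align_down() is a hypothetical name, and giving both parameters type int64_t is an assumption made for the example, not a statement about the types used in bdrv_co_do_zero_pwritev().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): round bytes down to a
 * multiple of align, where align is a positive power of two.
 */
static int64_t align_down(int64_t bytes, int64_t align)
{
    assert(bytes >= 0);
    assert(align > 0 && (align & (align - 1)) == 0);
    /*
     * align - 1 has only the low bits set; inverting it gives a mask
     * that clears those bits.  For non-negative bytes the result is
     * non-negative and never larger than bytes.
     */
    return bytes & ~(align - 1);
}
```

For example, align_down(1000, 512) yields 512 and align_down(511, 512) yields 0, so the result always fits in the same signed type as the input.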


Reviewed-by: Eric Blake <ebl...@redhat.com>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

