On 11.02.21 21:00, Eric Blake wrote:
On 2/11/21 11:22 AM, Max Reitz wrote:
We have repeatedly received reports that SEEK_HOLE and SEEK_DATA are
slow on certain filesystems and/or under certain circumstances.  That is
why we generally try to avoid it (which is why bdrv_co_block_status()
has the @want_zero parameter, and which is why qcow2 has a metadata
preallocation detection, so we do not fall through to the protocol layer
to discover which blocks are zero, unless that is really necessary
(i.e., for metadata-preallocated images)).

In addition to those measures, we can also try to speed up zero
detection by letting file-posix cache some hole location information,
namely where the next hole after the most recently queried offset is.
This helps especially for images that are (nearly) fully allocated,
which is coincidentally also the case where querying for zero
information cannot gain us much.

Note that this of course only works so long as we have no concurrent
writers to the image, which is the case when the WRITE capability is not
shared.

Alternatively (or perhaps as an improvement in the future), we could let
file-posix keep track of what it knows is zero and what it knows is
non-zero with bitmaps, which would help images that actually have a
significant number of holes (where this implementation here cannot do
much).  But for such images, SEEK_HOLE/DATA are generally faster (they
do not need to seek through the whole file), and the performance lost by
querying the block status does not feel as bad because it is outweighed
by the performance that can be saved by special-cases zeroed areas, so
focussing on images that are (nearly) fully allocated is more important.

focusing

Wiktionary lists both as valid. *shrug*


Signed-off-by: Max Reitz <mre...@redhat.com>
---
  block/file-posix.c | 81 +++++++++++++++++++++++++++++++++++++++++++++-
  1 file changed, 80 insertions(+), 1 deletion(-)


  static int find_allocation(BlockDriverState *bs, off_t start,
                             off_t *data, off_t *hole)
  {
-#if defined SEEK_HOLE && defined SEEK_DATA
      BDRVRawState *s = bs->opaque;
+
+    if (s->next_zero_offset_valid) {
+        if (start >= s->next_zero_offset_from && start < s->next_zero_offset) {
+            *data = start;
+            *hole = s->next_zero_offset;
+            return 0;
+        }
+    }
+
+#if defined SEEK_HOLE && defined SEEK_DATA

Why move the #if? If SEEK_HOLE is not defined, s->next_zero_offset_valid
should never be set, because we'll treat the entire image as data.  But
at the same time, it doesn't hurt, so doesn't stop my review.

I found it better to put it outside the SEEK_HOLE/DATA section, because while it won’t work when neither are defined, the code is still valid.

I don’t know. :)

Reviewed-by: Eric Blake <ebl...@redhat.com>

Thanks!

Max


Reply via email to