On Sat, Mar 23, 2019 at 02:45:44PM -0500, Eric Blake wrote: > On 3/20/19 5:11 PM, Richard W.M. Jones wrote: > > This uses lseek SEEK_DATA/SEEK_HOLE to search for allocated data and > > holes in the underlying file. > > --- > > plugins/file/file.c | 139 ++++++++++++++++++++++++++++++++++++++++---- > > 1 file changed, 129 insertions(+), 10 deletions(-) > > > > > -int file_debug_zero; /* to enable: -D file.zero=1 */ > > +/* Any callbacks using lseek must be protected by this lock. */ > > +static pthread_mutex_t lseek_lock = PTHREAD_MUTEX_INITIALIZER; > > + > > +/* to enable: -D file.zero=1 */ > > +int file_debug_zero; > > > > static void > > file_unload (void) > > @@ -220,6 +227,21 @@ file_close (void *handle) > > > > #define THREAD_MODEL NBDKIT_THREAD_MODEL_PARALLEL > > > > +/* For block devices, stat->st_size is not the true size. */ > > +static int64_t > > +block_device_size (int fd) > > +{ > > + off_t size; > > + > > + size = lseek (fd, 0, SEEK_END); > > Calling lseek without the lock? I'm not sure if you can guarantee that > .size won't be called in parallel with some other .extents. > /me reads ahead > Oh, the caller has the lock. I might add a comment to this function that > it expects the caller to grab the lock.
Yes I'll add a comment as it wasn't even obvious to me when I looked back at the code. > > +#ifdef SEEK_HOLE > > +/* Extents. */ > > + > > +static int > > +file_can_extents (void *handle) > > +{ > > + struct handle *h = handle; > > + off_t r; > > + > > + /* A simple test to see whether SEEK_HOLE etc is likely to work on > > + * the current filesystem. > > + */ > > + pthread_mutex_lock (&lseek_lock); > > + r = lseek (h->fd, 0, SEEK_HOLE); > > + pthread_mutex_unlock (&lseek_lock); > > + if (r == -1) { > > + nbdkit_debug ("extents disabled: lseek: SEEK_HOLE: %m"); > > + return 0; > > In fact, if lseek() succeeds but returns the current file size, you know > there are no holes in the file at all, in which case, it might be more > efficient to store a variable stating that .extents should avoid lseek() > altogether and just report the entire file as data. > > Hmm - if we just make .extents return false in this case, then nbdkit > won't advertise block status to the guest even though we have valid > status available (knowing there are no holes is nicer than merely > assuming there are no holes because block status was unavailable). But > if we return true, then WE have to do that optimization ourselves. Maybe > .can_extent should be a tri-state return, just like .can_fua, where a > plugin can choose NBDKIT_EXTENT_NONE to force no block status > advertisement, NBDKIT_EXTENT_EMULATE (default if .extents is missing) to > let nbdkit do the work of reporting the entire disk as data without > calling .extents, and NBDKIT_EXTENT_NATIVE (requires .extents, and lets > the plugin do all the work). Then, when lseek(SEEK_HOLE) returns EOF, we > can return NBDKIT_EXTENT_EMULATE instead of having to track our > optimization of skipping lseek in .extents ourselves. I'm not really sure I understand this. However in the latest iteration of the series can_extents is disconnected from what we advertise back to clients. It is solely used to decide if the server calls .extents or bypasses it (returning fully allocated). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com virt-df lists disk usage of guests without needing to install any software inside the virtual machine. Supports Linux and Windows. http://people.redhat.com/~rjones/virt-df/ _______________________________________________ Libguestfs mailing list Libguestfs@redhat.com https://www.redhat.com/mailman/listinfo/libguestfs