Fwd: [FOSDEM] CfP Software Defined Storage devroom FOSDEM23

2022-11-16 Thread Niels de Vos
Hi!

In a few months' time FOSDEM will host an in-person Software Defined
Storage devroom again. It would be a great opportunity to show what
storage-related things the QEMU project has been doing, and what is
planned for the future. Please consider proposing a talk!

Thanks,
Niels


- Forwarded message from Jan Fajerski  -

> From: Jan Fajerski 
> To: fos...@lists.fosdem.org
> Cc: devroom-manag...@lists.fosdem.org
> Date: Thu, 10 Nov 2022 10:49:51 +0100
> Subject: [FOSDEM] CfP Software Defined Storage devroom FOSDEM23
> 
> FOSDEM is a free software event that offers open source communities a place to
> meet, share ideas and collaborate.  It is well known for being highly
> developer-oriented and in the past brought together 8000+ participants from
> all over the world.  Its home is in the city of Brussels (Belgium).
> 
> FOSDEM 2023 will take place as an in-person event during the weekend of
> February 4th-5th, 2023. More details about the event can be found at
> http://fosdem.org/
> 
> ** Call For Participation
> 
> The Software Defined Storage devroom will go into its seventh round for talks
> around Open Source Software Defined Storage projects, management tools
> and real world deployments.
> 
> Presentation topics could include but are not limited to:
> 
> - Your work on a SDS project like Ceph, Gluster, OpenEBS, CORTX or Longhorn
> 
> - Your work on or with SDS related projects like OpenStack SWIFT or Container
> Storage Interface
> 
> - Management tools for SDS deployments
> 
> - Monitoring tools for SDS clusters
> 
> ** Important dates:
> 
> - Dec 10th 2022:  submission deadline for talk proposals
> - Dec 15th 2022:  announcement of the final schedule
> - Feb  4th 2023:  Software Defined Storage dev room
> 
> Talk proposals will be reviewed by a steering committee:
> - Niels de Vos (Red Hat)
> - Jan Fajerski (Red Hat)
> - TBD
> 
> We also welcome additional volunteers to help with making this devroom a
> success.
> 
> Use the FOSDEM 'pentabarf' tool to submit your proposal:
> https://penta.fosdem.org/submission/FOSDEM23
> 
> - If necessary, create a Pentabarf account and activate it.
> Please reuse your account from previous years if you have
> already created it.
> https://penta.fosdem.org/user/new_account/FOSDEM23
> 
> - In the "Person" section, provide First name, Last name
> (in the "General" tab), Email (in the "Contact" tab)
> and Bio ("Abstract" field in the "Description" tab).
> 
> - Submit a proposal by clicking on "Create event".
> 
> - If you plan to register your proposal in several tracks to increase your
> chances, don't! Register your talk once, in the most accurate track.
> 
> - Presentations have to be pre-recorded before the event and will be streamed
> on the event weekend.
> 
> - Important! Select the "Software Defined Storage devroom" track
> (on the "General" tab).
> 
> - Provide the title of your talk ("Event title" in the "General" tab).
> 
> - Provide a description of the subject of the talk and the
> intended audience (in the "Abstract" field of the "Description" tab)
> 
> - Provide a rough outline of the talk or goals of the session (a short
> list of bullet points covering topics that will be discussed) in the
> "Full description" field in the "Description" tab
> 
> - Provide an expected length of your talk in the "Duration" field.
>   We suggest a length between 15 and 45 minutes.
> 
> ** Recording of talks
> 
> The FOSDEM organizers plan to have live streaming and recording fully working,
> both for remote/later viewing of talks, and so that people can watch streams
> in the hallways when rooms are full. This requires speakers to consent to
> being recorded and streamed. If you plan to be a speaker, please understand
> that by doing so you implicitly give consent for your talk to be recorded and
> streamed. The recordings will be published under the same license as all
> FOSDEM content (CC-BY).
> 
> Hope to hear from you soon! And please forward this announcement.
> 
> If you have any further questions, please write to the mailing list at
> storage-devr...@lists.fosdem.org and we will try to answer as soon as
> possible.
> 
> Thanks!
> 
> ___
> FOSDEM mailing list
> fos...@lists.fosdem.org
> https://lists.fosdem.org/listinfo/fosdem

- End forwarded message -




Re: Strange data corruption issue with gluster (libgfapi) and ZFS

2020-02-20 Thread Niels de Vos
On Thu, Feb 20, 2020 at 10:17:00AM +0100, Stefan Ring wrote:
> This list seems to be used for patches only. I will re-post to qemu-discuss.

Do include integrat...@gluster.org as well. There should be people on
that list that can help with debugging from the Gluster side.

Niels




Re: [GEDI] [PATCH 07/17] gluster: Drop useless has_zero_init callback

2020-02-17 Thread Niels de Vos
On Mon, Feb 17, 2020 at 06:03:40AM -0600, Eric Blake wrote:
> On 2/17/20 2:06 AM, Niels de Vos wrote:
> > On Fri, Jan 31, 2020 at 11:44:26AM -0600, Eric Blake wrote:
> > > block.c already defaults to 0 if we don't provide a callback; there's
> > > no need to write a callback that always fails.
> > > 
> > > Signed-off-by: Eric Blake 
> > 
> > Reviewed-by: Niels de Vos 
> > 
> 
> Per your other message,
> 
> On 2/17/20 2:16 AM, Niels de Vos wrote:
> > On Fri, Jan 31, 2020 at 11:44:31AM -0600, Eric Blake wrote:
> >> Since gluster already copies file-posix for lseek usage in block
> >> status, it also makes sense to copy it for learning if the image
> >> currently reads as all zeroes.
> >>
> 
> >> +static int qemu_gluster_known_zeroes(BlockDriverState *bs)
> >> +{
> >> +/*
> >> + * GlusterFS volume could be backed by a block device, with no way
> >
> > Actually, Gluster dropped support for volumes backed by block devices
> > (LVM) a few releases back. Nobody could be found that used it, and it
> > could not be combined with other Gluster features. All contents on a
> > Gluster volume is now always backed by 'normal' files on a filesystem.
> 
> That's useful to know.  Thanks!
> 
> >
> > Creation or truncation should behave just as on a file on a local
> > filesystem. So maybe qemu_gluster_known_zeroes is not needed at all?
> 
> Which version of gluster first required a regular filesystem backing for all
> gluster files?  Does qemu support older versions (in which case, what is the
> correct version-probing invocation to return 0 prior to that point, and 1
> after), or do all versions supported by qemu already guarantee zero
> initialization on creation or widening truncation by virtue of POSIX file
> semantics (in which case, patch 7 should instead switch to using
> .bdrv_has_zero_init_1 for both functions)?  Per configure, we probe for
> glusterfs_xlator_opt from gluster 4, which implies the code still tries to
> be portable to even older gluster, but I'm not sure if this squares with
> qemu-doc.texi which mentions our minimum distro policy (for example, now
> that qemu requires python 3 consistent with our distro policy, that rules
> out several older systems where older gluster was likely to be present).

The block device feature (storage/bd xlator) got deprecated in Gluster
5.0, and was removed with Gluster 6.0. Fedora 29 is the last version
that contained the bd.so xlator (glusterfs-server 5.0, deprecated).

All currently maintained and available Gluster releases should have
glusterfs_xlator_opt (introduced with glusterfs-3.5 in 2014). However, I
am not sure what versions are provided with different distributions. The
expectation is that at least Gluster 5 is provided, as older releases
will not get any updates anymore. See
https://www.gluster.org/release-schedule/ for a more detailed timeline.

Unfortunately there is no reasonable way to probe for the type of
backend (block or filesystem) that is used. So a runtime check that
errs on the extremely safe side and falls back to block-device
behaviour is not an option.

HTH,
Niels




Re: [GEDI] [PATCH 12/17] gluster: Support BDRV_ZERO_OPEN

2020-02-17 Thread Niels de Vos
On Fri, Jan 31, 2020 at 11:44:31AM -0600, Eric Blake wrote:
> Since gluster already copies file-posix for lseek usage in block
> status, it also makes sense to copy it for learning if the image
> currently reads as all zeroes.
> 
> Signed-off-by: Eric Blake 
> ---
>  block/gluster.c | 20 
>  1 file changed, 20 insertions(+)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 9d952c70981b..0417a86547c8 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -1464,6 +1464,22 @@ exit:
>  return -ENOTSUP;
>  }
> 
> +static int qemu_gluster_known_zeroes(BlockDriverState *bs)
> +{
> +/*
> + * GlusterFS volume could be backed by a block device, with no way

Actually, Gluster dropped support for volumes backed by block devices
(LVM) a few releases back. Nobody could be found that used it, and it
could not be combined with other Gluster features. All contents on a
Gluster volume is now always backed by 'normal' files on a filesystem.

Creation or truncation should behave just as on a file on a local
filesystem. So maybe qemu_gluster_known_zeroes is not needed at all?

Niels


> + * to query if regions added by creation or truncation will read
> + * as zeroes.  However, we can use lseek(SEEK_DATA) to check if
> + * contents currently read as zero.
> + */
> +off_t data, hole;
> +
> +if (find_allocation(bs, 0, &data, &hole) == -ENXIO) {
> +return BDRV_ZERO_OPEN;
> +}
> +return 0;
> +}
> +
>  /*
>   * Returns the allocation status of the specified offset.
>   *
> @@ -1561,6 +1577,7 @@ static BlockDriver bdrv_gluster = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1591,6 +1608,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1621,6 +1639,7 @@ static BlockDriver bdrv_gluster_unix = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1657,6 +1676,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_known_zeroes= qemu_gluster_known_zeroes,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> -- 
> 2.24.1
> 
> ___
> integration mailing list
> integrat...@gluster.org
> https://lists.gluster.org/mailman/listinfo/integration
> 




Re: [GEDI] [PATCH 07/17] gluster: Drop useless has_zero_init callback

2020-02-17 Thread Niels de Vos
On Fri, Jan 31, 2020 at 11:44:26AM -0600, Eric Blake wrote:
> block.c already defaults to 0 if we don't provide a callback; there's
> no need to write a callback that always fails.
> 
> Signed-off-by: Eric Blake 

Reviewed-by: Niels de Vos 

> ---
>  block/gluster.c | 14 --
>  1 file changed, 14 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 4fa4a77a4777..9d952c70981b 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -1357,12 +1357,6 @@ static int64_t qemu_gluster_allocated_file_size(BlockDriverState *bs)
>  }
>  }
> 
> -static int qemu_gluster_has_zero_init(BlockDriverState *bs)
> -{
> -/* GlusterFS volume could be backed by a block device */
> -return 0;
> -}
> -
>  /*
>   * Find allocation range in @bs around offset @start.
>   * May change underlying file descriptor's file offset.
> @@ -1567,8 +1561,6 @@ static BlockDriver bdrv_gluster = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1599,8 +1591,6 @@ static BlockDriver bdrv_gluster_tcp = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1631,8 +1621,6 @@ static BlockDriver bdrv_gluster_unix = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> @@ -1669,8 +1657,6 @@ static BlockDriver bdrv_gluster_rdma = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> -.bdrv_has_zero_init   = qemu_gluster_has_zero_init,
> -.bdrv_has_zero_init_truncate  = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_pdiscard = qemu_gluster_co_pdiscard,
>  #endif
> -- 
> 2.24.1
> 
> 




CfP: Software Defined Storage devroom at FOSDEM 2020

2019-10-11 Thread Niels de Vos
FOSDEM is a free software event that offers open source communities a
place to meet, share ideas and collaborate.  It is renowned for being
highly developer-oriented and brings together 8000+ participants from
all over the world.  It is held in the city of Brussels (Belgium).

FOSDEM 2020 will take place during the weekend of February 1st-2nd 2020.
More details about the event can be found at http://fosdem.org/

** Call For Participation

The Software Defined Storage devroom will go into its fourth round for
talks around Open Source Software Defined Storage projects, management
tools and real world deployments.

Presentation topics could include but are not limited to:

- Your work on a SDS project like Ceph, Gluster, OpenEBS or LizardFS
- Your work on or with SDS related projects like SWIFT or Container
  Storage Interface
- Management tools for SDS deployments
- Monitoring tools for SDS clusters

** Important dates:

- Nov 24th 2019:  submission deadline for talk proposals
- Dec 15th 2019:  announcement of the final schedule
- Feb  2nd 2020:  Software Defined Storage dev room

Talk proposals will be reviewed by a steering committee:
- Niels de Vos (OpenShift Container Storage Developer - Red Hat)
- Jan Fajerski (Ceph Developer - SUSE)
- Kai Wagner (SUSE)
- Mike Perez (Ceph Community Manager, Red Hat)

Use the FOSDEM 'pentabarf' tool to submit your proposal:
https://penta.fosdem.org/submission/FOSDEM20

- If necessary, create a Pentabarf account and activate it.  Please
  reuse your account from previous years if you have already created it.
  https://penta.fosdem.org/user/new_account/FOSDEM20

- In the "Person" section, provide First name, Last name (in the
  "General" tab), Email (in the "Contact" tab) and Bio ("Abstract" field
  in the "Description" tab).

- Submit a proposal by clicking on "Create event".

- Important! Select the "Software Defined Storage devroom" track (on the
  "General" tab).

- Provide the title of your talk ("Event title" in the "General" tab).

- Provide a description of the subject of the talk and the intended
  audience (in the "Abstract" field of the "Description" tab)

- Provide a rough outline of the talk or goals of the session (a short
  list of bullet points covering topics that will be discussed) in the
  "Full description" field in the "Description" tab

- Provide an expected length of your talk in the "Duration" field.
  Please include at least 5 minutes of discussion in your proposal,
  plus allow 5 minutes for the handover to the next presenter.
  Suggested talk lengths would be 20+5+5 and 45+10+5 minutes. Note that
  short talks are preferred so that more topics can be presented
  during the day.

** Recording of talks

The FOSDEM organizers plan to have live streaming and recording fully
working, both for remote/later viewing of talks, and so that people can
watch streams in the hallways when rooms are full. This requires
speakers to consent to being recorded and streamed. If you plan to be a
speaker, please understand that by doing so you implicitly give consent
for your talk to be recorded and streamed. The recordings will be
published under the same license as all FOSDEM content (CC-BY).

Hope to hear from you soon! And please forward this announcement.

If you have any further questions, please write to the mailing list at
storage-devr...@lists.fosdem.org and we will try to answer as soon as
possible.

Thanks!




Re: [Qemu-block] [PATCH] block: gluster: Probe alignment limits

2019-08-22 Thread Niels de Vos
On Wed, Aug 21, 2019 at 07:04:17PM +0200, Max Reitz wrote:
> On 17.08.19 23:21, Nir Soffer wrote:
> > Implement alignment probing similar to file-posix, by reading from the
> > first 4k of the image.
> > 
> > Before this change, provisioning a VM on storage with sector size of
> > 4096 bytes would fail when the installer try to create filesystems. Here
> > is an example command that reproduces this issue:
> > 
> > $ qemu-system-x86_64 -accel kvm -m 2048 -smp 2 \
> > -drive file=gluster://gluster1/gv0/fedora29.raw,format=raw,cache=none \
> > -cdrom Fedora-Server-dvd-x86_64-29-1.2.iso
> > 
> > The installer fails in a few seconds when trying to create a filesystem on
> > /dev/mapper/fedora-root. In the error report we can see that it failed with
> > EINVAL (I could not extract the error from the guest).
> > 
> > Copying disk fails with EINVAL:
> > 
> > $ qemu-img convert -p -f raw -O raw -t none -T none \
> > gluster://gluster1/gv0/fedora29.raw \
> > gluster://gluster1/gv0/fedora29-clone.raw
> > qemu-img: error while writing sector 4190208: Invalid argument
> > 
> > This is a fix to same issue fixed in commit a6b257a08e3d (file-posix:
> > Handle undetectable alignment) for gluster:// images.
> > 
> > This fix has the same limit, that the first block of the image should be
> > allocated, otherwise we cannot detect the alignment and fallback to a
> > safe value (4096) even when using storage with sector size of 512 bytes.
> > 
> > Signed-off-by: Nir Soffer 
> > ---
> >  block/gluster.c | 47 +++
> >  1 file changed, 47 insertions(+)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index f64dc5b01e..d936240b72 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -52,6 +52,9 @@
> >  
> >  #define GERR_INDEX_HINT "hint: check in 'server' array index '%d'\n"
> >  
> > +/* The value is known only on the server side. */
> > +#define MAX_ALIGN 4096
> > +
> >  typedef struct GlusterAIOCB {
> >  int64_t size;
> >  int ret;
> > @@ -902,8 +905,52 @@ out:
> >  return ret;
> >  }
> >  
> > +/*
> > + * Check if read is allowed with given memory buffer and length.
> > + *
> > + * This function is used to check O_DIRECT request alignment.
> > + */
> > +static bool gluster_is_io_aligned(struct glfs_fd *fd, void *buf, size_t len)
> > +{
> > +ssize_t ret = glfs_pread(fd, buf, len, 0, 0, NULL);
> > +return ret >= 0 || errno != EINVAL;
> 
> Is glfs_pread() guaranteed to return EINVAL on invalid alignment?
> file-posix says this is only the case on Linux (for normal files).  Now
> I also don’t know whether the gluster driver works on anything but Linux
> anyway.

The behaviour depends on the filesystem used by the Gluster backend. XFS
is the recommendation, but in the end it is up to the users. The Gluster
server is known to work on Linux, NetBSD and FreeBSD, the vast majority
of users runs it on Linux.

I do not think there is a strong guarantee EINVAL is always correct. How
about only checking 'ret > 0'?

> 
> > +}
> > +
> > +static void gluster_probe_alignment(BlockDriverState *bs, struct glfs_fd *fd,
> > +Error **errp)
> > +{
> > +char *buf;
> > +size_t alignments[] = {1, 512, 1024, 2048, 4096};
> > +size_t align;
> > +int i;
> > +
> > +buf = qemu_memalign(MAX_ALIGN, MAX_ALIGN);
> > +
> > +for (i = 0; i < ARRAY_SIZE(alignments); i++) {
> > +align = alignments[i];
> > +if (gluster_is_io_aligned(fd, buf, align)) {
> > +/* Fallback to safe value. */
> > +bs->bl.request_alignment = (align != 1) ? align : MAX_ALIGN;
> > +break;
> > +}
> > +}
> 
> I don’t like the fact that the last element of alignments[] should be
> the same as MAX_ALIGN, without that ever having been made explicit anywhere.
> 
> It’s a bit worse in the file-posix patch, because if getpagesize() is
> greater than 4k, max_align will be greater than 4k.  But MAX_BLOCKSIZE
> is 4k, too, so I suppose we wouldn’t support any block size beyond that
> anyway, which makes the error message appropriate still.
> 
> > +
> > +qemu_vfree(buf);
> > +
> > +if (!bs->bl.request_alignment) {
> > +error_setg(errp, "Could not find working O_DIRECT alignment");
> > +error_append_hint(errp, "Try cache.direct=off\n");
> > +}
> > +}
> > +
> >  static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> >  {
> > +BDRVGlusterState *s = bs->opaque;
> > +
> > +gluster_probe_alignment(bs, s->fd, errp);
> > +
> > +bs->bl.min_mem_alignment = bs->bl.request_alignment;
> 
> Well, I’ll just trust you that there is no weird system where the memory
> alignment is greater than the request alignment.
> 
> > +bs->bl.opt_mem_alignment = MAX(bs->bl.request_alignment, MAX_ALIGN);
> 
> Isn’t request_alignment guaranteed to not exceed MAX_ALIGN, i.e. isn’t
> this always MAX_ALIGN?
> 
> >  

Re: [Qemu-block] [PATCH 1/6] configure: Only generate GLUSTERFS variables if glusterfs is usable

2019-06-14 Thread Niels de Vos
On Fri, Jun 14, 2019 at 09:24:27AM +0200, Philippe Mathieu-Daudé wrote:
> It is pointless and confusing to have GLUSTERFS variables
> in config-host.mak when glusterfs is not usable.
> 
> Signed-off-by: Philippe Mathieu-Daudé 

Looks good to me, thanks.

Reviewed-by: Niels de Vos 

> ---
>  configure | 36 ++--
>  1 file changed, 18 insertions(+), 18 deletions(-)
> 
> diff --git a/configure b/configure
> index b091b82cb3..13fd4a1166 100755
> --- a/configure
> +++ b/configure
> @@ -7118,30 +7118,30 @@ if test "$glusterfs" = "yes" ; then
>echo "CONFIG_GLUSTERFS=m" >> $config_host_mak
>echo "GLUSTERFS_CFLAGS=$glusterfs_cflags" >> $config_host_mak
>echo "GLUSTERFS_LIBS=$glusterfs_libs" >> $config_host_mak
> -fi
>  
> -if test "$glusterfs_xlator_opt" = "yes" ; then
> -  echo "CONFIG_GLUSTERFS_XLATOR_OPT=y" >> $config_host_mak
> -fi
> +  if test "$glusterfs_xlator_opt" = "yes" ; then
> +echo "CONFIG_GLUSTERFS_XLATOR_OPT=y" >> $config_host_mak
> +  fi
>  
> -if test "$glusterfs_discard" = "yes" ; then
> -  echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
> -fi
> +  if test "$glusterfs_discard" = "yes" ; then
> +echo "CONFIG_GLUSTERFS_DISCARD=y" >> $config_host_mak
> +  fi
>  
> -if test "$glusterfs_fallocate" = "yes" ; then
> -  echo "CONFIG_GLUSTERFS_FALLOCATE=y" >> $config_host_mak
> -fi
> +  if test "$glusterfs_fallocate" = "yes" ; then
> +echo "CONFIG_GLUSTERFS_FALLOCATE=y" >> $config_host_mak
> +  fi
>  
> -if test "$glusterfs_zerofill" = "yes" ; then
> -  echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
> -fi
> +  if test "$glusterfs_zerofill" = "yes" ; then
> +echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
> +  fi
>  
> -if test "$glusterfs_ftruncate_has_stat" = "yes" ; then
> -  echo "CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT=y" >> $config_host_mak
> -fi
> +  if test "$glusterfs_ftruncate_has_stat" = "yes" ; then
> +echo "CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT=y" >> $config_host_mak
> +  fi
>  
> -if test "$glusterfs_iocb_has_stat" = "yes" ; then
> -  echo "CONFIG_GLUSTERFS_IOCB_HAS_STAT=y" >> $config_host_mak
> +  if test "$glusterfs_iocb_has_stat" = "yes" ; then
> +echo "CONFIG_GLUSTERFS_IOCB_HAS_STAT=y" >> $config_host_mak
> +  fi
>  fi
>  
>  if test "$libssh2" = "yes" ; then
> -- 
> 2.20.1
> 



Re: [Qemu-block] [QEMU PATCH] MAINTAINERS: Downgrade status of block sections without "M:" to "Odd Fixes"

2019-05-06 Thread Niels de Vos
On Mon, May 06, 2019 at 08:18:54AM +0200, Thomas Huth wrote:
> Fixes might still get picked up via the qemu-block mailing list,
> so the status is not "Orphan" yet.
> Also add the gluster mailing list as suggested by Niels here:
> 
>  https://patchwork.kernel.org/patch/10613297/#22409943
> 
> Signed-off-by: Thomas Huth 

Thanks, this counts for the Gluster part:
Reviewed-by: Niels de Vos 


> ---
>  MAINTAINERS | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 66ddbda9c9..899a4cd572 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2404,12 +2404,13 @@ F: block/ssh.c
>  
>  CURL
>  L: qemu-block@nongnu.org
> -S: Supported
> +S: Odd Fixes
>  F: block/curl.c
>  
>  GLUSTER
>  L: qemu-block@nongnu.org
> -S: Supported
> +L: integrat...@gluster.org
> +S: Odd Fixes
>  F: block/gluster.c
>  
>  Null Block Driver
> -- 
> 2.21.0
> 



Re: [Qemu-block] [PATCH RFC] block/gluster: limit the transfer size to 512 MiB

2019-03-28 Thread Niels de Vos
On Thu, Mar 28, 2019 at 11:52:27AM +0100, Stefano Garzarella wrote:
> Several versions of GlusterFS (3.12? -> 6.0.1) fail when the
> transfer size is greater or equal to 1024 MiB, so we are
> limiting the transfer size to 512 MiB to avoid this rare issue.
> 
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1691320
> Signed-off-by: Stefano Garzarella 
> ---
> 
> RFC:
> Should I add a parameter to allow the user to modify the max_transfer
> variable?

No, that is not needed IMHO. In most environments where large files are
used, sharding is enabled on the Gluster volumes. For VM images shards
of 256MB are recommended (I think). When sharding is enabled, the
writes will be limited to that size as well, hence the problem is not
noticed on (most) production deployments.

Reviewed-by: Niels de Vos 

> 
> 
>  block/gluster.c | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index af64330211..e1e4eaa525 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -9,6 +9,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/units.h"
>  #include 
>  #include "block/block_int.h"
>  #include "block/qdict.h"
> @@ -37,6 +38,12 @@
>  #define GLUSTER_DEBUG_MAX   9
>  #define GLUSTER_OPT_LOGFILE "logfile"
>  #define GLUSTER_LOGFILE_DEFAULT "-" /* handled in libgfapi as /dev/stderr */
> +/*
> + * Several versions of GlusterFS (3.12? -> 6.0.1) fail when the transfer size
> + * is greater or equal to 1024 MiB, so we are limiting the transfer size to 512
> + * MiB to avoid this rare issue.
> + */
> +#define GLUSTER_MAX_TRANSFER(512 * MiB)
>  
>  #define GERR_INDEX_HINT "hint: check in 'server' array index '%d'\n"
>  
> @@ -879,6 +886,11 @@ out:
>  return ret;
>  }
>  
> +static void qemu_gluster_refresh_limits(BlockDriverState *bs, Error **errp)
> +{
> +bs->bl.max_transfer = GLUSTER_MAX_TRANSFER;
> +}
> +
>  static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
> BlockReopenQueue *queue, Error **errp)
>  {
> @@ -1536,6 +1548,7 @@ static BlockDriver bdrv_gluster = {
>  .bdrv_co_pwrite_zeroes= qemu_gluster_co_pwrite_zeroes,
>  #endif
>  .bdrv_co_block_status = qemu_gluster_co_block_status,
> +.bdrv_refresh_limits  = qemu_gluster_refresh_limits,
>  .create_opts  = &qemu_gluster_create_opts,
>  .strong_runtime_opts  = gluster_strong_open_opts,
>  };
> @@ -1566,6 +1579,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  .bdrv_co_pwrite_zeroes= qemu_gluster_co_pwrite_zeroes,
>  #endif
>  .bdrv_co_block_status = qemu_gluster_co_block_status,
> +.bdrv_refresh_limits  = qemu_gluster_refresh_limits,
>  .create_opts  = &qemu_gluster_create_opts,
>  .strong_runtime_opts  = gluster_strong_open_opts,
>  };
> @@ -1596,6 +1610,7 @@ static BlockDriver bdrv_gluster_unix = {
>  .bdrv_co_pwrite_zeroes= qemu_gluster_co_pwrite_zeroes,
>  #endif
>  .bdrv_co_block_status = qemu_gluster_co_block_status,
> +.bdrv_refresh_limits  = qemu_gluster_refresh_limits,
>  .create_opts  = &qemu_gluster_create_opts,
>  .strong_runtime_opts  = gluster_strong_open_opts,
>  };
> @@ -1632,6 +1647,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  .bdrv_co_pwrite_zeroes= qemu_gluster_co_pwrite_zeroes,
>  #endif
>  .bdrv_co_block_status = qemu_gluster_co_block_status,
> +.bdrv_refresh_limits  = qemu_gluster_refresh_limits,
>  .create_opts  = &qemu_gluster_create_opts,
>  .strong_runtime_opts  = gluster_strong_open_opts,
>  };
> -- 
> 2.20.1
> 
> 



Re: [Qemu-block] [PATCH v2 0/2] block: Gluster 6 compatibility

2019-03-12 Thread Niels de Vos
On Mon, Mar 11, 2019 at 12:10:06PM +0100, Kevin Wolf wrote:
> On 09.03.2019 at 10:40, Niels de Vos wrote:
> > On Fri, Mar 08, 2019 at 02:11:51PM +0100, Kevin Wolf wrote:
> > > On 05.03.2019 at 16:46, Niels de Vos wrote:
> > > > Gluster 6 is currently available as release candidate. There have been a
> > > > few changes to libgfapi.so that need to be adapted by consuming projects
> > > > like QEMU. Fedora Rawhide already contains glusterfs-6.0-RC0, and this
> > > > prevents rebuilds of QEMU there (https://bugzilla.redhat.com/1684298).
> > > > 
> > > > The following two changes should be sufficient to consume Gluster 6 once
> > > > it is released. These have been tested on CentOS-7 with Gluster 5 and
> > > > Gluster 6 (minimal manual qemu-img tests only).
> > > > 
> > > > This v2 post contains changes suggested by Daniel P. Berrangé and Kevin
> > > > Wolf. Thanks!
> > > 
> > > Thanks, applied to the block branch.
> > 
> > Thanks! Stefano Garzarella gave a suggestion for further cleanup. I was
> > planning to address that (no #ifdef for function arguments) next week
> > when I'm back from a trip. Is that something you would also like to see,
> > or do you prefer the change to stay minimal/small as it is now? I'm
> > happy to send a followup if you agree that it is cleaner.
> 
> I don't mind either way.
> 
> I'm going to send a pull request tomorrow for soft freeze, but if you
> tell me that I should wait with this one, I can remove it from my queue
> for now. It's a bug fix, so we can still apply an updated version during
> the freeze.
> 
> A follow-up works for me, too, so whatever you prefer.

In that case, I prefer to keep the current patches as they are. If
further changes make the code much better readable/maintainable I would
have provided a new version. But at the moment, and after a little more
consideration, I do not think there is much benefit in the cleanup
Stefano suggested.

Thanks,
Niels



Re: [Qemu-block] [PATCH v2 0/2] block: Gluster 6 compatibility

2019-03-09 Thread Niels de Vos
On Fri, Mar 08, 2019 at 02:11:51PM +0100, Kevin Wolf wrote:
> On 05.03.2019 at 16:46, Niels de Vos wrote:
> > Gluster 6 is currently available as release candidate. There have been a
> > few changes to libgfapi.so that need to be adapted by consuming projects
> > like QEMU. Fedora Rawhide already contains glusterfs-6.0-RC0, and this
> > prevents rebuilds of QEMU there (https://bugzilla.redhat.com/1684298).
> > 
> > The following two changes should be sufficient to consume Gluster 6 once
> > it is released. These have been tested on CentOS-7 with Gluster 5 and
> > Gluster 6 (minimal manual qemu-img tests only).
> > 
> > This v2 post contains changes suggested by Daniel P. Berrangé and Kevin
> > Wolf. Thanks!
> 
> Thanks, applied to the block branch.

Thanks! Stefano Garzarella gave a suggestion for further cleanup. I was
planning to address that (no #ifdef for function arguments) next week
when I'm back from a trip. Is that something you would also like to see,
or do you prefer the change to stay minimal/small as it is now? I'm
happy to send a followup if you agree that it is cleaner.

Niels



[Qemu-block] [PATCH v5 1/2] block/gluster: Handle changed glfs_ftruncate signature

2019-03-05 Thread Niels de Vos
From: Prasanna Kumar Kalever 

New versions of Gluster's libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adapted.

Signed-off-by: Prasanna Kumar Kalever 
Signed-off-by: Niels de Vos 

--
v5: pass default NULL arguments through macro (Daniel P. Berrangé)
v4: rebase to current master branch
v3: define old backwards compatible glfs_ftruncate() macro, from Eric Blake
v2: do a compile check as suggested by Eric Blake
---
 block/gluster.c |  4 ++++
 configure       | 18 ++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index af64330211..f853aa87f4 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -20,6 +20,10 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 
+#ifdef CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT
+# define glfs_ftruncate(fd, offset) glfs_ftruncate(fd, offset, NULL, NULL)
+#endif
+
 #define GLUSTER_OPT_FILENAME"filename"
 #define GLUSTER_OPT_VOLUME  "volume"
 #define GLUSTER_OPT_PATH"path"
diff --git a/configure b/configure
index cefeb8fcce..bbfe587b88 100755
--- a/configure
+++ b/configure
@@ -456,6 +456,7 @@ glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
+glusterfs_ftruncate_has_stat="no"
 gtk=""
 gtk_gl="no"
 tls_priority="NORMAL"
@@ -4083,6 +4084,19 @@ if test "$glusterfs" != "no" ; then
   glusterfs_fallocate="yes"
   glusterfs_zerofill="yes"
 fi
+cat > $TMPC << EOF
+#include <glusterfs/api/glfs.h>
+
+int
+main(void)
+{
+   /* new glfs_ftruncate() passes two additional args */
+   return glfs_ftruncate(NULL, 0, NULL, NULL);
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_ftruncate_has_stat="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6885,6 +6899,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
   echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_ftruncate_has_stat" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
-- 
2.20.1




[Qemu-block] [PATCH v2 0/2] block: Gluster 6 compatibility

2019-03-05 Thread Niels de Vos
Gluster 6 is currently available as release candidate. There have been a
few changes to libgfapi.so that need to be adapted by consuming projects
like QEMU. Fedora Rawhide already contains glusterfs-6.0-RC0, and this
prevents rebuilds of QEMU there (https://bugzilla.redhat.com/1684298).

The following two changes should be sufficient to consume Gluster 6 once
it is released. These have been tested on CentOS-7 with Gluster 5 and
Gluster 6 (minimal manual qemu-img tests only).

This v2 post contains changes suggested by Daniel P. Berrangé and Kevin
Wolf. Thanks!

Cheers,
Niels


Niels de Vos (2):
  block/gluster: Handle changed glfs_ftruncate signature
  gluster: the glfs_io_cbk callback function pointer adds pre/post stat
args

 block/gluster.c | 17 ++++++++++++++---
 configure       | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+), 3 deletions(-)

-- 
2.20.1




[Qemu-block] [PATCH v2 2/2] gluster: the glfs_io_cbk callback function pointer adds pre/post stat args

2019-03-05 Thread Niels de Vos
The glfs_*_async() functions do a callback once finished. This callback
has changed its arguments: pre- and post-stat structures have been
added. This makes it possible to improve caching, which is useful for
Samba and NFS-Ganesha, but not so much for QEMU. Gluster 6 is the first
release that includes these new arguments.

With additional detection in ./configure, the new arguments can be
included conditionally in the glfs_io_cbk handler.

Signed-off-by: Niels de Vos 

--
v2: correct typo in commit message (Kevin Wolf)
---
 block/gluster.c |  6 +++++-
 configure       | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index f853aa87f4..51f184cbd8 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -729,7 +729,11 @@ static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
 /*
  * AIO callback routine called from GlusterFS thread.
  */
-static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
+static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
+#ifdef CONFIG_GLUSTERFS_IOCB_HAS_STAT
+ struct glfs_stat *pre, struct glfs_stat *post,
+#endif
+ void *arg)
 {
 GlusterAIOCB *acb = (GlusterAIOCB *)arg;
 
diff --git a/configure b/configure
index bbfe587b88..db548d9b08 100755
--- a/configure
+++ b/configure
@@ -457,6 +457,7 @@ glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
 glusterfs_ftruncate_has_stat="no"
+glusterfs_iocb_has_stat="no"
 gtk=""
 gtk_gl="no"
 tls_priority="NORMAL"
@@ -4097,6 +4098,25 @@ EOF
 if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
   glusterfs_ftruncate_has_stat="yes"
 fi
+cat > $TMPC << EOF
+#include <glusterfs/api/glfs.h>
+
+/* new glfs_io_cbk() passes two additional glfs_stat structs */
+static void
+glusterfs_iocb(glfs_fd_t *fd, ssize_t ret, struct glfs_stat *prestat, struct glfs_stat *poststat, void *data)
+{}
+
+int
+main(void)
+{
+   glfs_io_cbk iocb = &glusterfs_iocb;
+   iocb(NULL, 0, NULL, NULL, NULL);
+   return 0;
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_iocb_has_stat="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6903,6 +6923,10 @@ if test "$glusterfs_ftruncate_has_stat" = "yes" ; then
   echo "CONFIG_GLUSTERFS_FTRUNCATE_HAS_STAT=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_iocb_has_stat" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_IOCB_HAS_STAT=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
-- 
2.20.1




Re: [Qemu-block] [PATCH 2/2] gluster: the glfs_io_cbk callback function pointer adds pre/post stat args

2019-03-05 Thread Niels de Vos
On Tue, Mar 05, 2019 at 11:29:34AM +0100, Kevin Wolf wrote:
> On 04.03.2019 at 17:21, Niels de Vos wrote:
> > The glfs_*_async() functions do a callback once finished. This callback
> > has changed its arguments, pre- and post-stat structures have been
> > added. This makes it possible to improve cashing, which is useful for
> 
> Did you mean "caching"?

Yuck, yes, of course. I'll correct that when I send an updated version
for patch 1/2.

Niels



Re: [Qemu-block] [Qemu-devel] [PATCH 1/2] block/gluster: Handle changed glfs_ftruncate signature

2019-03-05 Thread Niels de Vos
On Mon, Mar 04, 2019 at 04:41:44PM +, Daniel P. Berrangé wrote:
> On Mon, Mar 04, 2019 at 05:21:02PM +0100, Niels de Vos wrote:
> > From: Prasanna Kumar Kalever 
> > 
> > New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
> > function that returns additional 'struct stat' structures to enable
> > advanced caching of attributes. This is useful for file servers, not so
> > much for QEMU. Nevertheless, the API has changed and needs to be
> > adopted.
> > 
> > Signed-off-by: Prasanna Kumar Kalever 
> > Signed-off-by: Niels de Vos 
> > 
> > ---
> > v4: rebase to current master branch
> > v3: define old backwards compatible glfs_ftruncate() macro, from Eric Blake
> > v2: do a compile check as suggested by Eric Blake
> > ---
> >  block/gluster.c | 11 +--
> >  configure   | 18 ++
> >  2 files changed, 27 insertions(+), 2 deletions(-)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index af64330211..86e5278524 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -20,6 +20,10 @@
> >  #include "qemu/option.h"
> >  #include "qemu/cutils.h"
> >  
> > +#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
> > +# define glfs_ftruncate(fd, offset, _u1, _u2) glfs_ftruncate(fd, offset)
> > +#endif
> 
> I don't much like this approach as it sets up a trapdoor. If future
> QEMU passes a non-NULL value for the 3rd/4th argument it will be
> silently ignored depending on glfs version built against which can
> result in incorrect behaviour at runtime. If we reverse it:
> 
>  #ifndef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
>  # define glfs_ftruncate(fd, offset) glfs_ftruncate(fd, offset, NULL, NULL)
>  #endif
> 
> then it ensures we can't silently do the wrong thing in future. Anyone
> who wants to use the 3rd/4th args will have to make an explicit effort
> to ensure it works correctly with old glfs APIs. An added benefit is
> that it avoids the following patch chunks.

Thanks for the suggestion. All makes sense and I'll update the patch
accordingly.

Niels



[Qemu-block] [PATCH 0/2] block: Gluster 6 compatibility

2019-03-04 Thread Niels de Vos
Gluster 6 is currently available as release candidate. There have been a
few changes to libgfapi.so that need to be adapted by consuming projects
like QEMU. Fedora Rawhide already contains glusterfs-6.0-RC0, and this
prevents rebuilds of QEMU there (https://bugzilla.redhat.com/1684298).

The following two changes should be sufficient to consume Gluster 6 once
it is released. These have been tested on CentOS-7 with Gluster 5 and
Gluster 6 (minimal manual qemu-img tests only).

Cheers,
Niels


Niels de Vos (2):
  block/gluster: Handle changed glfs_ftruncate signature
  gluster: the glfs_io_cbk callback function pointer adds pre/post stat
args

 block/gluster.c | 17 ++++++++++++++---
 configure       | 42 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+), 3 deletions(-)

-- 
2.20.1




[Qemu-block] [PATCH 2/2] gluster: the glfs_io_cbk callback function pointer adds pre/post stat args

2019-03-04 Thread Niels de Vos
The glfs_*_async() functions do a callback once finished. This callback
has changed its arguments, pre- and post-stat structures have been
added. This makes it possible to improve cashing, which is useful for
Samba and NFS-Ganesha, but not so much for QEMU. Gluster 6 is the first
release that includes these new arguments.

With an additional detection in ./configure, the new arguments can
conditionally get included in the glfs_io_cbk handler.

Signed-off-by: Niels de Vos 
---
 block/gluster.c |  6 +++++-
 configure       | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/block/gluster.c b/block/gluster.c
index 86e5278524..7483c3b2aa 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -729,7 +729,11 @@ static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
 /*
  * AIO callback routine called from GlusterFS thread.
  */
-static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
+static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret,
+#ifdef CONFIG_GLUSTERFS_IOCB_HAS_STAT
+ struct glfs_stat *pre, struct glfs_stat *post,
+#endif
+ void *arg)
 {
 GlusterAIOCB *acb = (GlusterAIOCB *)arg;
 
diff --git a/configure b/configure
index 1d09bef1f9..d8ae8c2f50 100755
--- a/configure
+++ b/configure
@@ -457,6 +457,7 @@ glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
 glusterfs_legacy_ftruncate="no"
+glusterfs_iocb_has_stat="no"
 gtk=""
 gtk_gl="no"
 tls_priority="NORMAL"
@@ -4071,6 +4072,25 @@ EOF
 if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
   glusterfs_legacy_ftruncate="yes"
 fi
+cat > $TMPC << EOF
+#include <glusterfs/api/glfs.h>
+
+/* new glfs_io_cbk() passes two additional glfs_stat structs */
+static void
+glusterfs_iocb(glfs_fd_t *fd, ssize_t ret, struct glfs_stat *prestat, struct glfs_stat *poststat, void *data)
+{}
+
+int
+main(void)
+{
+   glfs_io_cbk iocb = &glusterfs_iocb;
+   iocb(NULL, 0, NULL, NULL, NULL);
+   return 0;
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_iocb_has_stat="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6871,6 +6891,10 @@ if test "$glusterfs_legacy_ftruncate" = "yes" ; then
   echo "CONFIG_GLUSTERFS_LEGACY_FTRUNCATE=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_iocb_has_stat" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_IOCB_HAS_STAT=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
-- 
2.20.1




[Qemu-block] [PATCH 1/2] block/gluster: Handle changed glfs_ftruncate signature

2019-03-04 Thread Niels de Vos
From: Prasanna Kumar Kalever 

New versions of Gluster's libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adapted.

Signed-off-by: Prasanna Kumar Kalever 
Signed-off-by: Niels de Vos 

---
v4: rebase to current master branch
v3: define old backwards compatible glfs_ftruncate() macro, from Eric Blake
v2: do a compile check as suggested by Eric Blake
---
 block/gluster.c | 11 +++++++++--
 configure       | 18 ++++++++++++++++++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index af64330211..86e5278524 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -20,6 +20,10 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 
+#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
+# define glfs_ftruncate(fd, offset, _u1, _u2) glfs_ftruncate(fd, offset)
+#endif
+
 #define GLUSTER_OPT_FILENAME"filename"
 #define GLUSTER_OPT_VOLUME  "volume"
 #define GLUSTER_OPT_PATH"path"
@@ -1005,6 +1009,7 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, int64_t offset,
 PreallocMode prealloc, Error **errp)
 {
 int64_t current_length;
+int ret;
 
 current_length = glfs_lseek(fd, 0, SEEK_END);
 if (current_length < 0) {
@@ -1032,7 +1037,8 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, int64_t offset,
 #endif /* CONFIG_GLUSTERFS_FALLOCATE */
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
 case PREALLOC_MODE_FULL:
-if (glfs_ftruncate(fd, offset)) {
+ret = glfs_ftruncate(fd, offset, NULL, NULL);
+if (ret) {
 error_setg_errno(errp, errno, "Could not resize file");
 return -errno;
 }
@@ -1043,7 +1049,8 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, int64_t offset,
 break;
 #endif /* CONFIG_GLUSTERFS_ZEROFILL */
 case PREALLOC_MODE_OFF:
-if (glfs_ftruncate(fd, offset)) {
+ret = glfs_ftruncate(fd, offset, NULL, NULL);
+if (ret) {
 error_setg_errno(errp, errno, "Could not resize file");
 return -errno;
 }
diff --git a/configure b/configure
index 540bee19ba..1d09bef1f9 100755
--- a/configure
+++ b/configure
@@ -456,6 +456,7 @@ glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
+glusterfs_legacy_ftruncate="no"
 gtk=""
 gtk_gl="no"
 tls_priority="NORMAL"
@@ -4057,6 +4058,19 @@ if test "$glusterfs" != "no" ; then
   glusterfs_fallocate="yes"
   glusterfs_zerofill="yes"
 fi
+cat > $TMPC << EOF
+#include <glusterfs/api/glfs.h>
+
+int
+main(void)
+{
+   /* new glfs_ftruncate() passes two additional args */
+   return glfs_ftruncate(NULL, 0 /*, NULL, NULL */);
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_legacy_ftruncate="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6853,6 +6867,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
   echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_legacy_ftruncate" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_LEGACY_FTRUNCATE=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
-- 
2.20.1




Re: [Qemu-block] [Qemu-devel] [PATCH 2/2] MAINTAINERS: Remove myself as block maintainer

2019-01-09 Thread Niels de Vos
On Tue, Jan 08, 2019 at 08:12:16PM +0100, Kevin Wolf wrote:
> On 08.01.2019 at 18:18, Markus Armbruster wrote:
> > This patch series got stuck.
> > 
> > Markus Armbruster  writes:
> > 
> > > Fam Zheng  writes:
> > >
> > >> On Tue, 09/25 07:00, Markus Armbruster wrote:
> > >>> Jeff Cody  writes:
> > >>> 
> > >>> > I'll not be involved in day-to-day qemu development.  Remove
> > >>> > myself as maintainer from the remainder of the network block drivers
> > >>> > (and vhdx), and revert them to the general block layer maintainership.
> > >>> >
> > >>> > Signed-off-by: Jeff Cody 
> > >>> > ---
> > >>> >  MAINTAINERS | 14 --
> > >>> >  1 file changed, 14 deletions(-)
> > >>> >
> > >>> > diff --git a/MAINTAINERS b/MAINTAINERS
> > >>> > index e93f79672f..6ef6932628 100644
> > >>> > --- a/MAINTAINERS
> > >>> > +++ b/MAINTAINERS
> > >>> > @@ -1982,28 +1982,22 @@ F: block/vmdk.c
> > >>> >  
> > >>> >  RBD
> > >>> >  M: Josh Durgin 
> > >>> > -M: Jeff Cody 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  S: Supported
> > >>> >  F: block/rbd.c
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> >  
> > >>> >  Sheepdog
> > >>> >  M: Hitoshi Mitake 
> > >>> >  M: Liu Yuan 
> > >>> > -M: Jeff Cody 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  L: sheep...@lists.wpkg.org
> > >>> >  S: Supported
> > >>> >  F: block/sheepdog.c
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> >  
> > >>> >  VHDX
> > >>> > -M: Jeff Cody 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  S: Supported
> > >>> >  F: block/vhdx*
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> 
> > >>> Does "S: Supported" make sense without an M:?
> > >>> 
> > >>> >  
> > >>> >  VDI
> > >>> >  M: Stefan Weil 
> > >>> > @@ -2034,34 +2028,26 @@ F: docs/interop/nbd.txt
> > >>> >  T: git git://repo.or.cz/qemu/ericb.git nbd
> > >>> >  
> > >>> >  NFS
> > >>> > -M: Jeff Cody 
> > >>> >  M: Peter Lieven 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  S: Maintained
> > >>> >  F: block/nfs.c
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> >  
> > >>> >  SSH
> > >>> >  M: Richard W.M. Jones 
> > >>> > -M: Jeff Cody 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  S: Supported
> > >>> >  F: block/ssh.c
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> >  
> > >>> >  CURL
> > >>> > -M: Jeff Cody 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  S: Supported
> > >>> >  F: block/curl.c
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> 
> > >>> Likewise.
> > >>> 
> > >>> >  GLUSTER
> > >>> > -M: Jeff Cody 
> > >>> >  L: qemu-block@nongnu.org
> > >>> >  S: Supported
> > >>> >  F: block/gluster.c
> > >>> > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >>> 
> > >>> Likewise.
> > >>> 
> > >>> >  Null Block Driver
> > >>> >  M: Fam Zheng 
> > >>> 
> > >>
> > >> Block drivers without an M: naturally fall under the overall 
> > >> maintainership of
> > >> block layer (Kevin), so IMO keeping the statuses is fine. Maybe CURL 
> > >> can be
> > >> degraded to Maintained, though.
> > >
> > > Yes, get_maintainer.pl combines all applicable sections.  Human readers
> > > are left to wonder, unless they know to look for other matches.
> > >
> > > Do we want to have a dedicated network block driver submaintainer again,
> > > if we can find one?
> > >
> > > Do we want to have a dedicated VHDX driver submaintainer again?  Fam,
> > > you're maintaining VMDK, could you cover VHDX as well?
> > 
> > Fam can't.
> > 
> > I can see two sane options for the three sections that lose their sole
> > maintainer (VHDX, CURL, GLUSTER):
> > 
> > * Downgrade to S: Orphan
> > 
> >   This reflects the fact that we'd like to have dedicated maintainers
> >   for them.  It camouflages the fact that the "Block layer core"
> >   maintainers pick up the slack.
> > 
> > * Delete
> > 
> >   The opposite.
> > 
> > Kevin, Max, please pick your poison, or suggest one you find tastier.
> 
> Orphan is probably by far closer to the truth than Supported. But I
> think what we really do is Odd Fixes. Not sure if that's a status that
> works in any meaningful way without having a maintainer for the specific
> thing assigned?

For Gluster we are interested in keeping it maintained in some form. At
least integrat...@gluster.org could be added as a list. I am not sure
there is anyone working on Gluster that would fit for a maintainer of
the driver. If you really want to put a name there, I can function as a
'maintainer' and pull in other Gluster developers in case I am otherwise
occupied.

Niels



Re: [Qemu-block] [PATCH v3 7/9] gluster: Support auto-read-only option

2018-10-17 Thread Niels de Vos
On Wed, Oct 17, 2018 at 06:41:58PM +0200, Kevin Wolf wrote:
> If read-only=off, but auto-read-only=on is given, open the file
> read-write if we have the permissions, but instead of erroring out for
> read-only files, just degrade to read-only.
> 
> Signed-off-by: Kevin Wolf 
> ---
>  block/gluster.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 4fd55a9cc5..5e300c96c8 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -849,8 +849,16 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
>  qemu_gluster_parse_flags(bdrv_flags, &open_flags);
>  
>  s->fd = glfs_open(s->glfs, gconf->path, open_flags);
> -if (!s->fd) {
> -ret = -errno;
> +ret = s->fd ? 0 : -errno;
> +
> +if (ret == -EACCES || ret == -EROFS) {
> +/* Try to degrade to read-only, but if it doesn't work, still use the
> + * normal error message. */
> +if (bdrv_apply_auto_read_only(bs, NULL, NULL) == 0) {
> +open_flags = (open_flags & ~O_RDWR) | O_RDONLY;
> +s->fd = glfs_open(s->glfs, gconf->path, open_flags);
> +ret = s->fd ? 0 : -errno;
> +}
>      }
>  
>  s->supports_seek_data = qemu_gluster_test_seek(s->fd);
> -- 

Looks good to me, thanks.

Reviewed-by: Niels de Vos 



Re: [Qemu-block] [PATCH v2 7/8] gluster: Support auto-read-only option

2018-10-14 Thread Niels de Vos
On Fri, Oct 12, 2018 at 12:31:21PM -0500, Eric Blake wrote:
> On 10/12/18 6:55 AM, Kevin Wolf wrote:
> > If read-only=off, but auto-read-only=on is given, open the file
> > read-write if we have the permissions, but instead of erroring out for
> > read-only files, just degrade to read-only.
> > 
> > Signed-off-by: Kevin Wolf 
> > ---
> >   block/gluster.c | 9 +++++++++
> >   1 file changed, 9 insertions(+)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 4fd55a9cc5..68d20c8830 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -849,6 +849,15 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
> >   qemu_gluster_parse_flags(bdrv_flags, &open_flags);
> >   s->fd = glfs_open(s->glfs, gconf->path, open_flags);
> > +if (!s->fd && errno == EACCES) {
> 
> EROFS is not possible as it was for posix file?

EROFS can happen, depending on the configuration of the Gluster volume.
In that case, opening read-only should work fine.

Niels



Re: [Qemu-block] [Qemu-devel] [PATCH 2/2] MAINTAINERS: Remove myself as block maintainer

2018-09-25 Thread Niels de Vos
On Tue, Sep 25, 2018 at 01:32:04PM +0800, Fam Zheng wrote:
> On Tue, 09/25 07:00, Markus Armbruster wrote:
> > Jeff Cody  writes:
> > 
> > > I'll not be involved in day-to-day qemu development.  Remove
> > > myself as maintainer from the remainder of the network block drivers
> > > (and vhdx), and revert them to the general block layer maintainership.
> > >
> > > Signed-off-by: Jeff Cody 
> > > ---
> > >  MAINTAINERS | 14 --
> > >  1 file changed, 14 deletions(-)
> > >
> > > diff --git a/MAINTAINERS b/MAINTAINERS
> > > index e93f79672f..6ef6932628 100644
> > > --- a/MAINTAINERS
> > > +++ b/MAINTAINERS
> > > @@ -1982,28 +1982,22 @@ F: block/vmdk.c
> > >  
> > >  RBD
> > >  M: Josh Durgin 
> > > -M: Jeff Cody 
> > >  L: qemu-block@nongnu.org
> > >  S: Supported
> > >  F: block/rbd.c
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >  
> > >  Sheepdog
> > >  M: Hitoshi Mitake 
> > >  M: Liu Yuan 
> > > -M: Jeff Cody 
> > >  L: qemu-block@nongnu.org
> > >  L: sheep...@lists.wpkg.org
> > >  S: Supported
> > >  F: block/sheepdog.c
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >  
> > >  VHDX
> > > -M: Jeff Cody 
> > >  L: qemu-block@nongnu.org
> > >  S: Supported
> > >  F: block/vhdx*
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > 
> > Does "S: Supported" make sense without an M:?
> > 
> > >  
> > >  VDI
> > >  M: Stefan Weil 
> > > @@ -2034,34 +2028,26 @@ F: docs/interop/nbd.txt
> > >  T: git git://repo.or.cz/qemu/ericb.git nbd
> > >  
> > >  NFS
> > > -M: Jeff Cody 
> > >  M: Peter Lieven 
> > >  L: qemu-block@nongnu.org
> > >  S: Maintained
> > >  F: block/nfs.c
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >  
> > >  SSH
> > >  M: Richard W.M. Jones 
> > > -M: Jeff Cody 
> > >  L: qemu-block@nongnu.org
> > >  S: Supported
> > >  F: block/ssh.c
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > >  
> > >  CURL
> > > -M: Jeff Cody 
> > >  L: qemu-block@nongnu.org
> > >  S: Supported
> > >  F: block/curl.c
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > 
> > Likewise.
> > 
> > >  GLUSTER
> > > -M: Jeff Cody 
> > >  L: qemu-block@nongnu.org
> > >  S: Supported
> > >  F: block/gluster.c
> > > -T: git git://github.com/codyprime/qemu-kvm-jtc.git block
> > 
> > Likewise.
> > 
> > >  Null Block Driver
> > >  M: Fam Zheng 
> > 
> 
> Block drivers without an M: naturally fall under the overall maintainership of
> block layer (Kevin), so IMO keeping the statuses is fine. Maybe CURL can be
> degraded to Maintained, though.

For Gluster you can count on continued support from several developers
on the Gluster side when needed. I do not think we are (pro-)active
enough to claim maintainership of the driver.

Thanks,
Niels (one of the GlusterFS maintainers)



Re: [Qemu-block] [PATCH v3] block/gluster: Handle changed glfs_ftruncate signature

2018-08-01 Thread Niels de Vos
On Tue, Jul 31, 2018 at 03:51:22PM -0400, Jeff Cody wrote:
> On Tue, Jul 31, 2018 at 11:18:02AM +0200, Niels de Vos wrote:
> > On Mon, Jul 30, 2018 at 03:27:29PM -0400, Jeff Cody wrote:
> > > On Mon, Jul 30, 2018 at 10:07:27AM -0500, Eric Blake wrote:
> > > > On 07/28/2018 02:50 AM, Niels de Vos wrote:
> > > > >>
> > > > >>Part of me wishes that libgfapi had just created a new function
> > > > >>'glfs_ftruncate2', so that existing users don't need to handle the api
> > > > >>change.  But I guess in the grand scheme, not a huge deal either way.
> > > > >
> > > > >Gluster uses versioned symbols, so older binaries will keep working 
> > > > >with
> > > > >new libraries. It is (hopefully) rare that existing symbols get 
> > > > >updated.
> > > > >We try to send patches for these kind of changes to the projects we 
> > > > >know
> > > > >well in advance, reducing the number of surprises.
> > > > 
> > > > >>I can go ahead and add that to the comment in my branch after 
> > > > >>applying, if
> > > > >>Niels can let me know what that version is/will be (if known).
> > > > >
> > > > >The new glfs_ftruncate() will be part of glusterfs-5 (planned for
> > > > >October). We're changing the numbering scheme, it was expected to come
> > > > >in glusterfs-4.2, but that is a version that never will be released.
> > > > >
> > > > 
> > > > Wait - so you're saying gluster has not yet released the incompatible
> > > > change? Now would be the right time to get rid of the API breakage, 
> > > > before
> > > > you bake it in, rather than relying solely on the versioned symbols to 
> > > > avoid
> > > > an ABI breakage but forcing all clients to compensate to the API 
> > > > breakage.
> > > > 
> > > 
> > > If this is not yet in a released version of Gluster, I'm not real eager to
> > > pollute the QEMU driver codebase with #ifdef's, especially if there is a
> > > possibility the API change may not actually materialize.
> > > 
> > > Is there any reason that this change is being approached in a way that
> > > breaks API usage, and is it too late in the Gluster development pipeline 
> > > to
> > > change that?
> > 
> > There recently have been a few changes like this in libgfapi. These have
> > been introduced to improve performance in common use-cases where an
> > updated 'struct stat' is needed after an operation. Some functions have
> > been adapted in previous releases, glfs_ftruncate() landed too late for
> > that. I hope we'll get a complete coherent API with glusterfs-5 again.
> > 
> 
> Am I correct in assuming the API changes that happened in previous libgfapi
> release are for APIs that QEMU does not use (i.e. no action needed from
> us?)

Yes, that is correct.

> > For QEMU this means additional changes will come related to
> > glfs_fallocate(), glfs_zerofill() and probably more. The current
> > glusterfs-4.1 version will be maintained for one year, after which the
> > detection/#ifdef can be removed as the than maintained versions should
> > all have the updated API. I'm sorry for the inconvenience that this
> > causes.
> > 
> 
> Understood.  I don't want to make a huge deal out of it.  I guess my
> recommendation/request is that API enhancements happen in a way that don't
> break existing APIs (e.g. use parallel APIs), because QEMU version usage and
> supported glusterfs usage may not always follow supported versions in the
> wild.  I know versioned symbols helps with this, but it still complicates
> the code (and couples QEMU's code to gluster support windows).

Once the last Gluster version that has the old API goes out of
maintenance, we will remove the legacy functions. At that point, we can
also provide a new patch to QEMU (and other projects) that removes the
compatibility #ifdefs.

> > If you prefer, I can wait with sending patches for QEMU with future
> > Gluster releases until additional changes have landed in libgfapi.
> >
> 
> I think generally, we don't want to start introducing #ifdef's and new APi
> usage for unreleased versions of libgfapi if possible (as unreleased APIs,
> it could technically change, and then QEMU releases would have 'dead' API
> support in it).
> 
> If glusterfs-5 releases in October, that may line up with 3.1 or 3.2.

OK, I'll keep these kinds of patches in my tree as work in progress and
send them again when glusterfs-5 gets tagged for alpha/beta.

Even if not optimal, would you accept this as a way going forward?

Thanks,
Niels



Re: [Qemu-block] [PATCH v3] block/gluster: Handle changed glfs_ftruncate signature

2018-07-31 Thread Niels de Vos
On Mon, Jul 30, 2018 at 03:27:29PM -0400, Jeff Cody wrote:
> On Mon, Jul 30, 2018 at 10:07:27AM -0500, Eric Blake wrote:
> > On 07/28/2018 02:50 AM, Niels de Vos wrote:
> > >>
> > >>Part of me wishes that libgfapi had just created a new function
> > >>'glfs_ftruncate2', so that existing users don't need to handle the api
> > >>change.  But I guess in the grand scheme, not a huge deal either way.
> > >
> > >Gluster uses versioned symbols, so older binaries will keep working with
> > >new libraries. It is (hopefully) rare that existing symbols get updated.
> > >We try to send patches for these kind of changes to the projects we know
> > >well in advance, reducing the number of surprises.
> > 
> > >>I can go ahead and add that to the comment in my branch after applying, if
> > >>Niels can let me know what that version is/will be (if known).
> > >
> > >The new glfs_ftruncate() will be part of glusterfs-5 (planned for
> > >October). We're changing the numbering scheme, it was expected to come
> > >in glusterfs-4.2, but that is a version that never will be released.
> > >
> > 
> > Wait - so you're saying gluster has not yet released the incompatible
> > change? Now would be the right time to get rid of the API breakage, before
> > you bake it in, rather than relying solely on the versioned symbols to avoid
> > an ABI breakage but forcing all clients to compensate to the API breakage.
> > 
> 
> If this is not yet in a released version of Gluster, I'm not real eager to
> pollute the QEMU driver codebase with #ifdef's, especially if there is a
> possibility the API change may not actually materialize.
> 
> Is there any reason that this change is being approached in a way that
> breaks API usage, and is it too late in the Gluster development pipeline to
> change that?

There have recently been a few changes like this in libgfapi. These have
been introduced to improve performance in common use-cases where an
updated 'struct stat' is needed after an operation. Some functions were
adapted in previous releases; glfs_ftruncate() landed too late for
that. I hope we'll get a complete, coherent API with glusterfs-5 again.

For QEMU this means additional changes will come related to
glfs_fallocate(), glfs_zerofill() and probably more. The current
glusterfs-4.1 version will be maintained for one year, after which the
detection/#ifdef can be removed, as the then-maintained versions should
all have the updated API. I'm sorry for the inconvenience that this
causes.

If you prefer, I can wait with sending patches for QEMU with future
Gluster releases until additional changes have landed in libgfapi.

Thanks,
Niels



Re: [Qemu-block] [PATCH v3] block/gluster: Handle changed glfs_ftruncate signature

2018-07-28 Thread Niels de Vos
On Sat, Jul 28, 2018 at 12:18:39AM -0400, Jeff Cody wrote:
> On Fri, Jul 27, 2018 at 08:24:05AM -0500, Eric Blake wrote:
> > On 07/27/2018 03:19 AM, Niels de Vos wrote:
> > >From: Prasanna Kumar Kalever 
> > >
> > >New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
> > >function that returns additional 'struct stat' structures to enable
> > >advanced caching of attributes. This is useful for file servers, not so
> > >much for QEMU. Nevertheless, the API has changed and needs to be
> > >adopted.
> > >
> > 
> > Oh, one other comment.
> > 
> > >+++ b/block/gluster.c
> > >@@ -20,6 +20,10 @@
> > >  #include "qemu/option.h"
> > >  #include "qemu/cutils.h"
> > >+#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
> > >+# define glfs_ftruncate(fd, offset, _u1, _u2) glfs_ftruncate(fd, offset)
> > >+#endif
> > 
> > Someday, when we can assume new enough gluster everywhere, we can drop this
> > hunk...
> > 
> 
> Part of me wishes that libgfapi had just created a new function
> 'glfs_ftruncate2', so that existing users don't need to handle the api
> change.  But I guess in the grand scheme, not a huge deal either way.

Gluster uses versioned symbols, so older binaries will keep working with
new libraries. It is (hopefully) rare that existing symbols get updated.
We try to send patches for these kinds of changes to the projects we know
well in advance, reducing the number of surprises.

> > >+++ b/configure
> > 
> > >+  /* new glfs_ftruncate() passes two additional args */
> > >+  return glfs_ftruncate(NULL, 0 /*, NULL, NULL */);
> > >+}
> > >+EOF
> > >+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
> > >+  glusterfs_legacy_ftruncate="yes"
> > >+fi
> > 
> > ...but it will be easier to remember to do so if this comment in configure
> > calls out the upstream gluster version that no longer requires the legacy
> > workaround, as our hint for when...
> > 
> 
> I can go ahead and add that to the comment in my branch after applying, if
> Niels can let me know what that version is/will be (if known).

The new glfs_ftruncate() will be part of glusterfs-5 (planned for
October). We're changing the numbering scheme; it was expected to come
in glusterfs-4.2, but that version will never be released.

Thanks for correcting the last bits of the patch!
Niels


> > >else
> > >  if test "$glusterfs" = "yes" ; then
> > >feature_not_found "GlusterFS backend support" \
> > >@@ -6644,6 +6658,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
> > >echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
> > >  fi
> > >+if test "$glusterfs_legacy_ftruncate" = "yes" ; then
> > >+  echo "CONFIG_GLUSTERFS_LEGACY_FTRUNCATE=y" >> $config_host_mak
> > 
> > ...this #define is no longer necessary.
> > 
> > -- 
> > Eric Blake, Principal Software Engineer
> > Red Hat, Inc.   +1-919-301-3266
> > Virtualization:  qemu.org | libvirt.org



[Qemu-block] [PATCH v3] block/gluster: Handle changed glfs_ftruncate signature

2018-07-27 Thread Niels de Vos
From: Prasanna Kumar Kalever 

New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adopted.

Signed-off-by: Prasanna Kumar Kalever 
Signed-off-by: Niels de Vos 

---
v2: do a compile check as suggested by Eric Blake
v3: define old backwards compatible glfs_ftruncate() macro, from Eric Blake
---
 block/gluster.c | 11 +--
 configure   | 18 ++
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 4fd55a9cc5..9bd6525b41 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -20,6 +20,10 @@
 #include "qemu/option.h"
 #include "qemu/cutils.h"
 
+#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
+# define glfs_ftruncate(fd, offset, _u1, _u2) glfs_ftruncate(fd, offset)
+#endif
+
 #define GLUSTER_OPT_FILENAME"filename"
 #define GLUSTER_OPT_VOLUME  "volume"
 #define GLUSTER_OPT_PATH"path"
@@ -997,6 +1001,7 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, 
int64_t offset,
 PreallocMode prealloc, Error **errp)
 {
 int64_t current_length;
+int ret;
 
 current_length = glfs_lseek(fd, 0, SEEK_END);
 if (current_length < 0) {
@@ -1024,7 +1029,8 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, 
int64_t offset,
 #endif /* CONFIG_GLUSTERFS_FALLOCATE */
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
 case PREALLOC_MODE_FULL:
-if (glfs_ftruncate(fd, offset)) {
+ret = glfs_ftruncate(fd, offset, NULL, NULL);
+if (ret) {
 error_setg_errno(errp, errno, "Could not resize file");
 return -errno;
 }
@@ -1035,7 +1041,8 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, 
int64_t offset,
 break;
 #endif /* CONFIG_GLUSTERFS_ZEROFILL */
 case PREALLOC_MODE_OFF:
-if (glfs_ftruncate(fd, offset)) {
+ret = glfs_ftruncate(fd, offset, NULL, NULL);
+if (ret) {
 error_setg_errno(errp, errno, "Could not resize file");
 return -errno;
 }
diff --git a/configure b/configure
index 2a7796ea80..f3c0918d6b 100755
--- a/configure
+++ b/configure
@@ -451,6 +451,7 @@ glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
+glusterfs_legacy_ftruncate="no"
 gtk=""
 gtkabi=""
 gtk_gl="no"
@@ -3947,6 +3948,19 @@ if test "$glusterfs" != "no" ; then
   glusterfs_fallocate="yes"
   glusterfs_zerofill="yes"
 fi
+cat > $TMPC << EOF
+#include <glusterfs/api/glfs.h>
+
+int
+main(void)
+{
+   /* new glfs_ftruncate() passes two additional args */
+   return glfs_ftruncate(NULL, 0 /*, NULL, NULL */);
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_legacy_ftruncate="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6644,6 +6658,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
   echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_legacy_ftruncate" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_LEGACY_FTRUNCATE=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
-- 
2.17.1




[Qemu-block] [PATCH v2] block/gluster: defend on legacy ftruncate api use

2018-07-26 Thread Niels de Vos
From: Prasanna Kumar Kalever 

New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adopted.

Signed-off-by: Prasanna Kumar Kalever 
Signed-off-by: Niels de Vos 

--
v2: do a compile check as suggested by Eric Blake
---
 block/gluster.c | 15 +--
 configure   | 18 ++
 2 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 4fd55a9cc5..d1c6f81f5c 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -997,6 +997,7 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, 
int64_t offset,
 PreallocMode prealloc, Error **errp)
 {
 int64_t current_length;
+int ret;
 
 current_length = glfs_lseek(fd, 0, SEEK_END);
 if (current_length < 0) {
@@ -1024,7 +1025,12 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, 
int64_t offset,
 #endif /* CONFIG_GLUSTERFS_FALLOCATE */
 #ifdef CONFIG_GLUSTERFS_ZEROFILL
 case PREALLOC_MODE_FULL:
-if (glfs_ftruncate(fd, offset)) {
+#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
+ret = glfs_ftruncate(fd, offset);
+#else
+ret = glfs_ftruncate(fd, offset, NULL, NULL);
+#endif
+if (ret) {
 error_setg_errno(errp, errno, "Could not resize file");
 return -errno;
 }
@@ -1035,7 +1041,12 @@ static int qemu_gluster_do_truncate(struct glfs_fd *fd, 
int64_t offset,
 break;
 #endif /* CONFIG_GLUSTERFS_ZEROFILL */
 case PREALLOC_MODE_OFF:
-if (glfs_ftruncate(fd, offset)) {
+#ifdef CONFIG_GLUSTERFS_LEGACY_FTRUNCATE
+ret = glfs_ftruncate(fd, offset);
+#else
+ret = glfs_ftruncate(fd, offset, NULL, NULL);
+#endif
+if (ret) {
 error_setg_errno(errp, errno, "Could not resize file");
 return -errno;
 }
diff --git a/configure b/configure
index 2a7796ea80..f3c0918d6b 100755
--- a/configure
+++ b/configure
@@ -451,6 +451,7 @@ glusterfs_xlator_opt="no"
 glusterfs_discard="no"
 glusterfs_fallocate="no"
 glusterfs_zerofill="no"
+glusterfs_legacy_ftruncate="no"
 gtk=""
 gtkabi=""
 gtk_gl="no"
@@ -3947,6 +3948,19 @@ if test "$glusterfs" != "no" ; then
   glusterfs_fallocate="yes"
   glusterfs_zerofill="yes"
 fi
+cat > $TMPC << EOF
+#include <glusterfs/api/glfs.h>
+
+int
+main(void)
+{
+   /* new glfs_ftruncate() passes two additional args */
+   return glfs_ftruncate(NULL, 0 /*, NULL, NULL */);
+}
+EOF
+if compile_prog "$glusterfs_cflags" "$glusterfs_libs" ; then
+  glusterfs_legacy_ftruncate="yes"
+fi
   else
 if test "$glusterfs" = "yes" ; then
   feature_not_found "GlusterFS backend support" \
@@ -6644,6 +6658,10 @@ if test "$glusterfs_zerofill" = "yes" ; then
   echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
 fi
 
+if test "$glusterfs_legacy_ftruncate" = "yes" ; then
+  echo "CONFIG_GLUSTERFS_LEGACY_FTRUNCATE=y" >> $config_host_mak
+fi
+
 if test "$libssh2" = "yes" ; then
   echo "CONFIG_LIBSSH2=m" >> $config_host_mak
   echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
-- 
2.17.1




[Qemu-block] [PATCH v3] gluster: add support for PREALLOC_MODE_FALLOC

2017-05-28 Thread Niels de Vos
Add missing support for "preallocation=falloc" to the Gluster block
driver. This change bases its logic on that of block/file-posix.c and
removed the gluster_supports_zerofill() and qemu_gluster_zerofill()
functions in favour of #ifdef checks in an easy to read
switch-statement.

Both glfs_zerofill() and glfs_fallocate() have been introduced with
GlusterFS 3.5.0 (pkg-config glusterfs-api = 6). A #define for the
availability of glfs_fallocate() has been added to ./configure.

Reported-by: Satheesaran Sundaramoorthi <sasun...@redhat.com>
URL: https://bugzilla.redhat.com/1450759
Signed-off-by: Niels de Vos <nde...@redhat.com>
---
v3 typo fixes spotted by Jeff Cody and Eric Blake:
- typo in the commit message (Eric)
- copy/paste mistake causing incorrect if-statement (Jeff)

v2 changes requested by Jeff Cody:
- add CONFIG_GLUSTERFS_FALLOCATE
- remove unneeded wrapper qemu_gluster_zerofill()

 block/gluster.c | 76 ++---
 configure   |  6 +
 2 files changed, 46 insertions(+), 36 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 7c76cd0..7863991 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -964,29 +964,6 @@ static coroutine_fn int 
qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
 qemu_coroutine_yield();
 return acb.ret;
 }
-
-static inline bool gluster_supports_zerofill(void)
-{
-return 1;
-}
-
-static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
-int64_t size)
-{
-return glfs_zerofill(fd, offset, size);
-}
-
-#else
-static inline bool gluster_supports_zerofill(void)
-{
-return 0;
-}
-
-static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
-int64_t size)
-{
-return 0;
-}
 #endif
 
 static int qemu_gluster_create(const char *filename,
@@ -996,9 +973,10 @@ static int qemu_gluster_create(const char *filename,
 struct glfs *glfs;
 struct glfs_fd *fd;
 int ret = 0;
-int prealloc = 0;
+PreallocMode prealloc;
 int64_t total_size = 0;
 char *tmp = NULL;
+Error *local_err = NULL;
 
 gconf = g_new0(BlockdevOptionsGluster, 1);
 gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
@@ -1026,13 +1004,12 @@ static int qemu_gluster_create(const char *filename,
   BDRV_SECTOR_SIZE);
 
 tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
-if (!tmp || !strcmp(tmp, "off")) {
-prealloc = 0;
-} else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
-prealloc = 1;
-} else {
-error_setg(errp, "Invalid preallocation mode: '%s'"
- " or GlusterFS doesn't support zerofill API", tmp);
+prealloc = qapi_enum_parse(PreallocMode_lookup, tmp,
+   PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
+   _err);
+g_free(tmp);
+if (local_err) {
+error_propagate(errp, local_err);
 ret = -EINVAL;
 goto out;
 }
@@ -1041,21 +1018,48 @@ static int qemu_gluster_create(const char *filename,
 O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | 
S_IWUSR);
 if (!fd) {
 ret = -errno;
-} else {
+goto out;
+}
+
+switch (prealloc) {
+#ifdef CONFIG_GLUSTERFS_FALLOCATE
+case PREALLOC_MODE_FALLOC:
+if (glfs_fallocate(fd, 0, 0, total_size)) {
+error_setg(errp, "Could not preallocate data for the new file");
+ret = -errno;
+}
+break;
+#endif /* CONFIG_GLUSTERFS_FALLOCATE */
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+case PREALLOC_MODE_FULL:
 if (!glfs_ftruncate(fd, total_size)) {
-if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
+if (glfs_zerofill(fd, 0, total_size)) {
+error_setg(errp, "Could not zerofill the new file");
 ret = -errno;
 }
 } else {
+error_setg(errp, "Could not resize file");
 ret = -errno;
 }
-
-if (glfs_close(fd) != 0) {
+break;
+#endif /* CONFIG_GLUSTERFS_ZEROFILL */
+case PREALLOC_MODE_OFF:
+if (glfs_ftruncate(fd, total_size) != 0) {
 ret = -errno;
+error_setg(errp, "Could not resize file");
 }
+break;
+default:
+ret = -EINVAL;
+error_setg(errp, "Unsupported preallocation mode: %s",
+   PreallocMode_lookup[prealloc]);
+break;
+}
+
+if (glfs_close(fd) != 0) {
+ret = -errno;
 }
 out:
-g_free(tmp);
 qapi_free_BlockdevOptionsGluster(gconf);
 glfs_clear_preopened(glfs);
 return ret;
diff --git a/configure b/configure
index 7c020c0..8b8291e 100755
--- a/configure
+++ b/configure
@@ -300,6 +300,7 @@ 

Re: [Qemu-block] [PATCH] block/gluster: glfs_lseek() workaround

2017-05-24 Thread Niels de Vos
On Wed, May 24, 2017 at 04:50:03PM -0400, Jeff Cody wrote:
> On Wed, May 24, 2017 at 11:02:02AM +0200, Niels de Vos wrote:
> > On Tue, May 23, 2017 at 01:27:50PM -0400, Jeff Cody wrote:
> > > On current released versions of glusterfs, glfs_lseek() will sometimes
> > > return invalid values for SEEK_DATA or SEEK_HOLE.  For SEEK_DATA and
> > > SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in
> > > the case of error:
> > > 
> > > LSEEK(2):
> > > 
> > > off_t lseek(int fd, off_t offset, int whence);
> > > 
> > > [...]
> > > 
> > > SEEK_HOLE
> > >   Adjust  the file offset to the next hole in the file greater
> > >   than or equal to offset.  If offset points into the middle of
> > >   a hole, then the file offset is set to offset.  If there is no
> > >   hole past offset, then the file offset is adjusted to the end
> > >   of the file (i.e., there is an implicit hole at the end of
> > >   any file).
> > > 
> > > [...]
> > > 
> > > RETURN VALUE
> > >   Upon successful completion, lseek() returns the resulting
> > >   offset location as measured in bytes from the beginning of the
> > >   file.  On error, the value (off_t) -1 is returned and errno is
> > >   set to indicate the error
> > > 
> > > However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a
> > > value less than the passed offset, yet greater than zero.
> > > 
> > > For instance, here are example values observed from this call:
> > > 
> > > offs = glfs_lseek(s->fd, start, SEEK_HOLE);
> > > if (offs < 0) {
> > > return -errno;  /* D1 and (H3 or H4) */
> > > }
> > > 
> > > start == 7608336384
> > > offs == 7607877632
> > > 
> > > This causes QEMU to abort on the assert test.  When this value is
> > > returned, errno is also 0.
> > > 
> > > This is a reported and known bug to glusterfs:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=1425293
> > > 
> > > Although this is being fixed in gluster, we still should work around it
> > > in QEMU, given that multiple released versions of gluster behave this
> > > way.
> > 
> > Versions of GlusterFS 3.8.0 - 3.8.8 are affected, 3.8.9 has the fix
> > already and was released in February this year. We encourage users to
> > update to recent versions, and provide a stable (bugfix only) update
> > each month.
> 
> I am able to reproduce this bug on a server running glusterfs 3.11.0rc0.  Is
> that expected?

No, that really is not expected. The backport that was reported to fix
it for a 3.8.4 version is definitely part of 3.11 already. Could you
pass me some details about your Gluster environment and volume, either
by email or in a bug? I'll try to reproduce and debug it from the
Gluster side.

There is a holiday tomorrow (I'm in The Netherlands), and I'm travelling
the whole weekend. I might not be able to look into this before next
week. Sorry about that!

Thanks,
Niels


> 
> > 
> > The Red Hat Gluster Storage product that is often used in combination
> > with QEMU (+oVirt) does unfortunately not have an update where this is
> > fixed.
> > 
> > Using an unfixed Gluster storage environment can cause QEMU to segfault.
> > Although fixes exist for Gluster, not all users will have them
> > installed. Preventing the segfault in QEMU due to a broken storage
> > environment makes sense to me.
> > 
> > > This patch treats the return case of (offs < start) the same as if an
> > > error value other than ENXIO is returned; we will assume we learned
> > > nothing, and there are no holes in the file.
> > > 
> > > Signed-off-by: Jeff Cody <jc...@redhat.com>
> > > ---
> > >  block/gluster.c | 18 --
> > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/block/gluster.c b/block/gluster.c
> > > index 7c76cd0..c147909e 100644
> > > --- a/block/gluster.c
> > > +++ b/block/gluster.c
> > > @@ -1275,7 +1275,14 @@ static int find_allocation(BlockDriverState *bs, 
> > > off_t start,
> > >  if (offs < 0) {
> > >  return -errno;  /* D3 or D4 */
> > >   

Re: [Qemu-block] [PATCH] block/gluster: glfs_lseek() workaround

2017-05-24 Thread Niels de Vos
On Tue, May 23, 2017 at 01:27:50PM -0400, Jeff Cody wrote:
> On current released versions of glusterfs, glfs_lseek() will sometimes
> return invalid values for SEEK_DATA or SEEK_HOLE.  For SEEK_DATA and
> SEEK_HOLE, the returned value should be >= the passed offset, or < 0 in
> the case of error:
> 
> LSEEK(2):
> 
> off_t lseek(int fd, off_t offset, int whence);
> 
> [...]
> 
> SEEK_HOLE
>   Adjust  the file offset to the next hole in the file greater
>   than or equal to offset.  If offset points into the middle of
>   a hole, then the file offset is set to offset.  If there is no
>   hole past offset, then the file offset is adjusted to the end
>   of the file (i.e., there is  an implicit hole at the end of
>   any file).
> 
> [...]
> 
> RETURN VALUE
>   Upon  successful  completion,  lseek()  returns  the resulting
>   offset location as measured in bytes from the beginning of the
>   file.  On error, the value (off_t) -1 is returned and errno is
>   set to indicate the error
> 
> However, occasionally glfs_lseek() for SEEK_HOLE/DATA will return a
> value less than the passed offset, yet greater than zero.
> 
> For instance, here are example values observed from this call:
> 
> offs = glfs_lseek(s->fd, start, SEEK_HOLE);
> if (offs < 0) {
> return -errno;  /* D1 and (H3 or H4) */
> }
> 
> start == 7608336384
> offs == 7607877632
> 
> This causes QEMU to abort on the assert test.  When this value is
> returned, errno is also 0.
> 
> This is a reported and known bug to glusterfs:
> https://bugzilla.redhat.com/show_bug.cgi?id=1425293
> 
> Although this is being fixed in gluster, we still should work around it
> in QEMU, given that multiple released versions of gluster behave this
> way.

Versions of GlusterFS 3.8.0 - 3.8.8 are affected, 3.8.9 has the fix
already and was released in February this year. We encourage users to
update to recent versions, and provide a stable (bugfix only) update
each month.

The Red Hat Gluster Storage product that is often used in combination
with QEMU (+oVirt) does unfortunately not have an update where this is
fixed.

Using an unfixed Gluster storage environment can cause QEMU to segfault.
Although fixes exist for Gluster, not all users will have them
installed. Preventing the segfault in QEMU due to a broken storage
environment makes sense to me.

> This patch treats the return case of (offs < start) the same as if an
> error value other than ENXIO is returned; we will assume we learned
> nothing, and there are no holes in the file.
> 
> Signed-off-by: Jeff Cody <jc...@redhat.com>
> ---
>  block/gluster.c | 18 --
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 7c76cd0..c147909e 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -1275,7 +1275,14 @@ static int find_allocation(BlockDriverState *bs, off_t 
> start,
>  if (offs < 0) {
>  return -errno;  /* D3 or D4 */
>  }
> -assert(offs >= start);
> +
> +if (offs < start) {
> +/* This is not a valid return by lseek().  We are safe to just return
> + * -EIO in this case, and we'll treat it like D4. Unfortunately some
> + *  versions of libgfapi will return offs < start, so an assert here
> > > + *  will unnecessarily abort QEMU. */

This is not really correct: the problem is not in libgfapi, but in the
protocol translator on the server side. The version of libgfapi does not
matter here; it is the version on the server that does. But that might
be too much detail for the comment.

> +return -EIO;
> +}
>  
>  if (offs > start) {
>  /* D2: in hole, next data at offs */
> @@ -1307,7 +1314,14 @@ static int find_allocation(BlockDriverState *bs, off_t 
> start,
>  if (offs < 0) {
>  return -errno;  /* D1 and (H3 or H4) */
>  }
> -assert(offs >= start);
> +
> +if (offs < start) {
> +/* This is not a valid return by lseek().  We are safe to just return
> + * -EIO in this case, and we'll treat it like H4. Unfortunately some
> + *  versions of libgfapi will return offs < start, so an assert here
> + *  will unnecessarily abort QEMU. */
> +return -EIO;
> +}
>  
>  if (offs > start) {
>  /*
> -- 
> 2.9.3
> 

You might want to explain the problem a little differently in the commit
message. It is also fine if you think that would become too detailed; my
explanation is in the archives now in any case.

Reviewed-by: Niels de Vos <nde...@redhat.com>



Re: [Qemu-block] [PATCH v2] gluster: add support for PREALLOC_MODE_FALLOC

2017-05-18 Thread Niels de Vos
On Thu, May 18, 2017 at 01:54:36PM -0400, Jeff Cody wrote:
> On Thu, May 18, 2017 at 11:54:22AM +0200, Niels de Vos wrote:
> > Add missing support for "preallocation=falloc" to the Gluster block
> > driver. This change bases its logic on that of block/file-posix.c and
> > removed the gluster_supports_zerofill() and qemu_gluster_zerofill()
> > functiond in favour of #ifdef checks in an easy to read
> > switch-statement.
> > 
> > Both glfs_zerofill() and glfs_fallocate() have been introduced with
> > GlusterFS 3.5.0 (pkg-config glusterfs-api = 6). A #define for the
> > availability of glfs_fallocate() has been added to ./configure.
> > 
> > Reported-by: Satheesaran Sundaramoorthi <sasun...@redhat.com>
> > URL: https://bugzilla.redhat.com/1450759
> > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > ---
> > v2 changes requested by Jeff Cody:
> > - add CONFIG_GLUSTERFS_FALLOCATE
> > - remove unneeded wrapper qemu_gluster_zerofill()
> > 
> >  block/gluster.c | 76 
> > ++---
> >  configure   |  6 +
> >  2 files changed, 46 insertions(+), 36 deletions(-)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 7c76cd0..0610183 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -964,29 +964,6 @@ static coroutine_fn int 
> > qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
> >  qemu_coroutine_yield();
> >  return acb.ret;
> >  }
> > -
> > -static inline bool gluster_supports_zerofill(void)
> > -{
> > -return 1;
> > -}
> > -
> > -static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
> > -int64_t size)
> > -{
> > -return glfs_zerofill(fd, offset, size);
> > -}
> > -
> > -#else
> > -static inline bool gluster_supports_zerofill(void)
> > -{
> > -return 0;
> > -}
> > -
> > -static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
> > -int64_t size)
> > -{
> > -return 0;
> > -}
> >  #endif
> >  
> >  static int qemu_gluster_create(const char *filename,
> > @@ -996,9 +973,10 @@ static int qemu_gluster_create(const char *filename,
> >  struct glfs *glfs;
> >  struct glfs_fd *fd;
> >  int ret = 0;
> > -int prealloc = 0;
> > +PreallocMode prealloc;
> >  int64_t total_size = 0;
> >  char *tmp = NULL;
> > +Error *local_err = NULL;
> >  
> >  gconf = g_new0(BlockdevOptionsGluster, 1);
> >  gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
> > @@ -1026,13 +1004,12 @@ static int qemu_gluster_create(const char *filename,
> >BDRV_SECTOR_SIZE);
> >  
> >  tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
> > -if (!tmp || !strcmp(tmp, "off")) {
> > -prealloc = 0;
> > -} else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
> > -prealloc = 1;
> > -} else {
> > -error_setg(errp, "Invalid preallocation mode: '%s'"
> > - " or GlusterFS doesn't support zerofill API", 
> > tmp);
> > +prealloc = qapi_enum_parse(PreallocMode_lookup, tmp,
> > +   PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
> > +   _err);
> > +g_free(tmp);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> >  ret = -EINVAL;
> >  goto out;
> >  }
> > @@ -1041,21 +1018,48 @@ static int qemu_gluster_create(const char *filename,
> >  O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | 
> > S_IWUSR);
> >  if (!fd) {
> >  ret = -errno;
> > -} else {
> > +goto out;
> > +}
> > +
> > +switch (prealloc) {
> > +#ifdef CONFIG_GLUSTERFS_FALLOCATE
> > +case PREALLOC_MODE_FALLOC:
> > +if (!glfs_fallocate(fd, 0, 0, total_size)) {
> 
> Does glfs_fallocate() return 0 on failure?  Both posix and linux versions of
> fallocate() return 0 on success.

No, it should return 0 on success. This is a copy/paste error from the
glfs_ftruncate() below, and that if/else structure is a little
different. I did not notice the error message during my testing
though... Will check it again tomorrow.

Thanks,
Niels


> 
> > +error_setg(errp, "Could not preallocate data for the new file");

Re: [Qemu-block] [Qemu-devel] [PATCH v2] gluster: add support for PREALLOC_MODE_FALLOC

2017-05-18 Thread Niels de Vos
On Thu, May 18, 2017 at 09:30:53AM -0500, Eric Blake wrote:
> On 05/18/2017 04:54 AM, Niels de Vos wrote:
> > Add missing support for "preallocation=falloc" to the Gluster block
> > driver. This change bases its logic on that of block/file-posix.c and
> > removed the gluster_supports_zerofill() and qemu_gluster_zerofill()
> > functiond in favour of #ifdef checks in an easy to read
> 
> s/functiond/functions/
> 
> > switch-statement.
> > 
> > Both glfs_zerofill() and glfs_fallocate() have been introduced with
> > GlusterFS 3.5.0 (pkg-config glusterfs-api = 6). A #define for the
> > availability of glfs_fallocate() has been added to ./configure.
> > 
> > Reported-by: Satheesaran Sundaramoorthi <sasun...@redhat.com>
> > URL: https://bugzilla.redhat.com/1450759
> > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > ---
> > v2 changes requested by Jeff Cody:
> > - add CONFIG_GLUSTERFS_FALLOCATE
> > - remove unneeded wrapper qemu_gluster_zerofill()
> 
> The automated tools prefer that v2 patches are sent as a new top-level
> thread, rather than buried in-reply-to the v1 thread.

Ok, I'm happy to resend it with the typo corrected if that helps.

Niels



[Qemu-block] [PATCH v2] gluster: add support for PREALLOC_MODE_FALLOC

2017-05-18 Thread Niels de Vos
Add missing support for "preallocation=falloc" to the Gluster block
driver. This change bases its logic on that of block/file-posix.c and
removed the gluster_supports_zerofill() and qemu_gluster_zerofill()
functiond in favour of #ifdef checks in an easy to read
switch-statement.

Both glfs_zerofill() and glfs_fallocate() have been introduced with
GlusterFS 3.5.0 (pkg-config glusterfs-api = 6). A #define for the
availability of glfs_fallocate() has been added to ./configure.

Reported-by: Satheesaran Sundaramoorthi <sasun...@redhat.com>
URL: https://bugzilla.redhat.com/1450759
Signed-off-by: Niels de Vos <nde...@redhat.com>
---
v2 changes requested by Jeff Cody:
- add CONFIG_GLUSTERFS_FALLOCATE
- remove unneeded wrapper qemu_gluster_zerofill()

 block/gluster.c | 76 ++---
 configure   |  6 +
 2 files changed, 46 insertions(+), 36 deletions(-)

diff --git a/block/gluster.c b/block/gluster.c
index 7c76cd0..0610183 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -964,29 +964,6 @@ static coroutine_fn int 
qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
 qemu_coroutine_yield();
 return acb.ret;
 }
-
-static inline bool gluster_supports_zerofill(void)
-{
-return 1;
-}
-
-static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
-int64_t size)
-{
-return glfs_zerofill(fd, offset, size);
-}
-
-#else
-static inline bool gluster_supports_zerofill(void)
-{
-return 0;
-}
-
-static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
-int64_t size)
-{
-return 0;
-}
 #endif
 
 static int qemu_gluster_create(const char *filename,
@@ -996,9 +973,10 @@ static int qemu_gluster_create(const char *filename,
 struct glfs *glfs;
 struct glfs_fd *fd;
 int ret = 0;
-int prealloc = 0;
+PreallocMode prealloc;
 int64_t total_size = 0;
 char *tmp = NULL;
+Error *local_err = NULL;
 
 gconf = g_new0(BlockdevOptionsGluster, 1);
 gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
@@ -1026,13 +1004,12 @@ static int qemu_gluster_create(const char *filename,
   BDRV_SECTOR_SIZE);
 
 tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
-if (!tmp || !strcmp(tmp, "off")) {
-prealloc = 0;
-} else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
-prealloc = 1;
-} else {
-error_setg(errp, "Invalid preallocation mode: '%s'"
- " or GlusterFS doesn't support zerofill API", tmp);
+prealloc = qapi_enum_parse(PreallocMode_lookup, tmp,
+   PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
+   _err);
+g_free(tmp);
+if (local_err) {
+error_propagate(errp, local_err);
 ret = -EINVAL;
 goto out;
 }
@@ -1041,21 +1018,48 @@ static int qemu_gluster_create(const char *filename,
 O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | 
S_IWUSR);
 if (!fd) {
 ret = -errno;
-} else {
+goto out;
+}
+
+switch (prealloc) {
+#ifdef CONFIG_GLUSTERFS_FALLOCATE
+case PREALLOC_MODE_FALLOC:
+if (!glfs_fallocate(fd, 0, 0, total_size)) {
+error_setg(errp, "Could not preallocate data for the new file");
+ret = -errno;
+}
+break;
+#endif /* CONFIG_GLUSTERFS_FALLOCATE */
+#ifdef CONFIG_GLUSTERFS_ZEROFILL
+case PREALLOC_MODE_FULL:
 if (!glfs_ftruncate(fd, total_size)) {
-if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
+if (glfs_zerofill(fd, 0, total_size)) {
+error_setg(errp, "Could not zerofill the new file");
 ret = -errno;
 }
 } else {
+error_setg(errp, "Could not resize file");
 ret = -errno;
 }
-
-if (glfs_close(fd) != 0) {
+break;
+#endif /* CONFIG_GLUSTERFS_ZEROFILL */
+case PREALLOC_MODE_OFF:
+if (glfs_ftruncate(fd, total_size) != 0) {
 ret = -errno;
+error_setg(errp, "Could not resize file");
 }
+break;
+default:
+ret = -EINVAL;
+error_setg(errp, "Unsupported preallocation mode: %s",
+   PreallocMode_lookup[prealloc]);
+break;
+}
+
+if (glfs_close(fd) != 0) {
+ret = -errno;
 }
 out:
-g_free(tmp);
 qapi_free_BlockdevOptionsGluster(gconf);
 glfs_clear_preopened(glfs);
 return ret;
diff --git a/configure b/configure
index 7c020c0..8b8291e 100755
--- a/configure
+++ b/configure
@@ -300,6 +300,7 @@ seccomp=""
 glusterfs=""
 glusterfs_xlator_opt="no"
 glusterfs_discard="no"
+glusterfs_fallocate="no"

Re: [Qemu-block] [PATCH] gluster: add support for PREALLOC_MODE_FALLOC

2017-05-17 Thread Niels de Vos
On Tue, May 16, 2017 at 11:42:37AM -0400, Jeff Cody wrote:
> On Mon, May 15, 2017 at 09:11:36PM +0200, Niels de Vos wrote:
> > Add missing support for "preallocation=falloc" to the Gluster block
> > driver. This change bases its logic on that of block/file-posix.c and
> > removes the gluster_supports_zerofill() function in favour of an #ifdef
> > check in an easy-to-read switch statement.
> > 
> > Reported-by: Satheesaran Sundaramoorthi <sasun...@redhat.com>
> > URL: https://bugzilla.redhat.com/1450759
> > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > ---
> >  block/gluster.c | 61 
> > +++--
> >  1 file changed, 38 insertions(+), 23 deletions(-)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 7c76cd0..566166f 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -965,11 +965,6 @@ static coroutine_fn int 
> > qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
> >  return acb.ret;
> >  }
> >  
> > -static inline bool gluster_supports_zerofill(void)
> > -{
> > -return 1;
> > -}
> > -
> >  static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
> >  int64_t size)
> >  {
> > @@ -977,11 +972,6 @@ static inline int qemu_gluster_zerofill(struct glfs_fd 
> > *fd, int64_t offset,
> >  }
> >  
> >  #else
> > -static inline bool gluster_supports_zerofill(void)
> > -{
> > -return 0;
> > -}
> > -
> >  static inline int qemu_gluster_zerofill(struct glfs_fd *fd, int64_t offset,
> >  int64_t size)
> >  {
> > @@ -996,9 +986,10 @@ static int qemu_gluster_create(const char *filename,
> >  struct glfs *glfs;
> >  struct glfs_fd *fd;
> >  int ret = 0;
> > -int prealloc = 0;
> > +PreallocMode prealloc;
> >  int64_t total_size = 0;
> >  char *tmp = NULL;
> > +Error *local_err = NULL;
> >  
> >  gconf = g_new0(BlockdevOptionsGluster, 1);
> >  gconf->debug = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
> > @@ -1026,13 +1017,12 @@ static int qemu_gluster_create(const char *filename,
> >BDRV_SECTOR_SIZE);
> >  
> >  tmp = qemu_opt_get_del(opts, BLOCK_OPT_PREALLOC);
> > -if (!tmp || !strcmp(tmp, "off")) {
> > -prealloc = 0;
> > -} else if (!strcmp(tmp, "full") && gluster_supports_zerofill()) {
> > -prealloc = 1;
> > -} else {
> > -error_setg(errp, "Invalid preallocation mode: '%s'"
> > - " or GlusterFS doesn't support zerofill API", 
> > tmp);
> > +prealloc = qapi_enum_parse(PreallocMode_lookup, tmp,
> > +   PREALLOC_MODE__MAX, PREALLOC_MODE_OFF,
> > +   &local_err);
> > +g_free(tmp);
> > +if (local_err) {
> > +error_propagate(errp, local_err);
> >  ret = -EINVAL;
> >  goto out;
> >  }
> > @@ -1041,21 +1031,46 @@ static int qemu_gluster_create(const char *filename,
> >  O_WRONLY | O_CREAT | O_TRUNC | O_BINARY, S_IRUSR | 
> > S_IWUSR);
> >  if (!fd) {
> >  ret = -errno;
> > -} else {
> > +goto out;
> > +}
> > +
> > +switch (prealloc) {
> > +case PREALLOC_MODE_FALLOC:
> > +if (glfs_fallocate(fd, 0, 0, total_size)) {
> > +error_setg(errp, "Could not preallocate data for the new 
> > file");
> > +ret = -errno;
> > +}
> 
> Both glfs_fallocate() and glfs_zerofill() are from the same release version,
> right?  I think we need a CONFIG_GLUSTERFS_FALLOCATE around the
> PREALLOC_MODE_FALLOC case, just like CONFIG_GLUSTERFS_ZEROFILL.

Yes, both were added at the same GlusterFS release. I'll add the check
in ./configure and #ifdef it.

> > +break;
> > +#ifdef CONFIG_GLUSTERFS_ZEROFILL
> > +case PREALLOC_MODE_FULL:
> >  if (!glfs_ftruncate(fd, total_size)) {
> > -if (prealloc && qemu_gluster_zerofill(fd, 0, total_size)) {
> > +if (qemu_gluster_zerofill(fd, 0, total_size)) {
> 
> If using the CONFIG_GLUSTERFS_ZEROFILL here, then please get rid of the
> qemu_gluster_zerofill() wrapper, which is in place for the same reason as
> this ifdef.  We can just call glfs_zerofill() here directly.

Rig

Re: [Qemu-block] [PATCH 14/31] gluster: Switch to .bdrv_co_block_status()

2017-04-19 Thread Niels de Vos
On Mon, Apr 17, 2017 at 08:33:39PM -0500, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  Update the gluster driver accordingly.
> 
> Signed-off-by: Eric Blake <ebl...@redhat.com>
> ---
>  block/gluster.c | 47 +++
>  1 file changed, 23 insertions(+), 24 deletions(-)

Looks good to me, thanks.

Reviewed-by: Niels de Vos <nde...@redhat.com>


> diff --git a/block/gluster.c b/block/gluster.c
> index 1d4e2f7..3f252c6 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -1332,24 +1332,24 @@ exit:
>  /*
>   * Returns the allocation status of the specified sectors.
>   *
> - * If 'sector_num' is beyond the end of the disk image the return value is 0
> + * If 'offset' is beyond the end of the disk image the return value is 0
>   * and 'pnum' is set to 0.
>   *
> - * 'pnum' is set to the number of sectors (including and immediately 
> following
> - * the specified sector) that are known to be in the same
> + * 'pnum' is set to the number of bytes (including and immediately following
> + * the specified offset) that are known to be in the same
>   * allocated/unallocated state.
>   *
> - * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
> + * 'bytes' is the max value 'pnum' should be set to.  If bytes goes
>   * beyond the end of the disk image it will be clamped.
>   *
> - * (Based on raw_co_get_block_status() from file-posix.c.)
> + * (Based on raw_co_block_status() from file-posix.c.)
>   */
> -static int64_t coroutine_fn qemu_gluster_co_get_block_status(
> -BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
> +static int64_t coroutine_fn qemu_gluster_co_block_status(
> +BlockDriverState *bs, int64_t offset, int64_t bytes, int64_t *pnum,
>  BlockDriverState **file)
>  {
>  BDRVGlusterState *s = bs->opaque;
> -off_t start, data = 0, hole = 0;
> +off_t data = 0, hole = 0;
>  int64_t total_size;
>  int ret = -EINVAL;
> 
> @@ -1357,41 +1357,40 @@ static int64_t coroutine_fn 
> qemu_gluster_co_get_block_status(
>  return ret;
>  }
> 
> -start = sector_num * BDRV_SECTOR_SIZE;
>  total_size = bdrv_getlength(bs);
>  if (total_size < 0) {
>  return total_size;
> -} else if (start >= total_size) {
> +} else if (offset >= total_size) {
>  *pnum = 0;
>  return 0;
> -} else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
> -nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
> +} else if (offset + bytes > total_size) {
> +bytes = total_size - offset;
>  }
> 
> -ret = find_allocation(bs, start, &data, &hole);
> +ret = find_allocation(bs, offset, &data, &hole);
>  if (ret == -ENXIO) {
>  /* Trailing hole */
> -*pnum = nb_sectors;
> +*pnum = bytes;
>  ret = BDRV_BLOCK_ZERO;
>  } else if (ret < 0) {
>  /* No info available, so pretend there are no holes */
> -*pnum = nb_sectors;
> +*pnum = bytes;
>  ret = BDRV_BLOCK_DATA;
> -} else if (data == start) {
> +} else if (data == offset) {
>  /* On a data extent, compute sectors to the end of the extent,
>   * possibly including a partial sector at EOF. */
> -*pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, 
> BDRV_SECTOR_SIZE));
> +*pnum = MIN(bytes, hole - offset);
>  ret = BDRV_BLOCK_DATA;
>  } else {
>  /* On a hole, compute sectors to the beginning of the next extent.  
> */
> -assert(hole == start);
> -*pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
> +assert(hole == offset);
> +*pnum = MIN(bytes, data - offset);
>  ret = BDRV_BLOCK_ZERO;
>  }
> 
>  *file = bs;
> 
> -return ret | BDRV_BLOCK_OFFSET_VALID | start;
> +return ret | BDRV_BLOCK_OFFSET_VALID | (offset & BDRV_BLOCK_OFFSET_MASK);
>  }
> 
> 
> @@ -1419,7 +1418,7 @@ static BlockDriver bdrv_gluster = {
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>  .bdrv_co_pwrite_zeroes= qemu_gluster_co_pwrite_zeroes,
>  #endif
> -.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
> +.bdrv_co_block_status = qemu_gluster_co_block_status,
>  .create_opts  = &qemu_gluster_create_opts,
>  };
> 
> @@ -1447,7 +1446,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  #ifdef CONFIG_GLUSTERFS_ZEROFILL
>  .bdrv_co_pwrite_zeroes= qemu_gluster_co_pwrite_zeroes,
>  #endif
> -.bdrv_co_get_block_status = qemu

Re: [Qemu-block] [Qemu-devel] [PATCH 12/15] gluster: Plug memory leaks in qemu_gluster_parse_json()

2017-03-03 Thread Niels de Vos
On Fri, Mar 03, 2017 at 09:35:16AM +0100, Markus Armbruster wrote:
> Niels de Vos <nde...@redhat.com> writes:
> 
> > On Fri, Mar 03, 2017 at 08:38:45AM +0100, Markus Armbruster wrote:
> >> Niels de Vos <nde...@redhat.com> writes:
> >> 
> >> > On Thu, Mar 02, 2017 at 10:44:03PM +0100, Markus Armbruster wrote:
> >> >> To reproduce, run
> >> >> 
> >> >> $ valgrind qemu-system-x86_64 --nodefaults -S --drive 
> >> >> driver=gluster,volume=testvol,path=/a/b/c,server.0.type=xxx
> >> >> 
> >> >> Signed-off-by: Markus Armbruster <arm...@redhat.com>
> >> >> ---
> >> >>  block/gluster.c | 28 ++--
> >> >>  1 file changed, 14 insertions(+), 14 deletions(-)
> >> >> 
> >> >> diff --git a/block/gluster.c b/block/gluster.c
> >> >> index 6fbcf9e..35a7abb 100644
> >> >> --- a/block/gluster.c
> >> >> +++ b/block/gluster.c
> >> >> @@ -480,7 +480,7 @@ static int 
> >> >> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
> >> >>QDict *options, Error **errp)
> >> >>  {
> >> >>  QemuOpts *opts;
> >> >> -GlusterServer *gsconf;
> >> >> +GlusterServer *gsconf = NULL;
> >> >>  GlusterServerList *curr = NULL;
> >> >>  QDict *backing_options = NULL;
> >> >>  Error *local_err = NULL;
> >> >> @@ -529,17 +529,16 @@ static int 
> >> >> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
> >> >>  }
> >> >>  
> >> >>  ptr = qemu_opt_get(opts, GLUSTER_OPT_TYPE);
> >> >> +if (!ptr) {
> >> >> +error_setg(&local_err, QERR_MISSING_PARAMETER, 
> >> >> GLUSTER_OPT_TYPE);
> >> >> +error_append_hint(&local_err, GERR_INDEX_HINT, i);
> >> >> +goto out;
> >> >> +
> >> >> +}
> >> >>  gsconf = g_new0(GlusterServer, 1);
> >> >>  gsconf->type = qapi_enum_parse(GlusterTransport_lookup, ptr,
> >> >> -   GLUSTER_TRANSPORT__MAX,
> >> >> -   GLUSTER_TRANSPORT__MAX,
> >> >> +   GLUSTER_TRANSPORT__MAX, 0,
> >> >
> >> > What is the reason to set the default to 0 and not the more readable
> >> > GLUSTER_TRANSPORT_UNIX?
> >> 
> >> I chose 0 because the actual value must not matter: we don't want a
> >> default here.
> >> 
> >> qapi_enum_parse() returns this argument when ptr isn't found.  If ptr is
> >> non-null, it additionally sets an error.  Since ptr can't be null here,
> >> the argument is only returned together with an error.  But then we won't
> >> use *gsconf.
> >
> > Ah, right, I see that now.
> >
> >> Do you think -1 instead of 0 would be clearer?
> >
> > Yes, it would be to me. -1 is clearly not part of the GLUSTER_TRANSPORT_*
> > enum, so it suggests it is a different case.
> >
> > Thanks!
> 
> I'll change it.
> 
> May I add your R-by then?

Yes, of course.

Thanks,
Niels



Re: [Qemu-block] [Qemu-devel] [PATCH 12/15] gluster: Plug memory leaks in qemu_gluster_parse_json()

2017-03-03 Thread Niels de Vos
On Fri, Mar 03, 2017 at 08:38:45AM +0100, Markus Armbruster wrote:
> Niels de Vos <nde...@redhat.com> writes:
> 
> > On Thu, Mar 02, 2017 at 10:44:03PM +0100, Markus Armbruster wrote:
> >> To reproduce, run
> >> 
> >> $ valgrind qemu-system-x86_64 --nodefaults -S --drive 
> >> driver=gluster,volume=testvol,path=/a/b/c,server.0.type=xxx
> >> 
> >> Signed-off-by: Markus Armbruster <arm...@redhat.com>
> >> ---
> >>  block/gluster.c | 28 ++--
> >>  1 file changed, 14 insertions(+), 14 deletions(-)
> >> 
> >> diff --git a/block/gluster.c b/block/gluster.c
> >> index 6fbcf9e..35a7abb 100644
> >> --- a/block/gluster.c
> >> +++ b/block/gluster.c
> >> @@ -480,7 +480,7 @@ static int 
> >> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
> >>QDict *options, Error **errp)
> >>  {
> >>  QemuOpts *opts;
> >> -GlusterServer *gsconf;
> >> +GlusterServer *gsconf = NULL;
> >>  GlusterServerList *curr = NULL;
> >>  QDict *backing_options = NULL;
> >>  Error *local_err = NULL;
> >> @@ -529,17 +529,16 @@ static int 
> >> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
> >>  }
> >>  
> >>  ptr = qemu_opt_get(opts, GLUSTER_OPT_TYPE);
> >> +if (!ptr) {
> >> +error_setg(&local_err, QERR_MISSING_PARAMETER, 
> >> GLUSTER_OPT_TYPE);
> >> +error_append_hint(&local_err, GERR_INDEX_HINT, i);
> >> +goto out;
> >> +
> >> +}
> >>  gsconf = g_new0(GlusterServer, 1);
> >>  gsconf->type = qapi_enum_parse(GlusterTransport_lookup, ptr,
> >> -   GLUSTER_TRANSPORT__MAX,
> >> -   GLUSTER_TRANSPORT__MAX,
> >> +   GLUSTER_TRANSPORT__MAX, 0,
> >
> > What is the reason to set the default to 0 and not the more readable
> > GLUSTER_TRANSPORT_UNIX?
> 
> I chose 0 because the actual value must not matter: we don't want a
> default here.
> 
> qapi_enum_parse() returns this argument when ptr isn't found.  If ptr is
> non-null, it additionally sets an error.  Since ptr can't be null here,
> the argument is only returned together with an error.  But then we won't
> use *gsconf.

Ah, right, I see that now.

> Do you think -1 instead of 0 would be clearer?

Yes, it would be to me. -1 is clearly not part of the GLUSTER_TRANSPORT_*
enum, so it suggests it is a different case.

Thanks!

> 
> >> +   &local_err);
> >> -if (!ptr) {
> >> -error_setg(&local_err, QERR_MISSING_PARAMETER, 
> >> GLUSTER_OPT_TYPE);
> >> -error_append_hint(&local_err, GERR_INDEX_HINT, i);
> >> -goto out;
> >> -
> >> -}
> >>  if (local_err) {
> >>  error_append_hint(&local_err,
> >>"Parameter '%s' may be 'tcp' or 'unix'\n",
> >> @@ -626,8 +625,10 @@ static int 
> >> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
> >>  curr->next->value = gsconf;
> >>  curr = curr->next;
> >>  }
> >> +gsconf = NULL;
> >>  
> >> -qdict_del(backing_options, str);
> >> +QDECREF(backing_options);
> >> +backing_options = NULL;
> >>  g_free(str);
> >>  str = NULL;
> >>  }
> >> @@ -636,11 +637,10 @@ static int 
> >> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
> >>  
> >>  out:
> >>  error_propagate(errp, local_err);
> >> +qapi_free_GlusterServer(gsconf);
> >>  qemu_opts_del(opts);
> >> -if (str) {
> >> -qdict_del(backing_options, str);
> >> -g_free(str);
> >> -}
> >> +g_free(str);
> >> +QDECREF(backing_options);
> >>  errno = EINVAL;
> >>  return -errno;
> >>  }
> >> -- 
> >> 2.7.4
> >> 
> >> 



Re: [Qemu-block] [PATCH 12/15] gluster: Plug memory leaks in qemu_gluster_parse_json()

2017-03-02 Thread Niels de Vos
On Thu, Mar 02, 2017 at 10:44:03PM +0100, Markus Armbruster wrote:
> To reproduce, run
> 
> $ valgrind qemu-system-x86_64 --nodefaults -S --drive 
> driver=gluster,volume=testvol,path=/a/b/c,server.0.type=xxx
> 
> Signed-off-by: Markus Armbruster 
> ---
>  block/gluster.c | 28 ++--
>  1 file changed, 14 insertions(+), 14 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 6fbcf9e..35a7abb 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -480,7 +480,7 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster 
> *gconf,
>QDict *options, Error **errp)
>  {
>  QemuOpts *opts;
> -GlusterServer *gsconf;
> +GlusterServer *gsconf = NULL;
>  GlusterServerList *curr = NULL;
>  QDict *backing_options = NULL;
>  Error *local_err = NULL;
> @@ -529,17 +529,16 @@ static int 
> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
>  }
>  
>  ptr = qemu_opt_get(opts, GLUSTER_OPT_TYPE);
> +if (!ptr) {
> +error_setg(&local_err, QERR_MISSING_PARAMETER, GLUSTER_OPT_TYPE);
> +error_append_hint(&local_err, GERR_INDEX_HINT, i);
> +goto out;
> +
> +}
>  gsconf = g_new0(GlusterServer, 1);
>  gsconf->type = qapi_enum_parse(GlusterTransport_lookup, ptr,
> -   GLUSTER_TRANSPORT__MAX,
> -   GLUSTER_TRANSPORT__MAX,
> +   GLUSTER_TRANSPORT__MAX, 0,

What is the reason to set the default to 0 and not the more readable
GLUSTER_TRANSPORT_UNIX?

> &local_err);
> -if (!ptr) {
> -error_setg(&local_err, QERR_MISSING_PARAMETER, GLUSTER_OPT_TYPE);
> -error_append_hint(&local_err, GERR_INDEX_HINT, i);
> -goto out;
> -
> -}
>  if (local_err) {
>  error_append_hint(&local_err,
>"Parameter '%s' may be 'tcp' or 'unix'\n",
> @@ -626,8 +625,10 @@ static int 
> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
>  curr->next->value = gsconf;
>  curr = curr->next;
>  }
> +gsconf = NULL;
>  
> -qdict_del(backing_options, str);
> +QDECREF(backing_options);
> +backing_options = NULL;
>  g_free(str);
>  str = NULL;
>  }
> @@ -636,11 +637,10 @@ static int 
> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
>  
>  out:
>  error_propagate(errp, local_err);
> +qapi_free_GlusterServer(gsconf);
>  qemu_opts_del(opts);
> -if (str) {
> -qdict_del(backing_options, str);
> -g_free(str);
> -}
> +g_free(str);
> +QDECREF(backing_options);
>  errno = EINVAL;
>  return -errno;
>  }
> -- 
> 2.7.4
> 
> 



Re: [Qemu-block] [PATCH 10/15] gluster: Drop assumptions on SocketTransport names

2017-03-02 Thread Niels de Vos
On Thu, Mar 02, 2017 at 10:44:01PM +0100, Markus Armbruster wrote:
> qemu_gluster_glfs_init() passes the names of QAPI enumeration type
> SocketTransport to glfs_set_volfile_server().  Works, because they
> were chosen to match.  But the coupling is artificial.  Use the
> appropriate literal strings instead.
> 
> Signed-off-by: Markus Armbruster <arm...@redhat.com>
> ---
>  block/gluster.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 56b4abe..7236d59 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -412,8 +412,7 @@ static struct glfs 
> *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
>  
>  for (server = gconf->server; server; server = server->next) {
>  if (server->value->type  == GLUSTER_TRANSPORT_UNIX) {
> -ret = glfs_set_volfile_server(glfs,
> -   
> GlusterTransport_lookup[server->value->type],
> +ret = glfs_set_volfile_server(glfs, "unix",
> server->value->u.q_unix.path, 0);
>  } else {
> if (parse_uint_full(server->value->u.tcp.port, &port, 10) < 0 ||
> @@ -423,8 +422,7 @@ static struct glfs 
> *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
>  errno = EINVAL;
>  goto out;
>  }
> -ret = glfs_set_volfile_server(glfs,
> -   
> GlusterTransport_lookup[server->value->type],
> +ret = glfs_set_volfile_server(glfs, "tcp",
> server->value->u.tcp.host,
> (int)port);
>  }
> -- 
> 2.7.4

Instead of the strings for "unix" and "tcp", I would have liked
#define's. Unfortunately it seems that these are not available in public
headers :-/

If this is easier to understand, I don't have any objections.

Reviewed-by: Niels de Vos <nde...@redhat.com>



Re: [Qemu-block] [PATCH 11/15] gluster: Don't duplicate qapi-util.c's qapi_enum_parse()

2017-03-02 Thread Niels de Vos
On Thu, Mar 02, 2017 at 10:44:02PM +0100, Markus Armbruster wrote:
> Signed-off-by: Markus Armbruster <arm...@redhat.com>
> ---
>  block/gluster.c | 30 +-
>  1 file changed, 9 insertions(+), 21 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 7236d59..6fbcf9e 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -12,6 +12,7 @@
>  #include "block/block_int.h"
>  #include "qapi/error.h"
>  #include "qapi/qmp/qerror.h"
> +#include "qapi/util.h"
>  #include "qemu/uri.h"
>  #include "qemu/error-report.h"
>  #include "qemu/cutils.h"
> @@ -472,23 +473,6 @@ out:
>  return NULL;
>  }
>  
> -static int qapi_enum_parse(const char *opt)
> -{
> -int i;
> -
> -if (!opt) {
> -return GLUSTER_TRANSPORT__MAX;
> -}
> -
> -for (i = 0; i < GLUSTER_TRANSPORT__MAX; i++) {
> -if (!strcmp(opt, GlusterTransport_lookup[i])) {
> -return i;
> -}
> -}
> -
> -return i;
> -}
> -
>  /*
>   * Convert the json formatted command line into qapi.
>  */
> @@ -546,16 +530,20 @@ static int 
> qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
>  
>  ptr = qemu_opt_get(opts, GLUSTER_OPT_TYPE);
>  gsconf = g_new0(GlusterServer, 1);
> -gsconf->type = qapi_enum_parse(ptr);
> +gsconf->type = qapi_enum_parse(GlusterTransport_lookup, ptr,
> +   GLUSTER_TRANSPORT__MAX,
> +   GLUSTER_TRANSPORT__MAX,
> > +   &local_err);
>  if (!ptr) {
> >  error_setg(&local_err, QERR_MISSING_PARAMETER, GLUSTER_OPT_TYPE);
> >  error_append_hint(&local_err, GERR_INDEX_HINT, i);
>  goto out;
>  
>  }
> -if (gsconf->type == GLUSTER_TRANSPORT__MAX) {
> > -error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE,
> > -   GLUSTER_OPT_TYPE, "tcp or unix");
> > +if (local_err) {
> > +error_append_hint(&local_err,
> > +  "Parameter '%s' may be 'tcp' or 'unix'\n",
> > +  GLUSTER_OPT_TYPE);
> > +error_append_hint(&local_err, GERR_INDEX_HINT, i);
>  goto out;
>  }
> -- 
> 2.7.4
> 
> 

Looks good to me.

Reviewed-by: Niels de Vos <nde...@redhat.com>



Re: [Qemu-block] semantics of FIEMAP without FIEMAP_FLAG_SYNC (was Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server)

2016-07-20 Thread Niels de Vos
On Wed, Jul 20, 2016 at 10:30:25PM +1000, Dave Chinner wrote:
> On Wed, Jul 20, 2016 at 05:19:37AM -0400, Paolo Bonzini wrote:
> > Adding ext4 and XFS guys (Lukas and Dave respectively).  As a quick recap, 
> > the
> > issue here is the semantics of FIEMAP and SEEK_HOLE/SEEK_DATA, which we use 
> > in
> > "qemu-img map".  This command prints metadata about a virtual disk 
> > image---which
> > in the case of a raw image amounts to detecting holes and unwritten extents.
> > 
> > First, it seems like SEEK_HOLE and SEEK_DATA really should be 
> > "SEEK_NONZERO" and
> > "SEEK_ZERO", on both ext4 and XFS.You can see that unwritten extents are
> > reported by "qemu-img map" as holes:
> 
> Correctly so. seek hole/data knows nothing about the underlying
> filesystem storage implementation. A "hole" is defined as a region
> of zeros, and the filesystem is free to call anything it knows for
> certain contains zeros as a hole.
> 
> FYI, SEEK_HOLE/DATA were named after the pre-existing Solaris
> seek API that uses these definitions for hole and data
> 
> > $ dd if=/dev/urandom of=test.img bs=1M count=100
> > $ fallocate -z -o 10M -l 10M test.img
> > $ du -h test.img
> > $ qemu-img map --output=json test.img
> > [{ "start": 0, "length": 10485760, "depth": 0, "zero": false, "data": 
> > true, "offset": 0},
> > { "start": 10485760, "length": 10485760, "depth": 0, "zero": true, 
> > "data": false, "offset": 10485760},
> > { "start": 20971520, "length": 83886080, "depth": 0, "zero": false, 
> > "data": true, "offset": 20971520}]
> 
> > 
> > On the second line, zero=true data=false identifies a hole.  The right 
> > output
> > would either have zero=true data=true (unwritten extent) or just
> > 
> > [{ "start": 0, "length": 104857600, "depth": 0, "zero": false, "data": 
> > true, "offset": 0},
> >
> > since the zero flag is advisory (it doesn't try to detect zeroes beyond 
> > what the
> > filesystem says).
> 
> Not from SEEK_HOLE/SEEK_DATA. All it conveys is "this is the start
> of a range of zeros" and "this is the start of a range of data". And
> for filesystems that don't specifically implement these seek
> operations, zeros are considered data. i.e. SEEK_HOLE will take you
> to the end of file, SEEK_DATA returns the current position
> 
> i.e. unwritten extents contain no data, so they are semantically
> identical to holes for the purposes of seeking and hence SEEK_DATA
> can skip over them.
> 
> > The reason why we disabled FIEMAP was a combination of a corruption and 
> > performance
> > issue.  The data corruption bug was at 
> > https://bugs.launchpad.net/qemu/+bug/1368815
> > and it was reported on Ubuntu Trusty (kernel 3.13 based on the release 
> > notes at
> > https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes).  We corrected that by 
> > using
> > FIEMAP_FLAG_SYNC, based on a similar patch to coreutils.  This turned out 
> > to be too
> > slow, so we dropped FIEMAP altogether.
> 
> Yes, because FIEMAP output is only useful for diagnostic purposes as
> it can be stale even before the syscall returns to userspace. i.e.
> it can't be used in userspace for optimising file copies

Oh... And I was surprised to learn that "cp" does use FIEMAP and not
SEEK_HOLE/SEEK_DATA. You make a strong case for fixing that.
  http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/extent-scan.c#n87

> Finding regions of valid data in a file is what SEEK_HOLE/SEEK_DATA
> is for, not FIEMAP. FIEMAP only reports the physical layout of the
> file, not where the currently valid data in the file lies.
> 
> > However, today I found kernel commit 91dd8c114499 ("ext4: prevent race 
> > while walking
> > extent tree for fiemap", 2012-11-28) whose commit message says:
> > 
> > Moreover the extent currently in delayed allocation might be allocated
> > after we search the extent tree and before we search extent status tree
> > delayed buffers resulting in those delayed buffers being completely
> > missed, even though completely written and allocated.
> > 
> > This seems pretty much like our data corruption bug; it would mean that
> > using FIEMAP_FLAG_SYNC was only working around a bug and delayed allocations
> > _should_ be reported as usual by FIEMAP.
> 
> What do you think you can do with the delayed allocation regions in
> the file?
> 
> Indeed, with speculative delayed allocation, XFS can report delalloc
> regions that have no actual resemblence to the layout of dirty data
> in the file. SEEK_DATA will iterate the delalloc regions containing
> data precisely, however.
> 
> > Except that the commit went in kernel 3.8 and as said above Trusty had 3.13.
> > So either there are other bugs, or my understanding of the commit is not 
> > correct.
> > So the questions for Lukas and Dave are:
> > 
> > 1) is it expected that SEEK_HOLE skips unwritten extents?
> 
> There are multiple answers to this, all of which are correct depending
> on current context and state:
> 
> 1. No - some 

Re: [Qemu-block] [PATCH for-2.7 v2 06/17] gluster: Implement .bdrv_lockf

2016-04-15 Thread Niels de Vos
On Fri, Apr 15, 2016 at 11:27:56AM +0800, Fam Zheng wrote:
> Signed-off-by: Fam Zheng <f...@redhat.com>
> ---
>  block/gluster.c | 30 ++
>  1 file changed, 30 insertions(+)

The Gluster changes look good to me.

Reviewed-by: Niels de Vos <nde...@redhat.com>

> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 51e154c..c23e944 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -672,6 +672,32 @@ static void qemu_gluster_close(BlockDriverState *bs)
>  glfs_fini(s->glfs);
>  }
>  
> +static int qemu_gluster_lockf(BlockDriverState *bs, BdrvLockfCmd cmd)
> +{
> +BDRVGlusterState *s = bs->opaque;
> +int ret;
> +struct flock fl = (struct flock) {
> +.l_start = 0,
> +.l_whence  = SEEK_SET,
> +.l_len = 0,
> +};
> +switch (cmd) {
> +case BDRV_LOCKF_RWLOCK:
> +fl.l_type = F_WRLCK;
> +break;
> +case BDRV_LOCKF_ROLOCK:
> +fl.l_type = F_RDLCK;
> +break;
> +case BDRV_LOCKF_UNLOCK:
> +fl.l_type = F_UNLCK;
> +break;
> +default:
> +abort();
> +}
> +ret = glfs_posix_lock(s->fd, F_SETLK, &fl);
> +return ret == -1 ? -errno : 0;
> +}
> +
>  static int qemu_gluster_has_zero_init(BlockDriverState *bs)
>  {
>  /* GlusterFS volume could be backed by a block device */
> @@ -713,6 +739,7 @@ static BlockDriver bdrv_gluster = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_lockf   = qemu_gluster_lockf,
>  .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_discard  = qemu_gluster_co_discard,
> @@ -740,6 +767,7 @@ static BlockDriver bdrv_gluster_tcp = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_lockf   = qemu_gluster_lockf,
>  .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_discard  = qemu_gluster_co_discard,
> @@ -767,6 +795,7 @@ static BlockDriver bdrv_gluster_unix = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_lockf   = qemu_gluster_lockf,
>  .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_discard  = qemu_gluster_co_discard,
> @@ -794,6 +823,7 @@ static BlockDriver bdrv_gluster_rdma = {
>  .bdrv_co_readv= qemu_gluster_co_readv,
>  .bdrv_co_writev   = qemu_gluster_co_writev,
>  .bdrv_co_flush_to_disk= qemu_gluster_co_flush_to_disk,
> +.bdrv_lockf   = qemu_gluster_lockf,
>  .bdrv_has_zero_init   = qemu_gluster_has_zero_init,
>  #ifdef CONFIG_GLUSTERFS_DISCARD
>  .bdrv_co_discard  = qemu_gluster_co_discard,
> -- 
> 2.8.0
> 
> 



Re: [Qemu-block] [PATCH for-2.7 1/1] block/gluster: add support for selecting debug logging level

2016-04-08 Thread Niels de Vos
On Thu, Apr 07, 2016 at 05:24:19PM -0400, Jeff Cody wrote:
> This adds commandline support for the logging level of the
> gluster protocol driver, output to stdout.  The option is 'debug',
> e.g.:
> 
> -drive filename=gluster://192.168.15.180/gv2/test.qcow2,debug=9
> 
> Debug levels are 0-9, with 9 being the most verbose, and 0 representing
> no debugging output.  The default is the same as it was before, which
> is a level of 4.  The current logging levels defined in the gluster
> source are:
> 
> 0 - None
> 1 - Emergency
> 2 - Alert
> 3 - Critical
> 4 - Error
> 5 - Warning
> 6 - Notice
> 7 - Info
> 8 - Debug
> 9 - Trace
> 
> (From: glusterfs/logging.h)
> 
> Signed-off-by: Jeff Cody <jc...@redhat.com>

Thanks, this is something I wanted to get done for a long time already.

Reviewed-by: Niels de Vos <nde...@redhat.com>

> ---
>  block/gluster.c | 48 +---
>  1 file changed, 41 insertions(+), 7 deletions(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 51e154c..c6d4892 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -24,6 +24,7 @@ typedef struct GlusterAIOCB {
>  typedef struct BDRVGlusterState {
>  struct glfs *glfs;
>  struct glfs_fd *fd;
> +int debug_level;
>  } BDRVGlusterState;
>  
>  typedef struct GlusterConf {
> @@ -32,6 +33,7 @@ typedef struct GlusterConf {
>  char *volname;
>  char *image;
>  char *transport;
> +int debug_level;
>  } GlusterConf;
>  
>  static void qemu_gluster_gconf_free(GlusterConf *gconf)
> @@ -194,11 +196,7 @@ static struct glfs *qemu_gluster_init(GlusterConf 
> *gconf, const char *filename,
>  goto out;
>  }
>  
> -/*
> - * TODO: Use GF_LOG_ERROR instead of hard code value of 4 here when
> - * GlusterFS makes GF_LOG_* macros available to libgfapi users.
> - */
> -ret = glfs_set_logging(glfs, "-", 4);
> +ret = glfs_set_logging(glfs, "-", gconf->debug_level);
>  if (ret < 0) {
>  goto out;
>  }
> @@ -256,16 +254,26 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, 
> ssize_t ret, void *arg)
>  qemu_bh_schedule(acb->bh);
>  }
>  
> +#define GLUSTER_OPT_FILENAME "filename"
> +#define GLUSTER_OPT_DEBUG "debug"
> +#define GLUSTER_DEBUG_DEFAULT 4
> +#define GLUSTER_DEBUG_MAX 9
> +
>  /* TODO Convert to fine grained options */
>  static QemuOptsList runtime_opts = {
>  .name = "gluster",
>  .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
>  .desc = {
>  {
> -.name = "filename",
> +.name = GLUSTER_OPT_FILENAME,
>  .type = QEMU_OPT_STRING,
>  .help = "URL to the gluster image",
>  },
> +{
> +.name = GLUSTER_OPT_DEBUG,
> +.type = QEMU_OPT_NUMBER,
> +.help = "Gluster log level, valid range is 0-9",
> +},
>  { /* end of list */ }
>  },
>  };
> @@ -306,8 +314,17 @@ static int qemu_gluster_open(BlockDriverState *bs,  
> QDict *options,
>  goto out;
>  }
>  
> -filename = qemu_opt_get(opts, "filename");
> +filename = qemu_opt_get(opts, GLUSTER_OPT_FILENAME);
>  
> +s->debug_level = qemu_opt_get_number(opts, GLUSTER_OPT_DEBUG,
> + GLUSTER_DEBUG_DEFAULT);
> +if (s->debug_level < 0) {
> +s->debug_level = 0;
> +} else if (s->debug_level > GLUSTER_DEBUG_MAX) {
> +s->debug_level = GLUSTER_DEBUG_MAX;
> +}
> +
> +gconf->debug_level = s->debug_level;
>  s->glfs = qemu_gluster_init(gconf, filename, errp);
>  if (!s->glfs) {
>  ret = -errno;
> @@ -346,6 +363,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
> *state,
> BlockReopenQueue *queue, Error **errp)
>  {
>  int ret = 0;
> +BDRVGlusterState *s;
>  BDRVGlusterReopenState *reop_s;
>  GlusterConf *gconf = NULL;
>  int open_flags = 0;
> @@ -353,6 +371,8 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
> *state,
>  assert(state != NULL);
>  assert(state->bs != NULL);
>  
> +s = state->bs->opaque;
> +
>  state->opaque = g_new0(BDRVGlusterReopenState, 1);
>  reop_s = state->opaque;
>  
> @@ -360,6 +380,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState 
> *state,
>  
>  gconf = g_new0(GlusterConf, 1);
>  
> +gconf->debug_level = s->debug_level;

Re: [Qemu-block] [PATCH for-2.6 1/2] block/gluster: return correct error value

2016-04-06 Thread Niels de Vos
On Tue, Apr 05, 2016 at 11:29:51PM -0400, Jeff Cody wrote:
> Upon error, gluster will call the aio callback function with a
> ret value of -1, with errno set to the proper error value.  If
> we set the acb->ret value to the return value in the callback,
> that results in every error being EPERM (i.e. 1).  Instead, set
> it to the proper error result.
> 
> Signed-off-by: Jeff Cody <jc...@redhat.com>

Looks good to me.

Reviewed-by: Niels de Vos <nde...@redhat.com>

> ---
>  block/gluster.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block/gluster.c b/block/gluster.c
> index 7bface2..30a827e 100644
> --- a/block/gluster.c
> +++ b/block/gluster.c
> @@ -245,7 +245,7 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, 
> ssize_t ret, void *arg)
>  if (!ret || ret == acb->size) {
>  acb->ret = 0; /* Success */
>  } else if (ret < 0) {
> -acb->ret = ret; /* Read/Write failed */
> +acb->ret = -errno; /* Read/Write failed */
>  } else {
>  acb->ret = -EIO; /* Partial read/write - fail it */
>  }
> -- 
> 1.9.3
> 



Re: [Qemu-block] [PATCH for-2.6 2/2] block/gluster: prevent data loss after i/o error

2016-04-06 Thread Niels de Vos
On Wed, Apr 06, 2016 at 08:44:18AM -0400, Jeff Cody wrote:
> On Wed, Apr 06, 2016 at 01:02:16PM +0200, Kevin Wolf wrote:
> > [ Adding some CCs ]
> > 
> > Am 06.04.2016 um 05:29 hat Jeff Cody geschrieben:
> > > Upon receiving an I/O error after an fsync, by default gluster will
> > > dump its cache.  However, QEMU will retry the fsync, which is especially
> > > useful when encountering errors such as ENOSPC when using the werror=stop
> > > option.  When using caching with gluster, however, the last written data
> > > will be lost upon encountering ENOSPC.  Using the cache xlator option of

Use "write-behind xlator" instead of "cache xlator". There are different
caches in Gluster.

> > > 'resync-failed-syncs-after-fsync' should cause gluster to retain the
> > > cached data after a failed fsync, so that ENOSPC and other transient
> > > errors are recoverable.
> > > 
> > > Signed-off-by: Jeff Cody 
> > > ---
> > >  block/gluster.c | 27 +++
> > >  configure   |  8 
> > >  2 files changed, 35 insertions(+)
> > > 
> > > diff --git a/block/gluster.c b/block/gluster.c
> > > index 30a827e..b1cf71b 100644
> > > --- a/block/gluster.c
> > > +++ b/block/gluster.c
> > > @@ -330,6 +330,23 @@ static int qemu_gluster_open(BlockDriverState *bs,  
> > > QDict *options,
> > >  goto out;
> > >  }
> > >  
> > > +#ifdef CONFIG_GLUSTERFS_XLATOR_OPT
> > > +/* Without this, if fsync fails for a recoverable reason (for instance,
> > > + * ENOSPC), gluster will dump its cache, preventing retries.  This means
> > > + * almost certain data loss.  Not all gluster versions support the
> > > + * 'resync-failed-syncs-after-fsync' key value, but there is no way to
> > > + * discover during runtime if it is supported (this api returns success for
> > > + * unknown key/value pairs) */
> > 
> > Honestly, this sucks. There is apparently no way to operate gluster so
> > we can safely recover after a failed fsync. "We hope everything is fine,
> > but depending on your gluster version, we may now corrupt your image"
> > isn't very good.
> > 
> > We need to consider very carefully if this is good enough to go on after
> > an error. I'm currently leaning towards "no". That is, we should only
> > enable this after Gluster provides us a way to make sure that the option
> > is really set.

Unfortunately it is also possible to disable the write-behind xlator as
well. This would cause setting the option to fail too :-/ At the moment
there is no real useful way to detect if write-behind is disabled (it is
enabled by default).

> > > +ret = glfs_set_xlator_option(s->glfs, "*-write-behind",
> > > +"resync-failed-syncs-after-fsync", "on");
> > > +if (ret < 0) {
> > > +error_setg_errno(errp, errno, "Unable to set xlator key/value pair");
> > > +ret = -errno;
> > > +goto out;
> > > +}
> > > +#endif
> > 
> > We also need to consider the case without CONFIG_GLUSTERFS_XLATOR_OPT.
> > In this case (as well as theoretically in the case that the option
> > didn't take effect - if only we could know about it), a failed
> > glfs_fsync_async() is fatal and we need to stop operating on the image,
> > i.e. set bs->drv = NULL like when we detect corruption in qcow2 images.
> > The guest will see a broken disk that fails all I/O requests, but that's
> > better than corrupting data.
> >
> 
> Gluster versions that don't support CONFIG_GLUSTERFS_XLATOR_OPT will
> also not have the gluster patch that removes the file descriptor
> invalidation upon error (unless that was a relatively new
> bug/feature).  But if that is the case, every operation on the file
> descriptor in those versions will return error.  But it is also rather
> old versions that don't support glfs_set_xlator_option() I believe.

Indeed, glfs_set_xlator_option() was introduced with glusterfs-3.4.0. We
are currently on glusterfs-3.7, with the oldest supported version of
3.5. In ~2 months we hopefully have a 3.8 release and that will cause
the end-of-life of 3.5. 3.4 has been EOL for about a year now, hopefully
all our users have upgraded, but we know that some users will stay on
unsupported versions for a long time...

However, the "resync-failed-syncs-after-fsync" option was only
introduced recently, with glusterfs-3.7.9. You could detect this with
pkg-config glusterfs-api >= 7.3.7.9 if need be.

More details about the problem the option addresses are in the commit
message on http://review.gluster.org/13057 .

HTH,
Niels



Re: [Qemu-block] [PATCH v3] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-15 Thread Niels de Vos
On Tue, Mar 15, 2016 at 03:52:02PM -0400, Jeff Cody wrote:
> On Tue, Mar 15, 2016 at 03:50:17PM -0400, Jeff Cody wrote:
> > On Thu, Mar 10, 2016 at 07:38:00PM +0100, Niels de Vos wrote:
> > > GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
> > > it possible to detect sparse areas in files.
> > > 
> > > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > > 
> > > ---
> > > Tested by compiling and running "qemu-img map gluster://..." with a
> > > build of the current master branch of glusterfs. Using a Fedora cloud
> > > image (in raw format) shows many SEEK procedure calls going back and
> > > forth over the network. The output of "qemu-img map" matches the output when
> > > run against the image on the local filesystem.
> > > 
> > > v2 based on feedback from Jeff Cody:
> > > - Replace compile time detection by runtime detection
> > > - Update return pointer (new argument) for .bdrv_co_get_block_status
> > > ---
> > >  block/gluster.c | 182 
> > > 
> > >  1 file changed, 182 insertions(+)
> > > 
> > > diff --git a/block/gluster.c b/block/gluster.c
> > > index 65077a0..a4f0628 100644
> > > --- a/block/gluster.c
> > > +++ b/block/gluster.c
> > > @@ -23,6 +23,7 @@ typedef struct GlusterAIOCB {
> > >  typedef struct BDRVGlusterState {
> > >  struct glfs *glfs;
> > >  struct glfs_fd *fd;
> > > +bool supports_seek_data;
> > >  } BDRVGlusterState;
> > >  
> > >  typedef struct GlusterConf {
> > > @@ -286,6 +287,28 @@ static void qemu_gluster_parse_flags(int bdrv_flags, 
> > > int *open_flags)
> > >  }
> > >  }
> > >  
> > > +/*
> > > + * Do SEEK_DATA/HOLE to detect if it is functional. Older broken versions of
> > > + * gfapi incorrectly return the current offset when SEEK_DATA/HOLE is used.
> > > + * - Corrected versions return -1 and set errno to EINVAL.
> > > + * - Versions that support SEEK_DATA/HOLE correctly will return -1 and set
> > > + *   errno to ENXIO when SEEK_DATA is called with a position of EOF.
> > > + */
> > > +static bool qemu_gluster_test_seek(struct glfs_fd *fd)
> > > +{
> > > +off_t ret, eof;
> > > +
> > > +eof = glfs_lseek(fd, 0, SEEK_END);
> > > +if (eof < 0) {
> > > +/* this should never occur */
> > > +return false;
> > > +}
> > > +
> > > +/* this should always fail with ENXIO if SEEK_DATA is supported */
> > > +ret = glfs_lseek(fd, eof, SEEK_DATA);
> > > +return (ret < 0) && (errno == ENXIO);
> > > +}
> > > +
> > >  static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
> > >   int bdrv_flags, Error **errp)
> > >  {
> > > @@ -320,6 +343,8 @@ static int qemu_gluster_open(BlockDriverState *bs,  
> > > QDict *options,
> > >  ret = -errno;
> > >  }
> > >  
> > > +s->supports_seek_data = qemu_gluster_test_seek(s->fd);
> > > +
> > >  out:
> > >  qemu_opts_del(opts);
> > >  qemu_gluster_gconf_free(gconf);
> > > @@ -677,6 +702,159 @@ static int 
> > > qemu_gluster_has_zero_init(BlockDriverState *bs)
> > >  return 0;
> > >  }
> > >  
> > > +/*
> > > + * Find allocation range in @bs around offset @start.
> > > + * May change underlying file descriptor's file offset.
> > > + * If @start is not in a hole, store @start in @data, and the
> > > + * beginning of the next hole in @hole, and return 0.
> > > + * If @start is in a non-trailing hole, store @start in @hole and the
> > > + * beginning of the next non-hole in @data, and return 0.
> > > + * If @start is in a trailing hole or beyond EOF, return -ENXIO.
> > > + * If we can't find out, return a negative errno other than -ENXIO.
> > > + *
> > > + * (Shamefully copied from raw-posix.c, only minuscule adaptations.)
> > > + */
> > > +static int find_allocation(BlockDriverState *bs, off_t start,
> > > +   off_t *data, off_t *hole)
> > > +{
> > > +BDRVGlusterState *s = bs->opaque;
> > > +off_t offs;
> > > +
> > > +if (!s->supports_seek_data) {
> > > +

[Qemu-block] [PATCH v3] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-10 Thread Niels de Vos
GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
it possible to detect sparse areas in files.

Signed-off-by: Niels de Vos <nde...@redhat.com>

---
Tested by compiling and running "qemu-img map gluster://..." with a
build of the current master branch of glusterfs. Using a Fedora cloud
image (in raw format) shows many SEEK procedure calls going back and
forth over the network. The output of "qemu-img map" matches the output when
run against the image on the local filesystem.

v2 based on feedback from Jeff Cody:
- Replace compile time detection by runtime detection
- Update return pointer (new argument) for .bdrv_co_get_block_status
---
 block/gluster.c | 182 
 1 file changed, 182 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 65077a0..a4f0628 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -23,6 +23,7 @@ typedef struct GlusterAIOCB {
 typedef struct BDRVGlusterState {
 struct glfs *glfs;
 struct glfs_fd *fd;
+bool supports_seek_data;
 } BDRVGlusterState;
 
 typedef struct GlusterConf {
@@ -286,6 +287,28 @@ static void qemu_gluster_parse_flags(int bdrv_flags, int 
*open_flags)
 }
 }
 
+/*
+ * Do SEEK_DATA/HOLE to detect if it is functional. Older broken versions of
+ * gfapi incorrectly return the current offset when SEEK_DATA/HOLE is used.
+ * - Corrected versions return -1 and set errno to EINVAL.
+ * - Versions that support SEEK_DATA/HOLE correctly will return -1 and set
+ *   errno to ENXIO when SEEK_DATA is called with a position of EOF.
+ */
+static bool qemu_gluster_test_seek(struct glfs_fd *fd)
+{
+off_t ret, eof;
+
+eof = glfs_lseek(fd, 0, SEEK_END);
+if (eof < 0) {
+/* this should never occur */
+return false;
+}
+
+/* this should always fail with ENXIO if SEEK_DATA is supported */
+ret = glfs_lseek(fd, eof, SEEK_DATA);
+return (ret < 0) && (errno == ENXIO);
+}
+
 static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
  int bdrv_flags, Error **errp)
 {
@@ -320,6 +343,8 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict 
*options,
 ret = -errno;
 }
 
+s->supports_seek_data = qemu_gluster_test_seek(s->fd);
+
 out:
 qemu_opts_del(opts);
 qemu_gluster_gconf_free(gconf);
@@ -677,6 +702,159 @@ static int qemu_gluster_has_zero_init(BlockDriverState 
*bs)
 return 0;
 }
 
+/*
+ * Find allocation range in @bs around offset @start.
+ * May change underlying file descriptor's file offset.
+ * If @start is not in a hole, store @start in @data, and the
+ * beginning of the next hole in @hole, and return 0.
+ * If @start is in a non-trailing hole, store @start in @hole and the
+ * beginning of the next non-hole in @data, and return 0.
+ * If @start is in a trailing hole or beyond EOF, return -ENXIO.
+ * If we can't find out, return a negative errno other than -ENXIO.
+ *
+ * (Shamefully copied from raw-posix.c, only minuscule adaptations.)
+ */
+static int find_allocation(BlockDriverState *bs, off_t start,
+   off_t *data, off_t *hole)
+{
+BDRVGlusterState *s = bs->opaque;
+off_t offs;
+
+if (!s->supports_seek_data) {
+return -ENOTSUP;
+}
+
+/*
+ * SEEK_DATA cases:
+ * D1. offs == start: start is in data
+ * D2. offs > start: start is in a hole, next data at offs
+ * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
+ *  or start is beyond EOF
+ * If the latter happens, the file has been truncated behind
+ * our back since we opened it.  All bets are off then.
+ * Treating like a trailing hole is simplest.
+ * D4. offs < 0, errno != ENXIO: we learned nothing
+ */
+offs = glfs_lseek(s->fd, start, SEEK_DATA);
+if (offs < 0) {
+return -errno;  /* D3 or D4 */
+}
+assert(offs >= start);
+
+if (offs > start) {
+/* D2: in hole, next data at offs */
+*hole = start;
+*data = offs;
+return 0;
+}
+
+/* D1: in data, end not yet known */
+
+/*
+ * SEEK_HOLE cases:
+ * H1. offs == start: start is in a hole
+ * If this happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H2. offs > start: either start is in data, next hole at offs,
+ *   or start is in trailing hole, EOF at offs
+ * Linux treats trailing holes like any other hole: offs ==
+ * start.  Solaris seeks to EOF instead: offs > start (blech).
+ * If that happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H3. offs < 0, errno = ENXIO: start is beyond EOF
+ * If this happens, the file has been truncated behind our
> > > + * back since we opened it.  Treat it like a trailing hole.

Re: [Qemu-block] [PATCH v2] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-09 Thread Niels de Vos
On Wed, Mar 09, 2016 at 10:46:02AM -0500, Jeff Cody wrote:
> On Wed, Mar 09, 2016 at 01:30:14PM +0100, Niels de Vos wrote:
> > GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
> > it possible to detect sparse areas in files.
> > 
> > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > 
> > ---
> > Tested by compiling and running "qemu-img map gluster://..." with a
> > build of the current master branch of glusterfs. Using a Fedora cloud
> > image (in raw format) shows many SEEK procedure calls going back and
> > forth over the network. The output of "qemu-img map" matches the output when
> > run against the image on the local filesystem.
> > 
> > v2 based on feedback from Jeff Cody:
> > - Replace compile time detection by runtime detection
> > - Update return pointer (new argument) for .bdrv_co_get_block_status
> > ---
> >  block/gluster.c | 186 
> > 
> >  1 file changed, 186 insertions(+)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 65077a0..b01ab52 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -23,6 +23,7 @@ typedef struct GlusterAIOCB {
> >  typedef struct BDRVGlusterState {
> >  struct glfs *glfs;
> >  struct glfs_fd *fd;
> > +bool supports_seek_data;
> >  } BDRVGlusterState;
> >  
> >  typedef struct GlusterConf {
> > @@ -286,6 +287,33 @@ static void qemu_gluster_parse_flags(int bdrv_flags, 
> > int *open_flags)
> >  }
> >  }
> >  
> > +/*
> > + * Do SEEK_DATA/HOLE to detect if it is functional. In order to be usable, it
> > + * should return different values for the start of data and the start of a
> > + * hole. There are three different cases to handle:
> > + *
> > + *  - the same position is returned for data/hole (indicates broken gfapi)
> > + *  - an error is returned:
> > + * - ENXIO only gets returned if there is valid support on client+server
> > + * - EINVAL is returned when gfapi or the server does not support it
> > + */
> > +static bool qemu_gluster_test_seek(struct glfs_fd *fd)
> > +{
> > +off_t start_data, start_hole;
> > +bool supports_seek_data = false;
> > +
> > +start_data = glfs_lseek(fd, 0, SEEK_DATA);
> > +if (start_data != -1) {
> 
> I recommend just checking if the returned value is >= 0.

Ok, I can change that.

> > +start_hole = glfs_lseek(fd, 0, SEEK_HOLE);
> > +if (start_hole != -1)
> 
> Minor formatting nit: per QEMU coding standard, all conditional
> statements require brackets.

Ah, sure, will do.

> > +supports_seek_data = !(start_data == start_hole);
> > +} else if (errno == ENXIO) {
> 
> This errno check for ENXIO won't catch the case if an ENXIO error
> occurs in the SEEK_HOLE call.

I'm not sure if I'm following. lseek(SEEK_DATA) returns -1 and sets
errno to ENXIO when the position in the filedescriptor is EOF. In this
test, we check from position=0, so when ENXIO is returned, we know the
file is empty and the return value+errno has gone through the whole
Gluster stack.

EINVAL would be another error that is expected, in case either (a
current) gfapi or the server do not support SEEK_DATA.

I do not think there is a need to check for ENXIO on lseek(SEEK_HOLE).

Do you have a preference on how I should change it?

> > +supports_seek_data = true;
> > +}
> > +
> > +return supports_seek_data;
> > +}
> > +
> >  static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
> >   int bdrv_flags, Error **errp)
> >  {
> > @@ -320,6 +348,8 @@ static int qemu_gluster_open(BlockDriverState *bs,  
> > QDict *options,
> >  ret = -errno;
> >  }
> >  
> > +s->supports_seek_data = qemu_gluster_test_seek(s->fd);
> > +
> >  out:
> >  qemu_opts_del(opts);
> >  qemu_gluster_gconf_free(gconf);
> > @@ -677,6 +707,158 @@ static int 
> > qemu_gluster_has_zero_init(BlockDriverState *bs)
> >  return 0;
> >  }
> >  
> > +/*
> > + * Find allocation range in @bs around offset @start.
> > + * May change underlying file descriptor's file offset.
> > + * If @start is not in a hole, store @start in @data, and the
> > + * beginning of the next hole in @hole, and return 0.
> > + * If @start is in a non-trailing hole, store @start in @hole and the
> > + * beginning of the next non-hole in @data, and return 0.
> > > + * If @start is in a trailing hole or beyond EOF, return -ENXIO.

[Qemu-block] [PATCH v2] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-09 Thread Niels de Vos
GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
it possible to detect sparse areas in files.

Signed-off-by: Niels de Vos <nde...@redhat.com>

---
Tested by compiling and running "qemu-img map gluster://..." with a
build of the current master branch of glusterfs. Using a Fedora cloud
image (in raw format) shows many SEEK procedure calls going back and
forth over the network. The output of "qemu-img map" matches the output when
run against the image on the local filesystem.

v2 based on feedback from Jeff Cody:
- Replace compile time detection by runtime detection
- Update return pointer (new argument) for .bdrv_co_get_block_status
---
 block/gluster.c | 186 
 1 file changed, 186 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 65077a0..b01ab52 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -23,6 +23,7 @@ typedef struct GlusterAIOCB {
 typedef struct BDRVGlusterState {
 struct glfs *glfs;
 struct glfs_fd *fd;
+bool supports_seek_data;
 } BDRVGlusterState;
 
 typedef struct GlusterConf {
@@ -286,6 +287,33 @@ static void qemu_gluster_parse_flags(int bdrv_flags, int 
*open_flags)
 }
 }
 
+/*
+ * Do SEEK_DATA/HOLE to detect if it is functional. In order to be usable, it
+ * should return different values for the start of data and the start of a
+ * hole. There are three different cases to handle:
+ *
+ *  - the same position is returned for data/hole (indicates broken gfapi)
+ *  - an error is returned:
+ * - ENXIO only gets returned if there is valid support on client+server
+ * - EINVAL is returned when gfapi or the server does not support it
+ */
+static bool qemu_gluster_test_seek(struct glfs_fd *fd)
+{
+off_t start_data, start_hole;
+bool supports_seek_data = false;
+
+start_data = glfs_lseek(fd, 0, SEEK_DATA);
+if (start_data != -1) {
+start_hole = glfs_lseek(fd, 0, SEEK_HOLE);
+if (start_hole != -1)
+supports_seek_data = !(start_data == start_hole);
+} else if (errno == ENXIO) {
+supports_seek_data = true;
+}
+
+return supports_seek_data;
+}
+
 static int qemu_gluster_open(BlockDriverState *bs,  QDict *options,
  int bdrv_flags, Error **errp)
 {
@@ -320,6 +348,8 @@ static int qemu_gluster_open(BlockDriverState *bs,  QDict 
*options,
 ret = -errno;
 }
 
+s->supports_seek_data = qemu_gluster_test_seek(s->fd);
+
 out:
 qemu_opts_del(opts);
 qemu_gluster_gconf_free(gconf);
@@ -677,6 +707,158 @@ static int qemu_gluster_has_zero_init(BlockDriverState 
*bs)
 return 0;
 }
 
+/*
+ * Find allocation range in @bs around offset @start.
+ * May change underlying file descriptor's file offset.
+ * If @start is not in a hole, store @start in @data, and the
+ * beginning of the next hole in @hole, and return 0.
+ * If @start is in a non-trailing hole, store @start in @hole and the
+ * beginning of the next non-hole in @data, and return 0.
+ * If @start is in a trailing hole or beyond EOF, return -ENXIO.
+ * If we can't find out, return a negative errno other than -ENXIO.
+ *
+ * (Shamefully copied from raw-posix.c, only minuscule adaptations.)
+ */
+static int find_allocation(BlockDriverState *bs, off_t start,
+   off_t *data, off_t *hole)
+{
+BDRVGlusterState *s = bs->opaque;
+off_t offs;
+
+if (!s->supports_seek_data)
+return -EINVAL;
+
+/*
+ * SEEK_DATA cases:
+ * D1. offs == start: start is in data
+ * D2. offs > start: start is in a hole, next data at offs
+ * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
+ *  or start is beyond EOF
+ * If the latter happens, the file has been truncated behind
+ * our back since we opened it.  All bets are off then.
+ * Treating like a trailing hole is simplest.
+ * D4. offs < 0, errno != ENXIO: we learned nothing
+ */
+offs = glfs_lseek(s->fd, start, SEEK_DATA);
+if (offs < 0) {
+return -errno;  /* D3 or D4 */
+}
+assert(offs >= start);
+
+if (offs > start) {
+/* D2: in hole, next data at offs */
+*hole = start;
+*data = offs;
+return 0;
+}
+
+/* D1: in data, end not yet known */
+
+/*
+ * SEEK_HOLE cases:
+ * H1. offs == start: start is in a hole
+ * If this happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H2. offs > start: either start is in data, next hole at offs,
+ *   or start is in trailing hole, EOF at offs
+ * Linux treats trailing holes like any other hole: offs ==
+ * start.  Solaris seeks to EOF instead: offs > start (blech).
+ * If that happens here, a hole has been dug behind our back
+ * since the previous lseek().

Re: [Qemu-block] [PATCH] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-08 Thread Niels de Vos
On Tue, Mar 08, 2016 at 01:53:26PM +0100, Kevin Wolf wrote:
> Am 08.03.2016 um 05:21 hat Niels de Vos geschrieben:
> > On Mon, Mar 07, 2016 at 01:27:38PM -0500, Jeff Cody wrote:
> > > On Mon, Mar 07, 2016 at 07:04:15PM +0100, Niels de Vos wrote:
> > > > GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
> > > > it possible to detect sparse areas in files.
> > > > 
> > > > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > > > 
> > > > --
> > > > Tested by compiling and running "qemu-img map gluster://..." with a
> > > > build of the current master branch of glusterfs. Using a Fedora
> > > > cloud image (in raw format) shows many SEEK procedure calls going back
> > > > and forth over the network. The output of "qemu-img map" matches the output
> > > > when run against the image on the local filesystem.
> > > > ---
> > > >  block/gluster.c | 159 
> > > > 
> > > >  configure   |  25 +
> > > >  2 files changed, 184 insertions(+)
> > > > 
> > > > diff --git a/block/gluster.c b/block/gluster.c
> > > > index 65077a0..1430010 100644
> > > > --- a/block/gluster.c
> > > > +++ b/block/gluster.c
> > > > @@ -677,6 +677,153 @@ static int 
> > > > qemu_gluster_has_zero_init(BlockDriverState *bs)
> > > >  return 0;
> > > >  }
> > > >  
> > > > +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> > > 
> > > Why do we need to make this a compile-time option?  Version checking
> > > is problematic; for instance, different distributions may have
> > > backported bug fixes / features, that are not reflected by the
> > > reported version number, etc..  Ideally, we can determine
> > > functionality during runtime, and behave accordingly.
> > 
> > This will not get backported to older Gluster versions, it required a
> > protocol change.
> > 
> > > If SEEK_DATA and SEEK_HOLE are not supported,
> > > qemu_gluster_co_get_block_status can return that sectors are all
> > > allocated (which is what happens in block/io.c anyway if the driver
> > > doesn't support the function).
> > 
> > Ok, good to know.
> > 
> > > As long as glfs_lseek() will return error (e.g. EINVAL) for an invalid
> > > whence value,  we can handle it runtime.  Does glfs_lseek() behave
> > > sanely?
> > 
> > Unfortunately older versions of libgfapi do not return EINVAL when
> > SEEK_DATA/HOLE is used. It is something we'll need to fix in the stable
> > releases. We can not assume that all users have installed a version of
> > the library that handles SEEK_DATA/HOLE correctly (return EINVAL) when
> > there is no support in the network protocol or on the server.
> > 
> > To be sure that we don't get some undefined behaviour, the compile time
> > check is needed.
> 
> The code could be compiled on a host with newer libgfapi, but run on a
> different host with an older version. This is why having (only) compile
> time checks is rarely a good idea.

Oh, yes, that is possible. glfs_lseek() is not a new function, so the
symbol version did not need to change.

> Jeff's suggestion to probe the actual behaviour on the host we're
> running on in .bdrv_open() sounds reasonable to me.

Yes, it sure is. I'll send a v2 patch soon.

Thanks,
Niels



Re: [Qemu-block] [PATCH] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-08 Thread Niels de Vos
On Tue, Mar 08, 2016 at 07:33:26AM -0500, Jeff Cody wrote:
> On Tue, Mar 08, 2016 at 05:21:48AM +0100, Niels de Vos wrote:
> > On Mon, Mar 07, 2016 at 01:27:38PM -0500, Jeff Cody wrote:
> > > On Mon, Mar 07, 2016 at 07:04:15PM +0100, Niels de Vos wrote:
> > > > GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
> > > > it possible to detect sparse areas in files.
> > > > 
> > > > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > > > 
> > > > --
> > > > Tested by compiling and running "qemu-img map gluster://..." with a
> > > > build of the current master branch of glusterfs. Using a Fedora
> > > > cloud image (in raw format) shows many SEEK procedure calls going back
> > > > and forth over the network. The output of "qemu-img map" matches the output
> > > > when run against the image on the local filesystem.
> > > > ---
> > > >  block/gluster.c | 159 
> > > > 
> > > >  configure   |  25 +
> > > >  2 files changed, 184 insertions(+)
> > > > 
> > > > diff --git a/block/gluster.c b/block/gluster.c
> > > > index 65077a0..1430010 100644
> > > > --- a/block/gluster.c
> > > > +++ b/block/gluster.c
> > > > @@ -677,6 +677,153 @@ static int 
> > > > qemu_gluster_has_zero_init(BlockDriverState *bs)
> > > >  return 0;
> > > >  }
> > > >  
> > > > +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> > > 
> > > Why do we need to make this a compile-time option?  Version checking
> > > is problematic; for instance, different distributions may have
> > > backported bug fixes / features, that are not reflected by the
> > > reported version number, etc..  Ideally, we can determine
> > > functionality during runtime, and behave accordingly.
> > 
> > This will not get backported to older Gluster versions, it required a
> > protocol change.
> > 
> > > If SEEK_DATA and SEEK_HOLE are not supported,
> > > qemu_gluster_co_get_block_status can return that sectors are all
> > > allocated (which is what happens in block/io.c anyway if the driver
> > > doesn't support the function).
> > 
> > Ok, good to know.
> > 
> > > As long as glfs_lseek() will return error (e.g. EINVAL) for an invalid
> > > whence value,  we can handle it runtime.  Does glfs_lseek() behave
> > > sanely?
> > 
> > Unfortunately older versions of libgfapi do not return EINVAL when
> > SEEK_DATA/HOLE is used. It is something we'll need to fix in the stable
> > releases. We can not assume that all users have installed a version of
> > the library that handles SEEK_DATA/HOLE correctly (return EINVAL) when
> > there is no support in the network protocol or on the server.
> > 
> > To be sure that we don't get some undefined behaviour, the compile time
> > check is needed.
> >
> 
> That's unfortunate.
> 
> However, we may still be able to work around that with some probing in
> the qemu_gluster_open() function.  I peeked at the libgfapi code, and
> it looks like for an unknown whence, the current offset is just
> returned by glfs_lseek() - is that correct?

Yes, that is correct.

> With the same offset input, an lseek should always return something
> different for a SEEK_DATA and SEEK_HOLE whence (SEEK_HOLE will return
> EOF offset if there are no further holes).
> 
> In qemu_gluster_open(), we can then probe for support.  if we call
> glfs_lseek() twice with the same offset, first with SEEK_DATA and
> then with SEEK_HOLE, then we can see if the libgfapi version
> supports SEEK_DATA/SEEK_HOLE.  If the resulting offsets from the calls
> are equal, it doesn't support it.  If the offsets differ, then
> presumably it does.  We can then set and remember that in the
> BDRVGlusterState struct (e.g. bool supports_seek_data).

Hmm, yes, that should work. And it'll obviously catch the case where
glfs_lseek() returns EINVAL too.

Thanks for the idea, I'll try to get this done over the next few days.

Niels


> 
> > > > +/*
> > > > + * Find allocation range in @bs around offset @start.
> > > > + * May change underlying file descriptor's file offset.
> > > > + * If @start is not in a hole, store @start in @data, and the
> > > > + * beginning of the next hole in @hole, and return 0.
> > > > + * If @start is in a non-trailing hole, store @start in @hole and the
> > > > + * beginning of the next

Re: [Qemu-block] [PATCH] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-07 Thread Niels de Vos
On Mon, Mar 07, 2016 at 01:27:38PM -0500, Jeff Cody wrote:
> On Mon, Mar 07, 2016 at 07:04:15PM +0100, Niels de Vos wrote:
> > GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
> > it possible to detect sparse areas in files.
> > 
> > Signed-off-by: Niels de Vos <nde...@redhat.com>
> > 
> > --
> > Tested by compiling and running "qemu-img map gluster://..." with a
> > build of the current master branch of glusterfs. Using a Fedora
> > cloud image (in raw format) shows many SEEK procedure calls going back
> > and forth over the network. The output of "qemu-img map" matches the output
> > when run against the image on the local filesystem.
> > ---
> >  block/gluster.c | 159 
> >  configure   |  25 +
> >  2 files changed, 184 insertions(+)
> > 
> > diff --git a/block/gluster.c b/block/gluster.c
> > index 65077a0..1430010 100644
> > --- a/block/gluster.c
> > +++ b/block/gluster.c
> > @@ -677,6 +677,153 @@ static int 
> > qemu_gluster_has_zero_init(BlockDriverState *bs)
> >  return 0;
> >  }
> >  
> > +#ifdef CONFIG_GLUSTERFS_SEEK_DATA
> 
> Why do we need to make this a compile-time option?  Version checking
> is problematic; for instance, different distributions may have
> backported bug fixes / features that are not reflected by the
> reported version number, etc.  Ideally, we can determine
> functionality during runtime, and behave accordingly.

This will not get backported to older Gluster versions; it required a
protocol change.

> If SEEK_DATA and SEEK_HOLE are not supported,
> qemu_gluster_co_get_block_status can return that sectors are all
> allocated (which is what happens in block/io.c anyway if the driver
> doesn't support the function).

Ok, good to know.

> As long as glfs_lseek() will return error (e.g. EINVAL) for an invalid
> whence value,  we can handle it runtime.  Does glfs_lseek() behave
> sanely?

Unfortunately older versions of libgfapi do not return EINVAL when
SEEK_DATA/HOLE is used. It is something we'll need to fix in the stable
releases. We cannot assume that all users have installed a version of
the library that handles SEEK_DATA/HOLE correctly (return EINVAL) when
there is no support in the network protocol or on the server.

To be sure that we don't get undefined behaviour, the compile-time
check is needed.

Niels
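A compile-time check of this kind is typically a configure-style gate: since GlusterFS 3.8 is the first release whose protocol supports SEEK_DATA/SEEK_HOLE, configure can test the installed glusterfs-api version and only define CONFIG_GLUSTERFS_SEEK_DATA when it is new enough. The fragment below is an illustrative sketch, not QEMU's actual configure code; in particular the version number 7.4 is an assumption here, so the real minimum glusterfs-api version would need checking.

```shell
# Illustrative configure fragment (variable names and the version
# number are assumptions, not QEMU's actual configure code).
glusterfs_seek_data="no"
if $pkg_config --atleast-version=7.4 glusterfs-api; then
    glusterfs_seek_data="yes"
fi

# Emit the symbol the patch guards on (CONFIG_GLUSTERFS_SEEK_DATA).
if test "$glusterfs_seek_data" = "yes"; then
    echo "CONFIG_GLUSTERFS_SEEK_DATA=y" >> $config_host_mak
fi
```

As the rest of the thread works out, such a build-time gate only proves the headers and library are new enough; the runtime probe is still needed because the server or network protocol may lack support.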

> > +/*
> > + * Find allocation range in @bs around offset @start.
> > + * May change underlying file descriptor's file offset.
> > + * If @start is not in a hole, store @start in @data, and the
> > + * beginning of the next hole in @hole, and return 0.
> > + * If @start is in a non-trailing hole, store @start in @hole and the
> > + * beginning of the next non-hole in @data, and return 0.
> > + * If @start is in a trailing hole or beyond EOF, return -ENXIO.
> > + * If we can't find out, return a negative errno other than -ENXIO.
> > + *
> > + * (Shamefully copied from raw-posix.c, only minuscule adaptations.)
> > + */
> > +static int find_allocation(BlockDriverState *bs, off_t start,
> > +   off_t *data, off_t *hole)
> > +{
> > +BDRVGlusterState *s = bs->opaque;
> > +off_t offs;
> > +
> > +/*
> > + * SEEK_DATA cases:
> > + * D1. offs == start: start is in data
> > + * D2. offs > start: start is in a hole, next data at offs
> > + * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
> > + *  or start is beyond EOF
> > + * If the latter happens, the file has been truncated behind
> > + * our back since we opened it.  All bets are off then.
> > + * Treating like a trailing hole is simplest.
> > + * D4. offs < 0, errno != ENXIO: we learned nothing
> > + */
> > +offs = glfs_lseek(s->fd, start, SEEK_DATA);
> > +if (offs < 0) {
> > +return -errno;  /* D3 or D4 */
> > +}
> > +assert(offs >= start);
> > +
> > +if (offs > start) {
> > +/* D2: in hole, next data at offs */
> > +*hole = start;
> > +*data = offs;
> > +return 0;
> > +}
> > +
> > +/* D1: in data, end not yet known */
> > +
> > +/*
> > + * SEEK_HOLE cases:
> > + * H1. offs == start: start is in a hole
> > + * If this happens here, a hole has been dug behind our back
> > + * since the previous lseek().
> > + * H2. offs > start: either start i

[Qemu-block] [PATCH] block/gluster: add support for SEEK_DATA/SEEK_HOLE

2016-03-07 Thread Niels de Vos
GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
it possible to detect sparse areas in files.

Signed-off-by: Niels de Vos <nde...@redhat.com>

--
Tested by compiling and running "qemu-img map gluster://..." with a
build of the current master branch of glusterfs. Using a Fedora
cloud image (in raw format) shows many SEEK procedure calls going back
and forth over the network. The output of "qemu-img map" matches the output
when run against the image on the local filesystem.
---
 block/gluster.c | 159 
 configure   |  25 +
 2 files changed, 184 insertions(+)

diff --git a/block/gluster.c b/block/gluster.c
index 65077a0..1430010 100644
--- a/block/gluster.c
+++ b/block/gluster.c
@@ -677,6 +677,153 @@ static int qemu_gluster_has_zero_init(BlockDriverState 
*bs)
 return 0;
 }
 
+#ifdef CONFIG_GLUSTERFS_SEEK_DATA
+/*
+ * Find allocation range in @bs around offset @start.
+ * May change underlying file descriptor's file offset.
+ * If @start is not in a hole, store @start in @data, and the
+ * beginning of the next hole in @hole, and return 0.
+ * If @start is in a non-trailing hole, store @start in @hole and the
+ * beginning of the next non-hole in @data, and return 0.
+ * If @start is in a trailing hole or beyond EOF, return -ENXIO.
+ * If we can't find out, return a negative errno other than -ENXIO.
+ *
+ * (Shamefully copied from raw-posix.c, only minuscule adaptations.)
+ */
+static int find_allocation(BlockDriverState *bs, off_t start,
+   off_t *data, off_t *hole)
+{
+BDRVGlusterState *s = bs->opaque;
+off_t offs;
+
+/*
+ * SEEK_DATA cases:
+ * D1. offs == start: start is in data
+ * D2. offs > start: start is in a hole, next data at offs
+ * D3. offs < 0, errno = ENXIO: either start is in a trailing hole
+ *  or start is beyond EOF
+ * If the latter happens, the file has been truncated behind
+ * our back since we opened it.  All bets are off then.
+ * Treating like a trailing hole is simplest.
+ * D4. offs < 0, errno != ENXIO: we learned nothing
+ */
+offs = glfs_lseek(s->fd, start, SEEK_DATA);
+if (offs < 0) {
+return -errno;  /* D3 or D4 */
+}
+assert(offs >= start);
+
+if (offs > start) {
+/* D2: in hole, next data at offs */
+*hole = start;
+*data = offs;
+return 0;
+}
+
+/* D1: in data, end not yet known */
+
+/*
+ * SEEK_HOLE cases:
+ * H1. offs == start: start is in a hole
+ * If this happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H2. offs > start: either start is in data, next hole at offs,
+ *   or start is in trailing hole, EOF at offs
+ * Linux treats trailing holes like any other hole: offs ==
+ * start.  Solaris seeks to EOF instead: offs > start (blech).
+ * If that happens here, a hole has been dug behind our back
+ * since the previous lseek().
+ * H3. offs < 0, errno = ENXIO: start is beyond EOF
+ * If this happens, the file has been truncated behind our
+ * back since we opened it.  Treat it like a trailing hole.
+ * H4. offs < 0, errno != ENXIO: we learned nothing
+ * Pretend we know nothing at all, i.e. "forget" about D1.
+ */
+offs = glfs_lseek(s->fd, start, SEEK_HOLE);
+if (offs < 0) {
+return -errno;  /* D1 and (H3 or H4) */
+}
+assert(offs >= start);
+
+if (offs > start) {
+/*
+ * D1 and H2: either in data, next hole at offs, or it was in
+ * data but is now in a trailing hole.  In the latter case,
+ * all bets are off.  Treating it as if there was data all
+ * the way to EOF is safe, so simply do that.
+ */
+*data = start;
+*hole = offs;
+return 0;
+}
+
+/* D1 and H1 */
+return -EBUSY;
+}
+
+/*
+ * Returns the allocation status of the specified sectors.
+ *
+ * If 'sector_num' is beyond the end of the disk image the return value is 0
+ * and 'pnum' is set to 0.
+ *
+ * 'pnum' is set to the number of sectors (including and immediately following
+ * the specified sector) that are known to be in the same
+ * allocated/unallocated state.
+ *
+ * 'nb_sectors' is the max value 'pnum' should be set to.  If nb_sectors goes
+ * beyond the end of the disk image it will be clamped.
+ *
+ * (Based on raw_co_get_block_status() from raw-posix.c.)
+ */
+static int64_t coroutine_fn qemu_gluster_co_get_block_status(
+BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum)
+{
+BDRVGlusterState *s = bs->opaque;
+off_t start, data = 0, hole = 0;
+int64_t total_size;
+int ret = -EINVAL;
+
+if (!s->fd) {