Re: [Libguestfs] [nbdkit PATCH v2] server: Don't assert on send if client hangs up early

2023-02-27 Thread Eric Blake
On Sun, Feb 26, 2023 at 10:16:08AM +0100, Laszlo Ersek wrote:
> On 2/24/23 23:59, Eric Blake wrote:
> > @@ -752,14 +742,15 @@ protocol_recv_request_send_reply (void)
> >(cmd == NBD_CMD_READ || cmd == NBD_CMD_BLOCK_STATUS)) {
> >  if (!error) {
> >if (cmd == NBD_CMD_READ)
> > -send_structured_reply_read (request.handle, cmd, buf, count, 
> > offset);
> > +return send_structured_reply_read (request.handle, cmd, buf, count,
> > +   offset);
> >else /* NBD_CMD_BLOCK_STATUS */
> > -send_structured_reply_block_status (request.handle, cmd, flags,
> > -count, offset, extents);
> > +return send_structured_reply_block_status (request.handle, cmd, 
> > flags,
> > +   count, offset, extents);
> >  }
> >  else
> > -  send_structured_reply_error (request.handle, cmd, flags, error);
> > +  return send_structured_reply_error (request.handle, cmd, flags, 
> > error);
> >}
> >else
> > -send_simple_reply (request.handle, cmd, flags, buf, count, error);
> > +return send_simple_reply (request.handle, cmd, flags, buf, count, 
> > error);
> >  }
> 
> I think this would look better:
> 
>   if (!conn->structured_replies ||
>   (cmd != NBD_CMD_READ && cmd != NBD_CMD_BLOCK_STATUS))
> return send_simple_reply (request.handle, cmd, flags, buf, count, error);
> 
>   if (error) {
> return send_structured_reply_error (request.handle, cmd, flags, error);
> 
>   if (cmd == NBD_CMD_READ)
> return send_structured_reply_read (request.handle, cmd, buf, count, 
> offset);
> 
>   /* NBD_CMD_BLOCK_STATUS */
>   return send_structured_reply_block_status (request.handle, cmd, flags,
>  count, offset, extents);

Indeed it does.

> 
> Either way:
> 
> Reviewed-by: Laszlo Ersek 

Thanks.  With that tweak, this is now in as commit 9d82aa4f

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



[Libguestfs] Checksums and other verification

2023-02-27 Thread Richard W.M. Jones


https://github.com/kubevirt/containerized-data-importer/issues/1520

Hi Eric,

We had a question from the Kubevirt team related to the above issue.
The question is roughly if it's possible to calculate the checksum of
an image as an nbdkit filter and/or in the qemu block layer.

Supplemental #1: could qemu-img convert calculate a checksum as it goes
along?

Supplemental #2: could we detect various sorts of common errors, such
a webserver that is incorrectly configured and serves up an error page
containing ""; or something which is supposed to be a disk image
but does not "look like" (in some ill-defined sense) a disk image,
eg. it has no partition table.

I'm not sure if qemu has any existing features covering the above (and
I know for sure that nbdkit doesn't).

One issue is that calculating a checksum involves a linear scan of the
image, although we can at least skip holes.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
nbdkit - Flexible, fast NBD server with plugins
https://gitlab.com/nbdkit/nbdkit
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] [libnbd PATCH v5 3/5] force semicolon after DEFINE_VECTOR_TYPE() macro invocations

2023-02-27 Thread Eric Blake
On Sun, Feb 26, 2023 at 12:45:27PM +0100, Laszlo Ersek wrote:
> Currently, a semicolon after a DEFINE_VECTOR_TYPE(...) macro invocation is
> not needed, nor does our documentation in "common/utils/vector.h"
> prescribe one. Furthermore, it breaks ISO C, which gcc reports with
> "-Wpedantic":
> 
> > warning: ISO C does not allow extra ‘;’ outside of a function
> > [-Wpedantic]
> 
> Our current usage is inconsistent; a proper subset of all our
> DEFINE_VECTOR_TYPE() invocations is succeeded by a semicolon. We could
> remove these semicolons, but that doesn't play nicely with Emacs's C

I'm not sure if the possessive is spelled "Emacs'".  Either way,

Reviewed-by: Eric Blake 

> language parser. Thus, force the semicolon instead.
> 
> Signed-off-by: Laszlo Ersek 
> ---
> 
> Notes:
> v5:
> - force semicolon after DEFINE_VECTOR_TYPE() [Eric]
> - reword commit message accordingly
> 
> v4:
> - new patch in v4
> 
>  common/utils/vector.h | 7 +--
>  lib/uri.c | 2 +-
>  copy/file-ops.c   | 2 +-
>  copy/nbd-ops.c| 2 +-
>  dump/dump.c   | 2 +-
>  fuse/nbdfuse.h| 2 +-
>  info/list.c   | 2 +-
>  info/map.c| 2 +-
>  ublk/nbdublk.h| 2 +-
>  ublk/tgt.c| 2 +-
>  10 files changed, 14 insertions(+), 11 deletions(-)

I grepped to make sure no other files had the pattern, even if they
weren't being compiled by your choice of configure options.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs


Re: [Libguestfs] [libnbd PATCH v5 4/5] vector: introduce DEFINE_POINTER_VECTOR_TYPE()

2023-02-27 Thread Eric Blake
On Sun, Feb 26, 2023 at 12:45:28PM +0100, Laszlo Ersek wrote:
> The "name##_iter" function is used 11 times in libnbd; in all those cases,
> "name" is "string_vector", and the "f" callback is "free":
> 
>   string_vector_iter (..., (void *) free);
> 
> Casting "free" to (void*) is ugly. (Well-defined by POSIX, but still.)
>
...
> 
> Switch "string_vector" to DEFINE_POINTER_VECTOR_TYPE(). The 11
> string_vector_iter() call sites continue working; they will be converted
> to string_vector_empty() in a subsequent patch.
> 
> Signed-off-by: Laszlo Ersek 
> ---
> 
> Notes:
> v5:
> 
> - resolve conflict from update to previous patch
> 
> - force semicolon after ADD_VECTOR_EMPTY_METHOD() and
>   DEFINE_POINTER_VECTOR_TYPE() too
> 
> - update example macro invocations in the commit message

Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] Checksums and other verification

2023-02-27 Thread Nir Soffer
On Mon, Feb 27, 2023 at 3:56 PM Richard W.M. Jones  wrote:
>
>
> https://github.com/kubevirt/containerized-data-importer/issues/1520
>
> Hi Eric,
>
> We had a question from the Kubevirt team related to the above issue.
> The question is roughly if it's possible to calculate the checksum of
> an image as an nbdkit filter and/or in the qemu block layer.
>
> Supplemental #1: could qemu-img convert calculate a checksum as it goes
> along?
>
> Supplemental #2: could we detect various sorts of common errors, such
> a webserver that is incorrectly configured and serves up an error page
> containing ""; or something which is supposed to be a disk image
> but does not "look like" (in some ill-defined sense) a disk image,
> eg. it has no partition table.
>
> I'm not sure if qemu has any existing features covering the above (and
> I know for sure that nbdkit doesn't).
>
> One issue is that calculating a checksum involves a linear scan of the
> image, although we can at least skip holes.

Kubvirt can use blksum
https://fosdem.org/2023/schedule/event/vai_blkhash_fast_disk/

But we need to package it for Fedora/CentOS Stream.

I also work on "qemu-img checksum", getting more reviews on this can help:
Lastest version:
https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00971.html
Last reveiw are here:
https://lists.nongnu.org/archive/html/qemu-block/2022-12/

More work is needed on the testing framework changes.

Nir

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs


Re: [Libguestfs] Checksums and other verification

2023-02-27 Thread Eric Blake
On Mon, Feb 27, 2023 at 01:56:26PM +, Richard W.M. Jones wrote:
> 
> https://github.com/kubevirt/containerized-data-importer/issues/1520
> 
> Hi Eric,
> 
> We had a question from the Kubevirt team related to the above issue.
> The question is roughly if it's possible to calculate the checksum of
> an image as an nbdkit filter and/or in the qemu block layer.

In the qemu block layer - yes: see Nir's https://gitlab.com/nirs/blkhash

Note that there is a huge difference between a block-based checksum (a
checksum of the block data the guest will see) and a checksum of the
original file (bytes as visible on the source, although with non-raw
files, more than one image may hash to the same guest-visible contents
despite having different host checksums).

Also, it may prove to be more efficient to generate a Merkle Tree hash
of an image (an image is divided into smaller portions in a
binary-tree fanout, where the hash of the entire image is computed by
combining hashes of child nodes up to the root of the tree - which
allows downloading blocks out of order).  [You may be more familiar
with Merkle Trees than you realize - every git commit id is ultimately
a Merkle Tree hash of all prior commits]

As for nbdkit being able to do hashing as a filter, we don't have such
a filter now, but I think it would be technically possible to
implement one.  The trickiest part would be figuring out a way to
expose the checksum to the client once the client has finally read
through the entire image.  It would be easy to have nbdkit output the
resulting hash in a secondary file for consumption by the end client,
harder but potentially more useful would be extending the NBD protocol
itself to allow the NBD client to issue a query to the server to
provide the hash directly (or an indication that the hash is not yet
known because not all blocks have been hashed yet).

> 
> Supplemental #1: could qemu-img convert calculate a checksum as it goes
> along?

Nir's work on blkhash seems like that is doable.

> 
> Supplemental #2: could we detect various sorts of common errors, such
> a webserver that is incorrectly configured and serves up an error page
> containing ""; or something which is supposed to be a disk image
> but does not "look like" (in some ill-defined sense) a disk image,
> eg. it has no partition table.
> 
> I'm not sure if qemu has any existing features covering the above (and
> I know for sure that nbdkit doesn't).

Indeed.  But adding a filter that does a pre-read of the plugin's
firsts 1M during .prepare to look for an expected signature (what is
sufficient, seeing if there is a partition table?) and refuses to let
the client connect if the plugin is serving wrong data seems fairly
straightforward.

> 
> One issue is that calculating a checksum involves a linear scan of the
> image, although we can at least skip holes.

Or intentionally choose a hash that can be computed out-of-order, such
as a Merkle Tree.  But we'd need a standard setup for all parties to
agree on how the hash is to be computed and checked, if it is going to
be anything more than just a linear hash of the entire guest-visible
contents.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] Checksums and other verification

2023-02-27 Thread Richard W.M. Jones
On Mon, Feb 27, 2023 at 04:24:33PM +0200, Nir Soffer wrote:
> On Mon, Feb 27, 2023 at 3:56 PM Richard W.M. Jones  wrote:
> >
> >
> > https://github.com/kubevirt/containerized-data-importer/issues/1520
> >
> > Hi Eric,
> >
> > We had a question from the Kubevirt team related to the above issue.
> > The question is roughly if it's possible to calculate the checksum of
> > an image as an nbdkit filter and/or in the qemu block layer.
> >
> > Supplemental #1: could qemu-img convert calculate a checksum as it goes
> > along?
> >
> > Supplemental #2: could we detect various sorts of common errors, such
> > a webserver that is incorrectly configured and serves up an error page
> > containing ""; or something which is supposed to be a disk image
> > but does not "look like" (in some ill-defined sense) a disk image,
> > eg. it has no partition table.
> >
> > I'm not sure if qemu has any existing features covering the above (and
> > I know for sure that nbdkit doesn't).
> >
> > One issue is that calculating a checksum involves a linear scan of the
> > image, although we can at least skip holes.
> 
> Kubvirt can use blksum
> https://fosdem.org/2023/schedule/event/vai_blkhash_fast_disk/
> 
> But we need to package it for Fedora/CentOS Stream.
> 
> I also work on "qemu-img checksum", getting more reviews on this can help:
> Lastest version:
> https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00971.html
> Last reveiw are here:
> https://lists.nongnu.org/archive/html/qemu-block/2022-12/
> 
> More work is needed on the testing framework changes.

I think it would be more useful if (or in addition) it could compute
the checksum of a stream which is being converted with 'qemu-img
convert'.  Extra points if it can compute the checksum over either the
input or output stream.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs


Re: [Libguestfs] Checksums and other verification

2023-02-27 Thread Richard W.M. Jones
On Mon, Feb 27, 2023 at 08:42:23AM -0600, Eric Blake wrote:
> Or intentionally choose a hash that can be computed out-of-order, such
> as a Merkle Tree.  But we'd need a standard setup for all parties to
> agree on how the hash is to be computed and checked, if it is going to
> be anything more than just a linear hash of the entire guest-visible
> contents.

Unfortunately I suspect that by far the easiest way for people who
host images to compute checksums is to run 'shaXXXsum' on them or sign
them with a GPG signature, rather than engaging in a novel hash
function.  Indeed that's what is happening now:

https://alt.fedoraproject.org/en/verify.html

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs



Re: [Libguestfs] Checksums and other verification

2023-02-27 Thread Nir Soffer
On Mon, Feb 27, 2023 at 6:41 PM Richard W.M. Jones  wrote:
>
> On Mon, Feb 27, 2023 at 04:24:33PM +0200, Nir Soffer wrote:
> > On Mon, Feb 27, 2023 at 3:56 PM Richard W.M. Jones  
> > wrote:
> > >
> > >
> > > https://github.com/kubevirt/containerized-data-importer/issues/1520
> > >
> > > Hi Eric,
> > >
> > > We had a question from the Kubevirt team related to the above issue.
> > > The question is roughly if it's possible to calculate the checksum of
> > > an image as an nbdkit filter and/or in the qemu block layer.
> > >
> > > Supplemental #1: could qemu-img convert calculate a checksum as it goes
> > > along?
> > >
> > > Supplemental #2: could we detect various sorts of common errors, such
> > > a webserver that is incorrectly configured and serves up an error page
> > > containing ""; or something which is supposed to be a disk image
> > > but does not "look like" (in some ill-defined sense) a disk image,
> > > eg. it has no partition table.
> > >
> > > I'm not sure if qemu has any existing features covering the above (and
> > > I know for sure that nbdkit doesn't).
> > >
> > > One issue is that calculating a checksum involves a linear scan of the
> > > image, although we can at least skip holes.
> >
> > Kubvirt can use blksum
> > https://fosdem.org/2023/schedule/event/vai_blkhash_fast_disk/
> >
> > But we need to package it for Fedora/CentOS Stream.
> >
> > I also work on "qemu-img checksum", getting more reviews on this can help:
> > Lastest version:
> > https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00971.html
> > Last reveiw are here:
> > https://lists.nongnu.org/archive/html/qemu-block/2022-12/
> >
> > More work is needed on the testing framework changes.
>
> I think it would be more useful if (or in addition) it could compute
> the checksum of a stream which is being converted with 'qemu-img
> convert'.  Extra points if it can compute the checksum over either the
> input or output stream.

I thought about this, it could be a filter that you add in the graph
that gives you checksum as a side effect of copying. But this requires
disabling unordered writes, which is pretty bad for performance.

But even if you compute the checksum during a transfer, you want to
verify it by reading the transferred data from storage. Once you computed
the checksum you can keep it for verifying the same image in the future.

Nir

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs


Re: [Libguestfs] [libnbd PATCH v5 3/5] force semicolon after DEFINE_VECTOR_TYPE() macro invocations

2023-02-27 Thread Laszlo Ersek
On 2/27/23 15:04, Eric Blake wrote:
> On Sun, Feb 26, 2023 at 12:45:27PM +0100, Laszlo Ersek wrote:
>> Currently, a semicolon after a DEFINE_VECTOR_TYPE(...) macro invocation is
>> not needed, nor does our documentation in "common/utils/vector.h"
>> prescribe one. Furthermore, it breaks ISO C, which gcc reports with
>> "-Wpedantic":
>>
>>> warning: ISO C does not allow extra ‘;’ outside of a function
>>> [-Wpedantic]
>>
>> Our current usage is inconsistent; a proper subset of all our
>> DEFINE_VECTOR_TYPE() invocations is succeeded by a semicolon. We could
>> remove these semicolons, but that doesn't play nicely with Emacs's C
> 
> I'm not sure if the possessive is spelled "Emacs'".

Right, I had gotten interested in that question a while ago.

https://en.wikipedia.org/wiki/English_possessive
https://en.wikipedia.org/wiki/Apostrophe#Possessive_apostrophe

I guess one part of the equation is whether we now consider "Emacs" an
"ossified" acronym, i.e., a simple proper noun, or if we still "feel"
the plural embedded in it ("Editor MACroS"). In the former case, the
forms quoted by the first article as valid ("Mitch's", "James's",
"Charles's") seem to apply. If we consider "Emacs" to be the plural form
of "Emac" (an editor macro), then it should be "Emacs'", i.e. "Editor
Macros' C language parser".

Personally, I consider "Emacs" the word a proper noun; even for the
above, I had to look up its etimology.

>  Either way,
> 
> Reviewed-by: Eric Blake 

Thanks, and...

> 
>> language parser. Thus, force the semicolon instead.
>>
>> Signed-off-by: Laszlo Ersek 
>> ---
>>
>> Notes:
>> v5:
>> - force semicolon after DEFINE_VECTOR_TYPE() [Eric]
>> - reword commit message accordingly
>> 
>> v4:
>> - new patch in v4
>>
>>  common/utils/vector.h | 7 +--
>>  lib/uri.c | 2 +-
>>  copy/file-ops.c   | 2 +-
>>  copy/nbd-ops.c| 2 +-
>>  dump/dump.c   | 2 +-
>>  fuse/nbdfuse.h| 2 +-
>>  info/list.c   | 2 +-
>>  info/map.c| 2 +-
>>  ublk/nbdublk.h| 2 +-
>>  ublk/tgt.c| 2 +-
>>  10 files changed, 14 insertions(+), 11 deletions(-)
> 
> I grepped to make sure no other files had the pattern, even if they
> weren't being compiled by your choice of configure options.
> 

Thanks! Yes, I did grep too.

Laszlo
___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs