Michael,

Continuing the discussion, I posted an RFC patch adding a public
ImageioClient class:
https://gerrit.ovirt.org/c/110068

This is an early version for discussion; the API may change based on
the feedback we get from users.

I posted an example showing how the client can be used:
https://gerrit.ovirt.org/c/110069

Because the streaming use case seems to be what you want, I used
the stream format mentioned in the previous mail for this example.
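
For illustration only, here is a rough sketch of how streaming a backup with
such a client could look. The module path and the extents()/write_to() names
below are assumptions for this sketch, not the actual RFC API - see the
patches above for the real interface:

    # Hypothetical sketch - all names below are assumptions.
    import sys
    from ovirt_imageio.client import ImageioClient  # module path assumed

    transfer_url = "https://host:54322/images/<ticket-id>"  # from the transfer

    with ImageioClient(transfer_url, cafile="ca.pem") as client:
        for extent in client.extents("zero"):
            if not extent.zero:
                # Stream only the allocated data, e.g. into a pipe.
                client.write_to(sys.stdout.buffer, extent.start, extent.length)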

You can review the patches in https://gerrit.ovirt.org/ (I think you
need to create a user).

If you want to test this code, you can use git:
$ git clone https://gerrit.ovirt.org/ovirt-imageio
$ git fetch https://gerrit.ovirt.org/ovirt-imageio refs/changes/69/110069/1 && git checkout FETCH_HEAD

Then build and install imageio:
$ make rpm
$ dnf upgrade daemon/dist/*.rpm

Or you can install this build:
https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/3228/artifact/build-artifacts.py3.el8.x86_64/

To use this build, add the following repo file:

$ cat /etc/yum.repos.d/imageio-testing.repo
[ovirt-imageio-testing]
name=ovirt-imageio testing repo
baseurl=https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/3228/artifact/build-artifacts.py3.el8.x86_64/
enabled=1
gpgcheck=0

If you want to test the latest commits before they are released, you can
enable the ovirt-imageio-preview repo:

$ dnf copr enable nsoffer/ovirt-imageio-preview

Looking forward to your feedback.

Nir

On Wed, Jul 1, 2020 at 8:43 PM Nir Soffer <nsof...@redhat.com> wrote:
>
> On Tue, Jun 30, 2020 at 10:22 PM Michael Ablassmeier <a...@grinser.de> wrote:
> >
> > hi,
> >
> > On Tue, Jun 30, 2020 at 04:49:01PM +0300, Nir Soffer wrote:
> > > On Tue, Jun 30, 2020 at 10:32 AM Michael Ablassmeier <a...@grinser.de> 
> > > wrote:
> > > >  
> > > > https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6/extents
> > > >
> > > > As i failed to find them, are there any existing functions/api calls
> > > > that could be used to download only the used extents to a file/fifo
> > > > pipe?
> > >
> > > To use _internal.io.copy to copy the image to tape, we need to solve
> > > several issues:
> > >
> > > 1. how do you write the extents to tape so that you can extract them 
> > > later?
> > > 2. provide a backend that knows how to stream data to tape in the right 
> > > format
> > > 3. fix client.download() to consider the number of writers allowed by
> > > the backend,
> > >    since streaming to tape using multiple writers will not be possible.
> >
> > so, speaking as someone who works for a backup vendor, issues 1 and 2 are
> > already solved by our software, the backend is there, we just need a
> > way to extract the data from the API without storing it into a file
> > first. Something like:
> >
> >  backup_vm.py full <vm_uuid> pipe
> >
> > is already sufficient, as our backup client software would simply read
> > the data from the pipe, sending it to our backend which does all the
> > stuff regarding tape communication and format.
>
> Great, but piping the data is not so simple, see below.
>
> > The old implementation used the snapshot/attach feature, where our
> > backup client is reading directly from the attached storage device,
> > sending the data to the backend, which cares about multiplexing to tape,
> > possible deduplication, etc.
>
> In this case you read the complete disk, including the unallocated areas,
> which read as zeroes. This is not efficient, creating a lot of unnecessary
> I/O and network traffic on the way to the backup software, where you do
> deduplication etc.
>
> > Tape is not the only use case here; most of the time our customers want
> > to write data to storage devices which do not expose a regular file
> > system (such as dedup services, StoreOnce, Virtual Tape solutions etc).
> >
> > > To restore this backup, you need to:
> > > 1. find the tar in the tape (I have no idea how you would do this)
> > > 2. extract backup info from the tar
> > > 3. extract extents from the tar
> >
> > 1-3 are not an issue here and handled by our backend
> >
> > > 4. start an upload transfer
> > > 5. for each data extent:
> > >     read data from the tar member, and send to imageio using the right
> > >     offset and size
> >
> > that is some good information, so it is possible to create an empty disk
> > with the same size using the API and then directly send the extents with
> > their proper offset. How does it look with an incremental backup on top
> > of a just-restored full backup? Does the imageio backend automatically
> > rebase and commit the data from the incremental backup during upload?
>
> No, during upload you get a similar interface - you can write to any offset
> or zero a byte range.
>
> The imageio API is mostly like a remote file descriptor. Instead of an
> integer (fd=42) you get a random URL
> (https://host:port/images/efb761c6-2b06-4b46-bf50-2c40677ea419).
> Using URL you can read, write or zero a byte range.
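>
> For example, writing and zeroing with plain http.client looks roughly like
> this (untested sketch - the ticket path comes from the transfer URL, and the
> header and body details should be verified against the imageio docs):
>
>     # Rough sketch - host, port, CA file and ticket id are placeholders.
>     import json
>     import ssl
>     from http import client
>
>     ctx = ssl.create_default_context(cafile="ca.pem")
>     con = client.HTTPSConnection("transfer_node", 54322, context=ctx)
>     url = "/images/d471c659-889f-4e7f-b55a-a475649c48a6"
>
>     # Write 64 KiB of data at offset 0.
>     data = b"\0" * 65536
>     con.request("PUT", url, body=data,
>                 headers={"Content-Range": "bytes 0-65535/*"})
>     con.getresponse().read()
>
>     # Zero 1 GiB starting at offset 65536.
>     body = json.dumps({"op": "zero", "offset": 65536, "size": 1073741824})
>     con.request("PATCH", url, body=body.encode("utf-8"),
>                 headers={"Content-Type": "application/json"})
>     con.getresponse().read()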
>
> During restore, you need to write back the data that should be on the disk
> at a specific point in time.
>
> Ideally your backup software can provide a similar interface to pull data for
> a specific point in time, so you can push it to storage. If your backup
> software can only return data from a specific backup, you can restore the
> disk state using this flow (sketched in code below):
>
> 1. Copy data from the last full backup before the restore point to storage
> 2. For each incremental backup since that full backup:
>         copy data from the backup to storage
> 3. Zero all the areas that were not written in the previous steps.
>
> This is not the most efficient way since you may copy the same area
> several times, so this should ideally be handled by the backup software.
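>
> A rough sketch of this flow (backup.extents and backup.reader are
> placeholders for whatever your backup backend provides; backend is the
> imageio backend, buf is a reusable buffer):
>
>     def restore(backups, backend, disk_size, buf):
>         # backups: the last full backup first, then the incrementals.
>         written = []
>         for backup in backups:
>             for extent in backup.extents:
>                 backend.seek(extent.start)
>                 backend.read_from(backup.reader(extent), extent.length, buf)
>                 written.append((extent.start, extent.start + extent.length))
>
>         # Step 3: zero every range that was not written above.
>         pos = 0
>         for start, end in sorted(written):
>             if start > pos:
>                 backend.seek(pos)
>                 backend.zero(start - pos)
>             pos = max(pos, end)
>         if pos < disk_size:
>             backend.seek(pos)
>             backend.zero(disk_size - pos)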
>
> > As I understand it, requesting the extents directly and writing them to
> > a file leaves you with an image in raw format, which then needs to be
> > properly re-aligned with zeros and converted to qcow2, to be able to
> > commit any of the incremental backups I have stored somewhere.
>
> If you write the extent to a file in raw format, you will have holes
> in the image.
> If you want to pipe the data you cannot have holes, unless you want to 
> generate
> zeroes for the holes, and pipe the zeroes, which is not efficient.
>
> Example:
>
>     [
>         {"start": 0, "length": 65536, "zero": False},
>         {"start": 65536, "length": 1073741824, "zero": True},
>     ]
>
> If you pipe the zeroes you are going to push 1 GiB of zeroes to your pipe.
>
> This cannot work for incremental backup since in this case you get only the
> extents that were modified since the last backup, and you cannot fill the 
> space
> between these extents with zeros.
>
>     [
>         {"start": 0, "length": 65536, "dirty": True},
>         {"start": 65536, "length": 1073741824, "dirty": False},
>     ]
>
> You must preserve the hole, so when you restore you can skip this extent.
>
> If you want to pipe the data, you must encode the data in some way so you
> can push the data and the holes to your pipe.
>
> One way that we considered in the past is to support a chunked-like format:
> a stream of data extents and hole extents.
>
> For example:
>
> data 0000000040000000\r\n
> <1 GiB of data>\r\n
> hole 0000000000100000\r\n
> \r\n
>
> This is similar to the incremental backup provided by ceph:
> https://docs.ceph.com/docs/master/dev/rbd-diff/
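>
> On the receiving side of the pipe, parsing such a stream could look roughly
> like this (the framing is assumed from the example above: record type, space,
> 16 hex digits, CRLF; data records are followed by the payload and CRLF; an
> empty line ends the stream):
>
>     def read_sparse_stream(reader):
>         while True:
>             header = reader.readline()
>             if header in (b"", b"\r\n"):
>                 return  # end of stream
>             kind, length = header.split()
>             length = int(length, 16)
>             if kind == b"data":
>                 payload = reader.read(length)  # sketch: assume a full read
>                 reader.readline()  # consume the CRLF after the payload
>                 yield kind, length, payload
>             else:
>                 yield kind, length, None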
>
> We did not implement it since providing a list of extents and a way to
> read the extents seems like a more generic solution that can make it easier
> to integrate with many backup vendors that may use different solutions to
> store and manage the data.
>
> So you can read data from imageio and push it to your pipe in a similar
> format. If you do this with the http backend, a better way to pipe the data
> would be:
>
>     backend.write_to(writer, length, buf)
>
> which accepts an object implementing write(buf), and pushes length bytes from
> the server to this object. Your writer can be sys.stdout if you want to pipe
> the backup to some other process.
>
> In this case your backup loop may be:
>
>    for extent in extents:
>        write_extent_header(writer, extent)
>        if not extent.zero:
>            backend.write_to(writer, extent.length, buf)
>
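> where write_extent_header() could emit the data/hole records from the stream
> format above, for example:
>
>    def write_extent_header(writer, extent):
>        # Record framing assumed from the data/hole example above.
>        kind = b"hole" if extent.zero else b"data"
>        writer.write(b"%s %016x\r\n" % (kind, extent.length))
>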
> And your restore loop would be something like:
>
>    for extent in extents:
>        backend.seek(extent.start)
>        if extent.zero:
>            backend.zero(extent.length)
>        else:
>            backend.read_from(reader, extent.length, buf)
>
> read_from() is like write_to(), but works in the other direction.
>
> > As during
> > upload a convert is possible, that means we don't have to rebuild the
> > full/inc chain using a temporary file which we then upload?
>
> If your backup backend can stream the data for a specific point in
> time, considering
> all the backups since the last full backup, you don't need any temporary 
> files.
>
> The convert step in upload is done on the server side. The upload pipeline is:
>
>    backup storage -> restore program -> imageio server -> qemu-nbd -> volume
>
> The imageio server accepts write and zero requests and converts them to
> NBD_CMD_WRITE and NBD_CMD_WRITE_ZEROES requests to qemu-nbd, and qemu-nbd
> writes the data and the zeroes to the image using the qcow2 or raw drivers.
>
> The backup pipeline is similar:
>
>     volume -> qemu -> imageio server -> backup program -> backup storage
>
> The imageio server accepts extents requests and converts them to
> NBD_CMD_BLOCK_STATUS requests to qemu. Then it accepts read requests,
> converts them to NBD_CMD_READ requests to qemu, and returns the data
> qemu returns.
>
> > > So the missing part is to create a connection to imageio and read the
> > > data.
> > >
> > > The easiest way is to use imageio._internal.backends.http, but note that
> > > this is internal now, so you should not use it outside of imageio. It is
> > > fine for writing a proof of concept, and if you can show a good use case
> > > we can work on a public API.
> >
> > yes, that is what I noticed. My current solution would be to use the
> > internal functions to query the extent information and then continue
> > extracting them, to be able to pipe the data into our backend.
> >
> > > You can write this using http.client.HTTPSConnection without using
> > > the http backend, but it would be a lot of code.
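> > >
> > > For example, just getting the extents and reading the data extents is
> > > roughly (untested sketch - check the exact paths and parameters against
> > > the imageio docs; context can be "zero" for allocation or "dirty" for
> > > blocks changed since the last checkpoint):
> > >
> > >     import json
> > >     import ssl
> > >     from http import client
> > >
> > >     ctx = ssl.create_default_context(cafile="ca.pem")
> > >     con = client.HTTPSConnection("transfer_node", 54322, context=ctx)
> > >     url = "/images/d471c659-889f-4e7f-b55a-a475649c48a6"
> > >
> > >     con.request("GET", url + "/extents?context=zero")
> > >     extents = json.loads(con.getresponse().read())
> > >
> > >     for ext in extents:
> > >         if not ext["zero"]:
> > >             end = ext["start"] + ext["length"] - 1
> > >             con.request("GET", url, headers={
> > >                 "Range": "bytes=%d-%d" % (ext["start"], end)})
> > >             data = con.getresponse().read()
> > >             # ... hand the data off to the backup backend here.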
> >
> > thanks for your example, i will give it a try during POC implementation.
> >
> > > We probably need to expose the backends or a simplified interface
> > > in the client public API to make it easier to write such applications.
> > >
> > > Maybe something like:
> > >
> > >      client.copy(src, dst)
> > >
> > > Where src and dst are objects implementing the imageio backend interface.
> > >
> > > But before we do this we need to see some examples of real programs
> > > using imageio, to understand the requirements better.
> >
> > the main feature for us would be to be able to read the data and
> > pipe it somewhere, which works by using the _internal api
> > functions, but having a stable interface for it would be really
> > good for any kind of backup vendor to implement a client for
> > the new api into their software.
>
> The main challenge is to find a generic format supporting streaming that most
> vendors can use. If we have such a format we can support it in the client
> and in the server.
>
> For example we can provide:
>
>     GET /images/xxx-yyy?format=sparse&context=zero
>
> This can return a stream of data/zero extents that can be piped using
> standard tools like curl.
>
> And we can support restore using:
>
>    PUT /images/xxx-yyy?format=sparse
>
> So you can push the same stream back - using one request.
>
> The disadvantage is that your system must understand this sparse format:
> parse it during backup, and maybe construct it during restore.
>
> If this looks like a useful way, please file an RFE to implement it.
>
> > If anyone is interested in hearing more thoughts about that, also from
> > Red Hat, don't hesitate to contact me directly to set up a call.
>
> Good idea.
>
> Cheers,
> Nir