Michael,

Continuing the discussion, I posted an RFC patch adding a public ImageioClient
class: https://gerrit.ovirt.org/c/110068
This is an early version for discussion, the API may change based on the
feedback we get from users.

I posted an example showing how the client can be used:
https://gerrit.ovirt.org/c/110069

Because the streaming use case seems to be what you want, I used the stream
format mentioned in the previous mail for this example.

You can review the patches here or in https://gerrit.ovirt.org/ (I think you
need to create a user).

If you want to test this code, you can use git:

$ git clone https://gerrit.ovirt.org/ovirt-imageio
$ git fetch https://gerrit.ovirt.org/ovirt-imageio refs/changes/69/110069/1 && git checkout FETCH_HEAD

Then build and install imageio:

$ make rpm
$ dnf upgrade daemon/dist/*.rpm

Or you can install this build:
https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/3228/artifact/build-artifacts.py3.el8.x86_64/

by adding this repo file:

$ cat /etc/yum.repos.d/imageio-testing.repo
[ovirt-imageio-testing]
name=ovirt-imageio testing repo
baseurl=https://jenkins.ovirt.org/job/ovirt-imageio_standard-check-patch/3228/artifact/build-artifacts.py3.el8.x86_64/
enabled=1
gpgcheck=0

If you want to test the latest commits before they are released, you can
enable the ovirt-imageio-preview repo:

$ dnf copr enable nsoffer/ovirt-imageio-preview

Looking forward to your feedback.

Nir

On Wed, Jul 1, 2020 at 8:43 PM Nir Soffer <nsof...@redhat.com> wrote:
>
> On Tue, Jun 30, 2020 at 10:22 PM Michael Ablassmeier <a...@grinser.de> wrote:
> >
> > hi,
> >
> > On Tue, Jun 30, 2020 at 04:49:01PM +0300, Nir Soffer wrote:
> > > On Tue, Jun 30, 2020 at 10:32 AM Michael Ablassmeier <a...@grinser.de> wrote:
> > > >
> > > > https://tranfer_node:54322/images/d471c659-889f-4e7f-b55a-a475649c48a6/extents
> > > >
> > > > As i failed to find them, are there any existing functions/api calls
> > > > that could be used to download only the used extents to a file/fifo
> > > > pipe?
> > >
> > > To use _internal.io.copy to copy the image to tape, we need to solve
> > > several issues:
> > >
> > > 1. how do you write the extents to tape so that you can extract them later?
> > > 2. provide a backend that knows how to stream data to tape in the right format
> > > 3. fix client.download() to consider the number of writers allowed by the backend,
> > >    since streaming to tape using multiple writers will not be possible.
> >
> > so, speaking as someone who works for a backup vendor, issues 1 and 2 are
> > already solved by our software, the backend is there, we just need a
> > way to extract the data from the api without storing it into a file
> > first. Something like:
> >
> > backup_vm.py full <vm_uuid> pipe
> >
> > is already sufficient, as our backup client software would simply read
> > the data from the pipe, sending it to our backend which does all the
> > stuff regarding tape communication and format.
>
> Great, but piping the data is not so simple, see below.
>
> > The old implementation used the snapshot/attach feature, where our
> > backup client is reading directly from the attached storage device,
> > sending the data to the backend, which cares about multiplexing to tape,
> > possible deduplication, etc..
>
> In this case you read a complete disk, including the unallocated areas which
> read as zeroes. This is not efficient, creating lots of I/O and network
> bandwidth on the way to the backup software, where you do deduplication etc.
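A minimal sketch of what such an extent-aware download could look like, using
only http.client from the standard library (as suggested later in this thread).
This is not the imageio client API: the host name, CA file and ticket UUID are
placeholders based on the /extents URL quoted above, it assumes the server
allows ranged reads and returns the extents in order, and a real client would
read large extents in chunks instead of whole:

import json
import ssl
from http import client

HOST = "transfer_node"  # placeholder transfer node
PORT = 54322
TICKET = "d471c659-889f-4e7f-b55a-a475649c48a6"  # placeholder ticket UUID

ctx = ssl.create_default_context(cafile="ca.pem")  # placeholder CA file
con = client.HTTPSConnection(HOST, PORT, context=ctx)

# Each extent has "start", "length" and "zero" fields.
con.request("GET", "/images/{}/extents".format(TICKET))
extents = json.loads(con.getresponse().read())

with open("disk.raw", "wb") as out:
    # Extend the file to the full image size; unread areas stay as holes.
    out.truncate(extents[-1]["start"] + extents[-1]["length"])
    for extent in extents:
        if extent["zero"]:
            continue  # skip holes; only allocated data crosses the network
        end = extent["start"] + extent["length"] - 1
        con.request("GET", "/images/{}".format(TICKET),
                    headers={"Range": "bytes={}-{}".format(extent["start"], end)})
        res = con.getresponse()
        out.seek(extent["start"])
        # A real client would stream the body in chunks instead of res.read().
        out.write(res.read())

con.close()

The result is a sparse raw image: only the allocated extents are transferred,
which is exactly the difference from the snapshot/attach flow discussed above.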
>
> > Tape is not the only use case here, most of the time our customers want
> > to write data to storage devices which do not expose a regular file
> > system (such as dedup services, StoreOnce, Virtual Tape solutions etc).
> >
> > > To restore this backup, you need to:
> > > 1. find the tar in the tape (I have no idea how you would do this)
> > > 2. extract backup info from the tar
> > > 3. extract extents from the tar
> >
> > 1-3 are not an issue here and are handled by our backend
> >
> > > 4. start an upload transfer
> > > 5. for each data extent:
> > >    read data from the tar member, and send to imageio using the right
> > >    offset and size
> >
> > that is some good information, so it is possible to create an empty disk
> > with the same size using the API and then directly send the extents with
> > their proper offset. How does it look with an incremental backup on top
> > of a just restored full backup? Does the imageio backend automatically
> > rebase and commit the data from the incremental backup during upload?
>
> No, during upload you get a similar interface - you can write to any offset
> or zero a byte range.
>
> imageio API is mostly like a remote file descriptor. Instead of an integer
> (fd=42) you get a random URL
> (https://host:port/images/efb761c6-2b06-4b46-bf50-2c40677ea419).
> Using the URL you can read, write or zero a byte range.
>
> During restore, you need to write back the data that should be on the disk
> at a specific point in time.
>
> Ideally your backup software can provide a similar interface to pull data for
> a specific point in time, so you can push it to storage. If your backup
> software can only return data from a specific backup, you can restore the
> disk state using this flow:
>
> 1. Copy data from the last full backup before the restore point to storage
> 2. For each incremental backup since that full backup:
>    copy data from the backup to storage
> 3. Zero all the areas that were not written in the previous steps.
>
> This is not the most efficient way since you may copy the same area
> several times, so this should ideally be handled by the backup software.
>
> > As i understand it, requesting the extents directly and writing them to
> > a file, leaves you with an image in raw format, which then needs to be
> > properly re-aligned with zeros and converted to qcow2, being able to
> > commit any of the incremental backups i have stored somewhere.
>
> If you write the extents to a file in raw format, you will have holes in
> the image. If you want to pipe the data you cannot have holes, unless you
> want to generate zeroes for the holes, and pipe the zeroes, which is not
> efficient.
>
> Example:
>
> [
>     {"start": 0, "length": 65536, "zero": False},
>     {"start": 65536, "length": 1073741824, "zero": True},
> ]
>
> If you pipe the zeros you are going to push 1 GiB of zeros to your pipe.
>
> This cannot work for incremental backup, since in this case you get only the
> extents that were modified since the last backup, and you cannot fill the
> space between these extents with zeros:
>
> [
>     {"start": 0, "length": 65536, "dirty": True},
>     {"start": 65536, "length": 1073741824, "dirty": False},
> ]
>
> You must preserve the hole, so when you restore you can skip this extent.
>
> If you want to pipe the data, you must encode the data in some way so you
> can push the data and the holes to your pipe.
>
> One way that we considered in the past is to support a chunked-like format,
> a stream of data extents and hole extents.
>
> For example:
>
> data 0000000040000000\r\n
> <1 GiB of data>\r\n
> hole 0000000000100000\r\n
> \r\n
>
> This is similar to the incremental backup provided by ceph:
> https://docs.ceph.com/docs/master/dev/rbd-diff/
>
> We did not implement it since providing a list of extents and a way to read
> the extents seems a more generic solution that can make it easier to
> integrate with many backup vendors that may use different solutions to store
> and manage the data.
>
> So you can read data from imageio and push it to your pipe in a similar
> format. If you do this with the http backend, a better way to pipe the data
> would be:
>
> backend.write_to(writer, length, buf)
>
> which accepts an object implementing write(buf), and pushes length bytes from
> the server to this object. Your writer can be sys.stdout if you want to pipe
> the backup to some other process.
>
> In this case your backup loop may be:
>
> for extent in extents:
>     write_extent_header(writer, extent)
>     if not extent.zero:
>         backend.write_to(writer, extent.length, buf)
>
> And your restore loop would be something like:
>
> for extent in extents:
>     backend.seek(extent.start)
>     if extent.zero:
>         backend.zero(extent.length)
>     else:
>         backend.read_from(reader, extent.length, buf)
>
> read_from() is like write_to(), but works in the other direction.
>
> > As during
> > upload, a convert is possible, that means we don't have to rebuild the
> > full/inc chain using a temporary file which we then upload?
>
> If your backup backend can stream the data for a specific point in time,
> considering all the backups since the last full backup, you don't need any
> temporary files.
>
> The convert step in upload is done on the server side. The upload pipeline is:
>
> backup storage -> restore program -> imageio server -> qemu-nbd -> volume
>
> imageio server accepts write and zero requests and converts them to
> NBD_CMD_WRITE and NBD_CMD_WRITE_ZEROES requests to qemu-nbd, and qemu-nbd
> writes the data and the zeros to the image using the qcow2 or raw drivers.
>
> The backup pipeline is similar:
>
> volume -> qemu -> imageio server -> backup program -> backup storage
>
> imageio server accepts extents requests and converts them to
> NBD_CMD_BLOCK_STATUS requests to qemu. Then it accepts read requests,
> converts them to NBD_CMD_READ requests to qemu, and returns the data qemu
> returns.
>
> > > So the missing part is to create a connection to imageio and reading the
> > > data.
> > >
> > > The easiest way is to use imageio._internal.backends.http, but note that
> > > this is internal now, so you should not use it outside of imageio. It is
> > > fine for writing a proof of concept, and if you can show a good use case
> > > we can work on a public API.
> >
> > yes, that is what i noticed. My current solution would be to use the
> > internal functions to query the extent information and then continue
> > extracting them, to be able to pipe the data into our backend.
> >
> > > You can write this using http.client.HTTPSConnection without using
> > > the http backend, but it would be a lot of code.
> >
> > thanks for your example, i will give it a try during POC implementation.
> >
> > > We probably need to expose the backends or a simplified interface
> > > in the client public API to make it easier to write such applications.
> > >
> > > Maybe something like:
> > >
> > > client.copy(src, dst)
> > >
> > > Where src and dst are objects implementing the imageio backend interface.
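To make the stream format above concrete, here is a sketch of a backup loop
that frames extents this way and pipes them to stdout. The exact framing
(record name, 16 hex digits, CRLF) is only an interpretation of the example,
and read_at(offset, length) is a stand-in for whatever reads a byte range from
imageio, not a real API:

import sys

def stream_backup(extents, read_at, writer=sys.stdout.buffer):
    for extent in extents:
        if extent["zero"]:
            # A hole carries only its length, no payload.
            writer.write(b"hole %016x\r\n\r\n" % extent["length"])
        else:
            writer.write(b"data %016x\r\n" % extent["length"])
            # Stream the payload in chunks to avoid holding a whole extent in memory.
            offset = extent["start"]
            todo = extent["length"]
            while todo:
                chunk = read_at(offset, min(todo, 8 * 1024**2))
                writer.write(chunk)
                offset += len(chunk)
                todo -= len(chunk)
            writer.write(b"\r\n")

A matching parser for the restore direction is sketched at the end of this
mail.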
> > >
> > > But before we do this we need to see some examples of real programs
> > > using imageio, to understand the requirements better.
> >
> > the main feature for us would be to be able to read the data and
> > pipe it somewhere, which works by using the _internal api
> > functions, but having a stable interface for it would be really
> > good for any kind of backup vendor to implement a client for
> > the new api into their software.
>
> The main challenge is to find a generic format supporting streaming that
> most vendors can use. If we have such a format we can support it in the
> client and in the server.
>
> For example we can provide:
>
> GET /images/xxx-yyy?format=sparse&context=zero
>
> This can return a stream of data/zero extents that can be piped using
> standard tools like curl.
>
> And we can support restore using:
>
> PUT /images/xxx-yyy?format=sparse
>
> So you can push the same stream back - using one request.
>
> The disadvantage is that your system must understand this sparse format:
> parse it during backup, and maybe construct it during restore.
>
> If this looks like a useful way please file an RFE to implement it.
>
> > If anyone is interested to hear more thoughts about that, also from
> > redhat, don't hesitate to contact me directly for having a call.
>
> Good idea.
>
> Cheers,
> Nir
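For completeness, a sketch of the restore direction for the same stream: parse
the data/hole records and replay them through a backend-like object offering
seek(offset), write(buf) and zero(length), mirroring the restore loop quoted
above. The backend object and the framing are assumptions, not public imageio
API; the reader can be any binary stream, for example sys.stdin.buffer if the
stream is piped in:

def restore_stream(reader, backend):
    offset = 0
    while True:
        header = reader.readline()
        if not header:
            break  # end of stream
        kind, hexlen = header.rstrip(b"\r\n").split(b" ", 1)
        length = int(hexlen, 16)
        backend.seek(offset)
        if kind == b"hole":
            backend.zero(length)
            reader.readline()  # consume the empty line after a hole record
        else:
            todo = length
            while todo:
                chunk = reader.read(min(todo, 8 * 1024**2))
                if not chunk:
                    raise IOError("incomplete stream")
                backend.write(chunk)
                todo -= len(chunk)
            reader.readline()  # consume the "\r\n" after the payload
        offset += length

If the proposed format=sparse requests were implemented, the backup side of
this could be piped with standard tools like curl, and only this small parser
would be needed on the restore side.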