JoaoJandre commented on PR #12758: URL: https://github.com/apache/cloudstack/pull/12758#issuecomment-4025239234
> Hi Joao,

Hello, @abh1sar

> This looks promising. Incremental backups, quick restore and file restore features have been missing from CloudStack KVM.
>
> I am having trouble understanding some of the design choices though:
>
> 1. What’s the reason behind strong coupling with secondary storage?
>
> * I am wondering if the Backup Repository will provide a more flexible alternative. The user would be free to add an external storage server or use the secondary storage by simply adding it as a backup repository? It will be very easy for the user to have multiple backup repositories attached to multiple backup offerings, which can be assigned to instances as required.

I don't see why we should force the coupling of backup offerings with backup repositories; what is the benefit?

> This will also be consistent with other backup providers like Veeam and NAS which have the concept of backup repository.
> The backup repository feature also comes with a separate capacity tracking and email alerts.

The secondary storage also has both of these features, although the capacity is currently not reported to users.

> * If a secondary storage is needed just for backup’s purpose, how will it be ensured that templates and snapshots are not copied over to it?

The secondary storage selectors feature (introduced in 2023 through #7659) allows you to specialize secondary storages. Quoting from that PR's description: `"This PR aims to add the possibility to direct resources (Volumes, Templates, Snapshots and ISOs) to a specific secondary storage through rules written in JavaScript that will only affect new allocated resources"`. For a few years now, it has been possible to have secondary storages that only receive snapshots or templates, for example. This PR introduces the possibility of adding selectors for backups as well, so that you can have secondary storages that are dedicated to backups.
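To illustrate the idea, a backup selector rule could be a small piece of JavaScript that picks a store from the candidate list. This is only a sketch: the variable names and object shape (`secondaryStorages`, `usedDiskSize`, `totalDiskSize`) are assumptions for illustration, not the exact bindings CloudStack injects — see #7659 and the secondary storage selectors documentation for the real contract.

```javascript
// Hypothetical sketch of a selector rule: given a list of candidate
// secondary storages, return the id of the least-used one.
// The object shape (id, usedDiskSize, totalDiskSize) is an assumption
// for illustration, not the exact binding CloudStack provides.
function pickBackupStore(secondaryStorages) {
  // Keep only stores that still have free space, then sort by
  // usage ratio, ascending.
  const candidates = secondaryStorages
    .filter(s => s.usedDiskSize < s.totalDiskSize)
    .sort((a, b) =>
      (a.usedDiskSize / a.totalDiskSize) - (b.usedDiskSize / b.totalDiskSize));
  return candidates.length > 0 ? candidates[0].id : null;
}

// Example with two mock stores; the second is less used, so it is picked.
const mockStores = [
  { id: "store-a", usedDiskSize: 80, totalDiskSize: 100 },
  { id: "store-b", usedDiskSize: 20, totalDiskSize: 100 },
];
console.log(pickBackupStore(mockStores)); // prints "store-b"
```

The point is that the operator, not the offering, decides where backups land, and the rule can encode any policy (least-used store, per-domain routing, and so on).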
Furthermore, my colleagues are working on a feature to allow using alternative secondary storage solutions, such as CephFS, iSCSI and S3, while preserving compatibility with features designed for NFS storages. This feature may be extended in the future to allow essentially any type of secondary storage. Thus, the flexibility of secondary storages will soon grow.

> 2. About Qemu compression
>
> * Have you measured / compared the performance of qemu-img compression with other compression methods?

Using any other type of backup-level compression will be worse than using qemu-img compression. This is because, when restoring a backup, we must have access to the whole backing chain. If we use other types of compression, we have to decompress the whole chain before restoring. With qemu-img, the backing files remain valid qcow2 images and never need to be decompressed at all. This is the great benefit of using qemu-img.

In any case, here is a brief comparison between qemu-img using the zstd library with 8 threads and the `pigz` implementation of multi-threaded compression, also using 8 threads. The original file is the root volume of a VM that I use.

| Command | Time | Original file size | Final file size |
| --- | --- | --- | --- |
| `qemu-img convert -c -p -W -m 8 -f qcow2 -O qcow2 -o compression_type=zstd` | real 3m51.944s - user 16m11.970s - sys 4m14.987s | 43G | 35G |
| `pigz -p8` | real 6m13.799s - user 44m33.300s - sys 1m54.801s | 43G | 34G |
| `pigz --zip -p8` | real 6m2.729s - user 44m38.401s - sys 1m47.663s | 43G | 34G |

Compression using qemu-img was a lot faster, with a slightly smaller compression ratio. **Furthermore, we have to consider that the qemu-img compressed image can be used as-is, while the other images must be decompressed, further adding to the processing time of backing up/restoring a backup.**

> * As I understand, qemu-img compresses the qcow2 file at a cluster granularity (usually 64kb).
> That might not fare well when compared to storage level compression. In production environments, the operator might choose to have compression at the storage layer if they are using an enterprise storage like NetApp. Even something open source like ZFS might perform better than qemu-img compress due to the granularity limitation that qemu compression has.

The compression feature is optional: if you are using storage-level compression, you probably will not use backup-level compression. However, many environments do not have storage-level compression, so having the option of backup-level compression is still very interesting.

> * I am making this point because the compression part is introducing a fair bit of complexity due to the interaction with SSVM, and I am just wondering if the gains are worth the trouble and should compression be offloaded to the storage completely.

The compression does not add any interaction with the SSVM.

> 3. Do we need a separate backup offering table and api?
>
> * Why not add column or details to backup_offering or backup_offering_details? Other offerings can also benefit from these settings.

I did not want to add dozens of parameters to the import backup offering API that are only really going to be used by one provider; this way, the original design of the API is preserved. Furthermore, you may note that the APIs are intentionally not called `createKnibBackupOffering`, but `createNativeBackupOffering`. If other native providers want to use these offerings, they may do so by extending their implementations.

> 4. What’s the reason behind using virDomainSnapshotCreate to create backup files and not virDomainBackupBegin like incremental volume snapshots and NAS backup?
>
> * Did you face any issues with checkpoints and bitmaps?

There are two main issues with using bitmaps:

1. They are prone to corruption. While this can be mitigated in some ways, since the incremental volume snapshot feature was added we have noticed multiple cases of bitmap corruption with different causes. It is possible to detect the corruption and delete corrupt bitmaps, but this would add more complexity to the feature.
2. Using bitmaps is not compatible with the file-based incremental VM snapshot feature added in #10632.

After some internal discussion and feedback from users, we have come to the conclusion that being able to use both the incremental VM snapshot and backup features at the same time is very interesting.

At the end of the day, this PR adds a new backup provider option for users, who will be free to choose the provider that best fits their needs. This is one of the reasons why it was done as a new backup provider; KNIB and the other backup providers do not have to cancel each other out.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
