JoaoJandre commented on PR #12758: URL: https://github.com/apache/cloudstack/pull/12758#issuecomment-4025239234
> Hi Joao,

Hello, @abh1sar

> This looks promising. Incremental backups, quick restore and file restore features have been missing from CloudStack KVM.
>
> I am having trouble understanding some of the design choices though:
>
> 1. What’s the reason behind strong coupling with secondary storage?
>
> * I am wondering if the Backup Repository will provide a more flexible alternative. The user would be free to add an external storage server or use the secondary storage by simply adding it as a backup repository? It will be very easy for the user to have multiple backup repositories attached to multiple backup offerings, which can be assigned to instances as required.

I don't see why we should force the coupling of backup offerings with backup repositories; what is the benefit?

> This will also be consistent with other backup providers like Veeam and NAS which have the concept of backup repository.
> The backup repository feature also comes with a separate capacity tracking and email alerts.

The secondary storage also has both of these features, although the capacity is currently not reported to users.

> * If a secondary storage is needed just for backup’s purpose, how will it be ensured that templates and snapshots are not copied over to it?

The secondary storage selectors feature (introduced in 2023 through #7659) allows you to specialize secondary storages. Quoting from that PR's description: `"This PR aims to add the possibility to direct resources (Volumes, Templates, Snapshots and ISOs) to a specific secondary storage through rules written in JavaScript that will only affect new allocated resources"`. For a few years now, it has been possible to have secondary storages that only receive snapshots or templates, for example. This PR introduces the possibility of adding selectors for backups as well, so that you can have secondary storages that are dedicated to backups.
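To illustrate the idea, a backup selector rule could be a small piece of JavaScript that picks a store from the candidate list. This is only a sketch: the variable names and object shape (`secondaryStorages`, `usedDiskSize`, `totalDiskSize`) are assumptions for illustration, not the exact bindings CloudStack injects — see #7659 and the secondary storage selectors documentation for the real contract.

```javascript
// Hypothetical sketch of a selector rule: given a list of candidate
// secondary storages, return the id of the least-used one.
// The object shape (id, usedDiskSize, totalDiskSize) is an assumption
// for illustration, not the exact binding CloudStack provides.
function pickBackupStore(secondaryStorages) {
  // Keep only stores that still have free space, then sort by
  // usage ratio, ascending.
  const candidates = secondaryStorages
    .filter(s => s.usedDiskSize < s.totalDiskSize)
    .sort((a, b) =>
      (a.usedDiskSize / a.totalDiskSize) - (b.usedDiskSize / b.totalDiskSize));
  return candidates.length > 0 ? candidates[0].id : null;
}

// Example with two mock stores; the second is less used, so it is picked.
const mockStores = [
  { id: "store-a", usedDiskSize: 80, totalDiskSize: 100 },
  { id: "store-b", usedDiskSize: 20, totalDiskSize: 100 },
];
console.log(pickBackupStore(mockStores)); // prints "store-b"
```

The point is that the operator, not the offering, decides where backups land, and the rule can encode any policy (least-used store, per-domain routing, and so on).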
Furthermore, my colleagues are working on a feature to allow using alternative secondary storage solutions, such as CephFS, iSCSI and S3, while preserving compatibility with features designed for NFS storages. This feature may be extended in the future to allow essentially any type of secondary storage. Thus, the flexibility of secondary storages will soon grow.

> 2. About Qemu compression
>
> * Have you measured / compared the performance of qemu-img compression with other compression methods?

Using any other type of backup-level compression will be worse than using qemu-img compression. This is because, when restoring a backup, we must have access to the whole backing chain. If we use other types of compression, we have to decompress the whole chain before restoring. With qemu-img, the backing files remain valid qcow2 images and never need to be decompressed at all. This is the great benefit of using qemu-img.

In any case, here is a brief comparison between qemu-img using the zstd library with 8 threads and the `pigz` implementation of multi-threaded compression, also using 8 threads. The original file is the root volume of a VM that I use.

| Command | Time | Original file size | Final file size |
| --- | --- | --- | --- |
| `qemu-img convert -c -p -W -m 8 -f qcow2 -O qcow2 -o compression_type=zstd` | real 3m51.944s - user 16m11.970s - sys 4m14.987s | 43G | 35G |
| `pigz -p8` | real 6m13.799s - user 44m33.300s - sys 1m54.801s | 43G | 34G |
| `pigz --zip -p8` | real 6m2.729s - user 44m38.401s - sys 1m47.663s | 43G | 34G |

Compression using qemu-img was a lot faster, with a slightly smaller compression ratio. **Furthermore, we have to consider that the qemu-img compressed image can be used as-is, while the other images must be decompressed, further adding to the processing time of backing up/restoring a backup.**

> * As I understand, qemu-img compresses the qcow2 file at a cluster granularity (usually 64kb).
> That might not fare well when compared to storage level compression. In production environments, the operator might choose to have compression at the storage layer if they are using an enterprise storage like NetApp. Even something open source like ZFS might perform better than qemu-img compress due to the granularity limitation that qemu compression has.

The compression feature is optional: if you are using storage-level compression, you probably will not use backup-level compression. However, many environments do not have storage-level compression, so having the option of backup-level compression is still very interesting.

> * I am making this point because the compression part is introducing a fair bit of complexity due to the interaction with SSVM, and I am just wondering if the gains are worth the trouble and should compression be offloaded to the storage completely.

The compression does not add any interaction with the SSVM.

> 3. Do we need a separate backup offering table and api?
>
> * Why not add column or details to backup_offering or backup_offering_details? Other offerings can also benefit from these settings.

I did not want to add dozens of parameters to the import backup offering API that are only really going to be used by one provider; this way, the original design of the API is preserved. Furthermore, you may note that the APIs are intentionally not called `createKnibBackupOffering`, but `createNativeBackupOffering`. If other native providers want to use these offerings, they may do so by extending their implementations.

> 4. What’s the reason behind using virDomainSnapshotCreate to create backup files and not virDomainBackupBegin like incremental volume snapshots and NAS backup?
>
> * Did you face any issues with checkpoints and bitmaps?

There are two main issues with using bitmaps:

1. They are prone to corruption. While this can be mitigated in some ways, since the incremental volume snapshot feature was added we have noticed multiple cases of bitmap corruption with different causes. It is possible to detect the corruption and delete corrupt bitmaps, but this would add more complexity to the feature.
2. Using bitmaps is not compatible with the file-based incremental VM snapshot feature added in #10632.

After some internal discussion and feedback from users, we have come to the conclusion that being able to use both the incremental VM snapshot and backup features at the same time is very interesting.

At the end of the day, this PR adds a new backup provider option for users, who will be free to choose the provider that best fits their needs. This is one of the reasons why it was done as a new backup provider; KNIB and the other backup providers do not have to cancel each other out.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
