GutoVeronezi commented on issue #6836: URL: https://github.com/apache/cloudstack/issues/6836#issuecomment-1285524454
Hello, @whitetiger264

KVM disk snapshots in ACS are divided into two processes: taking the snapshot, and backing up the snapshot to the secondary storage. ACS 4.17.0.0 introduced a new way of taking disk snapshots for KVM (you can read the full specification in #5124). In this new process, a delta is created for the disk, the base file is copied to the snapshot directory in the primary storage, `<primary-storage-path>/snapshots/<snapshot-uuid>` (which generates data transfer between the KVM host and the primary storage), and the delta is merged into the base file. Then, if the first process finishes successfully, the second process is executed according to your configuration.

[Diagram: create snapshot process]

The timeout error you see in the MS is related to the job that waits for the result from the agent; in the KVM agent, the process keeps running (as there is no timeout configured for the copy base file step) and finishes successfully, as we can see in https://github.com/apache/cloudstack/issues/6836#issuecomment-1283521520. Therefore, it took more than 1 hour to copy the file `635fd5b5-bf14-4cf4-979c-be8a649d5189` (15G) to the snapshot directory, and the job in the MS expired.

In https://github.com/apache/cloudstack/issues/6836#issuecomment-1283540475 you said that your KVM host and the MS are connected via a public network. Are the KVM host and the primary storage also connected via a public network? What is the bandwidth of the connection between the KVM host and the primary storage?
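
To put the numbers above in perspective, here is a rough back-of-the-envelope sketch of the throughput they imply. It assumes that "15G" means 15 GiB and that the copy took at least 3600 seconds. The function names are only illustrative and are not part of ACS.

```python
GIB = 1024 ** 3  # assumption: "15G" is read as 15 GiB


def max_throughput_mbit_s(size_bytes: int, elapsed_seconds: float) -> float:
    """Upper bound on the average throughput (Mbit/s) when copying
    size_bytes took at least elapsed_seconds."""
    return size_bytes * 8 / elapsed_seconds / 1_000_000


def copy_time_seconds(size_bytes: int, bandwidth_mbit_s: float) -> float:
    """Ideal time to copy size_bytes over a link of bandwidth_mbit_s,
    ignoring protocol overhead and storage latency."""
    return size_bytes * 8 / (bandwidth_mbit_s * 1_000_000)


base_file_size = 15 * GIB  # size of the base file reported in the issue
one_hour = 3600            # the copy took more than this many seconds

bound = max_throughput_mbit_s(base_file_size, one_hour)
print(f"Average throughput was below {bound:.1f} Mbit/s")

# For comparison, the ideal copy time over a hypothetical 1 Gbit/s link.
print(f"Ideal copy time at 1 Gbit/s: {copy_time_seconds(base_file_size, 1000):.0f} s")
```

Under these assumptions, the average rate between the KVM host and the primary storage would have been below roughly 36 Mbit/s, which is why the bandwidth of that connection matters here.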
