Hi,
Please find my review comments on the Snapshot Improvements Functional Spec:
1. Is this spec applicable to all hypervisors, or is it specific to Xen?
2. What is the maximum number of concurrent snapshots that can run on a host?
Is there any limit on the threshold? What are the max/min/default values per
hypervisor?
3. Does the snapshot job queue retry jobs in case of failure? If yes, can you
please share how it works?
4. Does this concurrency threshold apply to the Backup Snapshot command as well?
5. Does this change impact performance? If yes, in what ways?
6. What is the expected behaviour for scheduled snapshot jobs waiting in the
queue in the cases below?
a) The VM state changed from Running to Stopped, or from Running to Destroyed.
b) The corresponding host went into maintenance mode in a cluster having more
than 2 hosts.
7. Is the time for job.expire.min counted from when we initiate the
createSnapshot command, or from when the waiting job is dequeued from the
sync_queue_item table?
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Snapshot+improvements+FS
Regards
Sadhu
-----Original Message-----
From: Alena Prokharchyk [mailto:[email protected]]
Sent: 10 October 2012 00:21
To: [email protected]
Subject: FS on cloudStack createSnapshot synchronization improvement
Hi All,
I'm planning to introduce some changes to the create snapshot behavior for a
future cloudStack release (the changes will go to the asf/master branch). The
change addresses the problem described below:
"With the current code for snapshots, cloudStack always creates the snapshot on
the host where the vm is Running (for vms in Running state) or on the host where
the vm last ran (for vms in Stopped state). As the createSnapshot commands are
not synchronized on the agent side, sending multiple commands to the backend at
the same time can lead to performance issues on the hypervisor side. As a
result, there is a high possibility that the createSnapshot command will time
out on the Xen side.
The solution is to limit the number of concurrent snapshots on a per-host basis.
The threshold should be configurable, as the customer usually knows how many
snapshots at a time the backend can handle.
While the concurrent snapshots are being processed by the backend, all
subsequent snapshot commands scheduled for execution on the same host should
wait in the queue."
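The queuing behaviour described above can be sketched roughly as follows. This
is only an illustration under stated assumptions, not the code from the FS: the
class HostSnapshotQueue, its submit/complete methods, and the
maxConcurrentPerHost parameter are all hypothetical names, and the real
implementation may persist its queue (for example in a database table) rather
than keep it in memory.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of per-host snapshot throttling; not CloudStack code.
public class HostSnapshotQueue {
    // Assumed configurable threshold: max snapshots running at once on one host.
    private final int maxConcurrentPerHost;
    // Number of snapshot jobs currently running, keyed by host id.
    private final Map<Long, Integer> running = new HashMap<>();
    // FIFO queue of waiting job ids, keyed by host id.
    private final Map<Long, Deque<String>> waiting = new HashMap<>();

    public HostSnapshotQueue(int maxConcurrentPerHost) {
        if (maxConcurrentPerHost < 1) {
            throw new IllegalArgumentException("threshold must be at least 1");
        }
        this.maxConcurrentPerHost = maxConcurrentPerHost;
    }

    // Returns true if the job may start now, false if it was put in the queue.
    public synchronized boolean submit(long hostId, String jobId) {
        int active = running.getOrDefault(hostId, 0);
        if (active < maxConcurrentPerHost) {
            running.put(hostId, active + 1);
            return true;
        }
        waiting.computeIfAbsent(hostId, k -> new ArrayDeque<>()).addLast(jobId);
        return false;
    }

    // Marks one running job on the host as finished. Returns the id of the
    // queued job that takes over the freed slot, or null if nothing is waiting.
    public synchronized String complete(long hostId) {
        int active = running.getOrDefault(hostId, 0);
        if (active == 0) {
            throw new IllegalStateException("no running snapshot on host " + hostId);
        }
        Deque<String> queue = waiting.get(hostId);
        if (queue != null && !queue.isEmpty()) {
            // The slot is handed straight to the next job, so the count is unchanged.
            return queue.pollFirst();
        }
        running.put(hostId, active - 1);
        return null;
    }
}
```

With a threshold of 2, the first two submit calls for a host return true, a
third returns false and waits, and the next complete call for that host returns
the waiting job id. Other hosts are unaffected, since the counters and queues
are kept per host.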
Here is the feature FS, available for review:
https://cwiki.apache.org/confluence/display/CLOUDSTACK/Snapshot+improvements+FS
If you have any comments/suggestions/questions on the implementation, please
let me know.
-Alena.