Hi Preston,

Replies to some of your cinder-related questions:

1. Creating a snapshot isn't usually an I/O-intensive operation. Are you
   seeing an I/O spike or a CPU spike? If it's CPU load, I've sometimes seen
   the CPU usage of cinder-api spike - not sure why.
2. The 'dd' processes that you see are Cinder wiping the volumes during
   deletion. You can either disable this in cinder.conf, or you can use a
   relatively new option to manage the bandwidth used for this.
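For point 2, a rough sketch of what the cinder.conf settings might look like. The option names (volume_clear, volume_copy_bps_limit), their values, and the assumption that the bandwidth limit also applies to the wipe on delete are from my memory of the LVM driver options, so please check them against the configuration reference for your release before relying on them:

```ini
[DEFAULT]
# Assumed option: controls how a volume is wiped when it is deleted.
# "none" skips the wipe, so no 'dd' runs on delete.
volume_clear = none

# Alternative (assumption): keep the zero-fill wipe but cap its bandwidth.
# The value is assumed to be in bytes per second, with 0 meaning unlimited.
# The number below is an arbitrary example, not a recommendation.
# volume_clear = zero
# volume_copy_bps_limit = 104857600
```

Skipping the wipe is a trade-off: on a shared deployment it may leave old data readable from newly allocated space, so it is best suited to test setups like a DevStack VM.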
IMHO, deployments should be optimized to not do very long/intensive
management operations - for example, use backends with efficient snapshots,
use CoW operations wherever possible rather than copying full
volumes/images, disable wipe on delete, etc.

Thanks,
Avishay

On Sun, Oct 19, 2014 at 1:41 PM, Preston L. Bannister <pres...@bannister.us> wrote:

> OK, I am fairly new here (to OpenStack). Maybe I am missing something. Or
> not.
>
> Have a DevStack, running in a VM (VirtualBox), backed by a single flash
> drive (on my current generation MacBook). Could be I have something off in
> my setup.
>
> Testing nova backup - first the existing implementation, then my (much
> changed) replacement.
>
> Simple scripts for testing. Create images. Create instances (five). Run
> backup on all instances.
>
> Currently found in:
> https://github.com/dreadedhill-work/stack-backup/tree/master/backup-scripts
>
> First time I started backups of all (five) instances, load on the Devstack
> VM went insane, and all but one backup failed. Seems that all of the
> backups were performed immediately (or attempted), without any sort of
> queuing or load management. Huh. Well, maybe just the backup implementation
> is naive...
>
> I will write on this at greater length, but backup should interfere as
> little as possible with foreground processing. Overloading a host is
> entirely unacceptable.
>
> Replaced the backup implementation so it does proper queuing (among other
> things). Iterating forward - implementing and testing.
>
> Fired off snapshots on five Cinder volumes (attached to five instances).
> Again the load shot very high. Huh. Well, in a full-scale OpenStack setup,
> maybe storage can handle that much I/O more gracefully ... or not. Again,
> should taking snapshots interfere with foreground activity? I would say,
> most often not. Queuing and serializing snapshots would strictly limit the
> interference with foreground. Also, very high end storage can perform
> snapshots *very* quickly, so serialized snapshots will not be slow. My take
> is that the default behavior should be to queue and serialize all heavy I/O
> operations, with non-default allowances for limited concurrency.
>
> Cleaned up (which required reboot/unstack/stack and more). Tried again.
>
> Ran two test backups (which in the current iteration create Cinder volume
> snapshots). Asked Cinder to delete the snapshots. Again, very high load
> factors, and in "top" I can see two long-running "dd" processes. (Given I
> have a single disk, more than one "dd" is not good.)
>
> Running too many heavyweight operations against storage can lead to
> thrashing. Queuing can strictly limit that load, and ensure better and
> reliable performance. I am not seeing evidence of this thought in my
> OpenStack testing.
>
> So far it looks like there is no thought to managing the impact of disk
> intensive management operations. Am I missing something?
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev