On Wed, 22 Aug 2018 at 10:47, Gorka Eguileor <gegui...@redhat.com> wrote:
>
> On 20/08, Matthew Booth wrote:
> > For those who aren't familiar with it, nova's volume-update (also
> > called swap volume by nova devs) is the nova part of the
> > implementation of cinder's live migration (also called retype).
> > Volume-update is essentially an internal cinder<->nova api, but as
> > that's not a thing it's also unfortunately exposed to users. Some
> > users have found it and are using it, but because it's essentially an
> > internal cinder<->nova api it breaks pretty easily if you don't treat
> > it like a special snowflake. It looks like we've finally found a way
> > it's broken for non-cinder callers that we can't fix, even with a
> > dirty hack.
> >
> > volume-update <server> <old> <new> essentially does a live copy of the
> > data on <old> volume to <new> volume, then seamlessly swaps the
> > attachment to <server> from <old> to <new>. The guest OS on <server>
> > will not notice anything at all as the hypervisor swaps the storage
> > backing an attached volume underneath it.
> >
> > When called by cinder, as intended, cinder does some post-operation
> > cleanup such that <old> is deleted and <new> inherits the same
> > volume_id; that is, <old> effectively becomes <new>. When called any
> > other way, however, this cleanup doesn't happen, which breaks a bunch
> > of assumptions. One of these is that a disk's serial number is the
> > same as the attached volume_id. Disk serial number, in KVM at least,
> > is immutable, so can't be updated during volume-update. This is fine
> > if we were called via cinder, because the cinder cleanup means the
> > volume_id stays the same. If called any other way, however, they no
> > longer match, at least until a hard reboot when it will be reset to
> > the new volume_id. It turns out this breaks live migration, but
> > probably other things too. We can't think of a workaround.
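The broken invariant above can be shown with a minimal toy simulation. All the
structures and function names here are hypothetical illustrations, not the real
nova/cinder code paths:

```python
# Toy model: an attachment starts with serial == volume_id; the serial
# is immutable (as KVM disk serials are), so a swap updates only the
# volume_id unless cinder's post-operation cleanup runs.

def swap_volume(attachment, new_volume_id):
    """Repoint the attachment at a new volume.

    The hypervisor copies the data and swaps the backing storage, but
    the disk serial keeps its original value.
    """
    attachment["volume_id"] = new_volume_id
    # attachment["serial"] is deliberately NOT touched: it is immutable.
    return attachment

def cinder_cleanup(attachment, old_volume_id):
    """Cinder's post-operation cleanup: <new> inherits <old>'s volume_id."""
    attachment["volume_id"] = old_volume_id
    return attachment

# Attachment created for volume "vol-old": the invariant holds.
att = {"volume_id": "vol-old", "serial": "vol-old"}

# Called directly (no cinder cleanup): serial and volume_id diverge,
# which is the breakage described above.
swap_volume(att, "vol-new")
assert att["serial"] != att["volume_id"]

# Called via cinder: the cleanup restores the invariant.
cinder_cleanup(att, "vol-old")
assert att["serial"] == att["volume_id"]
```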
> >
> > I wondered why users would want to do this anyway. It turns out that
> > sometimes cinder won't let you migrate a volume, but nova
> > volume-update doesn't do those checks (as they're specific to cinder
> > internals, none of nova's business, and duplicating them would be
> > fragile, so we're not adding them!). Specifically we know that cinder
> > won't let you migrate a volume with snapshots. There may be other
> > reasons. If cinder won't let you migrate your volume, you can still
> > move your data by using nova's volume-update, even though you'll end
> > up with a new volume on the destination, and a slightly broken
> > instance. Apparently the former is a trade-off worth making, but the
> > latter has been reported as a bug.
> >
>
> Hi Matt,
>
> As you know, I'm in favor of making this REST API call only authorized
> for Cinder, to avoid messing up the cloud.
>
> I know you wanted Cinder to have a solution for doing live migrations of
> volumes with snapshots, and while this is not possible to do in a
> reasonable fashion, I kept thinking about it given your strong feelings
> about providing a solution for users who really need this, and I think we
> may have a "reasonable" compromise.
>
> The solution is conceptually simple. We add a new API microversion in
> Cinder that adds an optional parameter called "generic_keep_source"
> (defaults to False) to both the migrate and retype operations.
>
> This means that if the driver-optimized migration cannot do the
> migration and the generic migration code is the one doing the migration,
> then, instead of our final step being to swap the volume ids and
> delete the source volume, what we would do is to swap the volume ids
> and move all the snapshots to reference the new volume. Then we would
> create a user message with the new ID of the volume.
>
> This way we can preserve the old volume with all its snapshots and do
> the live migration.
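The proposed final step could be sketched roughly like this. This is only a
hypothetical simulation of the idea as described (swap the ids, keep the source,
repoint its snapshots, emit a user message); none of these names or structures
are real cinder code:

```python
# Hypothetical sketch of the proposed final step of cinder's generic
# migration path with the suggested "generic_keep_source" flag.
# Volumes and snapshots are modelled as plain dicts for illustration.

def finish_generic_migration(source, dest, snapshots,
                             generic_keep_source=False):
    """Complete a generic (non-driver-optimized) volume migration.

    Current behaviour: swap the volume ids and delete the source.
    Proposed behaviour (generic_keep_source=True): swap the ids, keep
    the source volume, repoint its snapshots at its new id, and create
    a user message carrying that new id.
    """
    messages = []
    # The migrated data (dest) takes over the source's original id.
    source["id"], dest["id"] = dest["id"], source["id"]
    if generic_keep_source:
        # Preserve the old volume; its snapshots follow it to its new id.
        for snap in snapshots:
            snap["volume_id"] = source["id"]
        messages.append("source volume kept with new id %s" % source["id"])
    else:
        source["status"] = "deleted"
    return messages
```

With the flag set, the caller ends up with the migrated volume under the
original id, plus the old volume (and all its snapshots) under a new id
announced in the user message.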
>
> The implementation is a little bit tricky, as we'll have to add a new
> "update_migrated_volume" mechanism to support the renaming of both
> volumes, since the old one wouldn't work with this, among other things,
> but it's doable.
>
> Unfortunately I don't have the time right now to work on this...
Sounds promising, and honestly more than I'd have hoped for.

Matt
--
Matthew Booth
Red Hat OpenStack Engineer, Compute DFG

Phone: +442070094448 (UK)

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev