I think in our case we’d only migrate between cells if we know the network and 
storage are accessible, and we’d never do it if they aren’t. 
I’m thinking of moves from old to new hardware at a cell level.

If the storage and network aren’t available, ideally it would fail at the API request.
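
Something like this rough pre-flight check sketches the idea; the cell 
metadata and helper names are made up purely to show where the request could 
be rejected:

    # Sketch only: the data structures and helpers are illustrative, not nova code.
    def storage_reachable(instance_backend, target_cell_backends):
        # e.g. a Ceph cluster or Cinder backend shared by both cells
        return instance_backend in target_cell_backends

    def networks_reachable(instance_segments, target_cell_segments):
        # e.g. routed network segments that span both cells
        return set(instance_segments) <= set(target_cell_segments)

    def precheck_cross_cell_migration(instance, target_cell):
        if not storage_reachable(instance['storage_backend'],
                                 target_cell['storage_backends']):
            raise RuntimeError('storage not reachable from target cell')
        if not networks_reachable(instance['network_segments'],
                                  target_cell['network_segments']):
            raise RuntimeError('network not reachable from target cell')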

There are also Ceph-backed instances to take into account, which nova would be 
responsible for.

I’ll be in Denver so we can discuss more there too.

Cheers,
Sam





> On 23 Aug 2018, at 11:23 am, Matt Riedemann <mriede...@gmail.com> wrote:
> 
> Hi everyone,
> 
> I have started an etherpad for cells topics at the Stein PTG [1]. The main 
> issue in there right now is dealing with cross-cell cold migration in nova.
> 
> At a high level, I am going off these requirements:
> 
> * Cells can shard across flavors (and hardware type) so operators would like 
> to move users off the old flavors/hardware (old cell) to new flavors in a new 
> cell.
> 
> * There is network isolation between compute hosts in different cells, so no 
> ssh'ing the disk around like we do today. But the image service is global to 
> all cells.
> 
> Based on this, for the initial support for cross-cell cold migration, I am 
> proposing that we leverage something like shelve offload/unshelve 
> masquerading as resize. We shelve offload from the source cell and unshelve 
> in the target cell. This should work for both volume-backed and 
> non-volume-backed servers (we use snapshots for shelved offloaded 
> non-volume-backed servers).
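> 
> A rough sketch of that sequence with python-novaclient (the step that pins 
> the unshelve to the target cell is the missing piece, so treat that line as 
> a placeholder rather than an existing API):
> 
>     # Sketch only: endpoint/credentials are placeholders, and today
>     # unshelve just goes back through the normal scheduler rather than
>     # being targeted at a specific cell.
>     from keystoneauth1 import loading, session
>     from novaclient import client
> 
>     loader = loading.get_plugin_loader('password')
>     auth = loader.load_from_options(
>         auth_url='https://keystone.example.com/v3', username='admin',
>         password='secret', project_name='admin',
>         user_domain_name='Default', project_domain_name='Default')
>     nova = client.Client('2.1', session=session.Session(auth=auth))
> 
>     server = nova.servers.get('SERVER_UUID')
>     nova.servers.shelve(server)          # snapshots non-volume-backed servers
>     nova.servers.shelve_offload(server)  # frees the source compute host
>     # ... constrain scheduling to the target cell (the part to be designed) ...
>     nova.servers.unshelve(server)        # rebuilds the guest on a new host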
> 
> There are, of course, some complications. The main ones that I need help with 
> right now are what happens with volumes and ports attached to the server. 
> Today we detach from the source and attach at the target, but that's assuming 
> the storage backend and network are available to both hosts involved in the 
> move of the server. Will that be the case across cells? I am assuming that 
> depends on the network topology (are routed networks being used?) and storage 
> backend (routed storage?). If the network and/or storage backend are not 
> available across cells, how do we migrate volumes and ports? Cinder has a 
> volume migrate API for admins but I do not know how nova would know the 
> proper affinity per-cell to migrate the volume to the proper host (cinder 
> does not have a routed storage concept like routed provider networks in 
> neutron, correct?). And as far as I know, there is no such thing as port 
> migration in Neutron.
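> 
> For reference, the admin-side Cinder call looks roughly like this with 
> python-cinderclient; the destination host@backend#pool string is a made-up 
> example, and knowing which backend is "local" to the target cell is exactly 
> the affinity problem above:
> 
>     # Sketch only: credentials and the destination backend are placeholders.
>     from keystoneauth1 import loading, session
>     from cinderclient import client
> 
>     loader = loading.get_plugin_loader('password')
>     auth = loader.load_from_options(
>         auth_url='https://keystone.example.com/v3', username='admin',
>         password='secret', project_name='admin',
>         user_domain_name='Default', project_domain_name='Default')
>     cinder = client.Client('3', session=session.Session(auth=auth))
> 
>     vol = cinder.volumes.get('VOLUME_UUID')
>     cinder.volumes.migrate_volume(vol, 'cell2-storage@rbd#pool1',
>                                   force_host_copy=False, lock_volume=True)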
> 
> Could Placement help with the volume/port migration stuff? Neutron routed 
> provider networks rely on placement aggregates to schedule the VM to a 
> compute host in the same network segment as the port used to create the VM; 
> however, if that segment does not span cells we are kind of stuck, correct?
> 
> To summarize the issues as I see them (today):
> 
> * How to deal with the targeted cell during scheduling? This is so we can 
> even get out of the source cell in nova.
> 
> * How does the API deal with the same instance being in two DBs at the same 
> time during the move?
> 
> * How to handle revert resize?
> 
> * How are volumes and ports handled?
> 
> I can get feedback from my company's operators based on what their deployment 
> will look like for this, but that does not mean it will work for others, so I 
> need as much feedback from operators, especially those running with multiple 
> cells today, as possible. Thanks in advance.
> 
> [1] https://etherpad.openstack.org/p/nova-ptg-stein-cells
> 
> -- 
> 
> Thanks,
> 
> Matt

