----- Original Message -----
> From: "Simon Grinberg" <[email protected]>
> To: "Mark Wu" <[email protected]>, "Doron Fediuck" <[email protected]>
> Cc: "Orit Wasserman" <[email protected]>, "Laine Stump" <[email protected]>, "Yuval M" <[email protected]>, "Limor Gavish" <[email protected]>, [email protected], "Dan Kenigsberg" <[email protected]>
> Sent: Thursday, January 10, 2013 10:38:56 AM
> Subject: Re: feature suggestion: migration network
>
> ----- Original Message -----
> > From: "Mark Wu" <[email protected]>
> > To: "Dan Kenigsberg" <[email protected]>
> > Cc: "Simon Grinberg" <[email protected]>, "Orit Wasserman" <[email protected]>, "Laine Stump" <[email protected]>, "Yuval M" <[email protected]>, "Limor Gavish" <[email protected]>, [email protected]
> > Sent: Thursday, January 10, 2013 5:13:23 AM
> > Subject: Re: feature suggestion: migration network
> >
> > On 01/09/2013 03:34 AM, Dan Kenigsberg wrote:
> > > On Tue, Jan 08, 2013 at 01:23:02PM -0500, Simon Grinberg wrote:
> > >>
> > >> ----- Original Message -----
> > >>> From: "Yaniv Kaul" <[email protected]>
> > >>> To: "Dan Kenigsberg" <[email protected]>
> > >>> Cc: "Limor Gavish" <[email protected]>, "Yuval M" <[email protected]>, [email protected], "Simon Grinberg" <[email protected]>
> > >>> Sent: Tuesday, January 8, 2013 4:46:10 PM
> > >>> Subject: Re: feature suggestion: migration network
> > >>>
> > >>> On 08/01/13 15:04, Dan Kenigsberg wrote:
> > >>>> There's been talk about this for ages, so it's time to have a
> > >>>> proper discussion and a feature page about it: let us have a
> > >>>> "migration" network role, and use such networks to carry
> > >>>> migration data.
> > >>>>
> > >>>> When Engine requests to migrate a VM from one node to another,
> > >>>> the VM state (BIOS, IO devices, RAM) is transferred over a
> > >>>> TCP/IP connection that is opened from the source qemu process
> > >>>> to the destination qemu.
> > >>>> Currently, destination qemu listens for the incoming
> > >>>> connection on the management IP address of the destination
> > >>>> host. This has serious downsides: a "migration storm" may
> > >>>> choke the destination's management interface; migration is
> > >>>> plaintext, and ovirtmgmt includes Engine, which may sit in the
> > >>>> node cluster.
> > >>>>
> > >>>> With this feature, a cluster administrator may grant the
> > >>>> "migration" role to one of the cluster networks. Engine would
> > >>>> use that network's IP address on the destination host when it
> > >>>> requests a migration of a VM. With proper network setup,
> > >>>> migration data would be separated to that network.
> > >>>>
> > >>>> === Benefit to oVirt ===
> > >>>> * Users would be able to define and dedicate a separate
> > >>>> network for migration. Users who need quick migration would
> > >>>> use nics with high bandwidth. Users who want to cap the
> > >>>> bandwidth consumed by migration could define a migration
> > >>>> network over nics with bandwidth limitation.
> > >>>> * Migration data can be limited to a separate network that has
> > >>>> no layer-2 access from Engine.
> > >>>>
> > >>>> === Vdsm ===
> > >>>> The "migrate" verb should be extended with an additional
> > >>>> parameter, specifying the address that the remote qemu process
> > >>>> should listen on.
A
> > >>>> new argument is to be added to the currently-defined migration
> > >>>> arguments:
> > >>>> * vmId: UUID
> > >>>> * dst: management address of destination host
> > >>>> * dstparams: hibernation volumes definition
> > >>>> * mode: migration/hibernation
> > >>>> * method: rotten legacy
> > >>>> * ''New'': migration uri, according to
> > >>>> http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
> > >>>> such as tcp://<ip of migration network on remote node>
> > >>>>
> > >>>> === Engine ===
> > >>>> As usual, complexity lies here, and several changes are
> > >>>> required:
> > >>>>
> > >>>> 1. Network definition.
> > >>>> 1.1 A new network role - not unlike "display network" - should
> > >>>> be added. Only one migration network should be defined on a
> > >>>> cluster.
> > >> We are considering multiple display networks already, so why not
> > >> the same for migration?
> > > What is the motivation for having multiple migration networks?
> > > Extending the bandwidth (and thus, any network can be taken when
> > > needed), or data separation (and thus, a migration network should
> > > be assigned to each VM in the cluster)? Or another motivation
> > > with consequences?
> > My suggestion is making the migration network role determined
> > dynamically on each migration. If we only define one migration
> > network per cluster, a migration storm could hit that network,
> > which could have a bad impact on VM applications. So I think the
> > engine could choose the network which has the lower traffic load
> > for migration, or leave the choice to the user.
>
> Dynamic migration network selection is indeed desirable, but only
> from among migration networks - migration traffic is insecure, so
> it's undesirable to have it mixed with VM traffic unless permitted
> by the admin by marking that network as a migration network.
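[Editor's note: to make the extended "migrate" verb concrete, here is a minimal sketch of how its argument dict might be assembled. This is hypothetical illustration, not actual VDSM code; the function name `build_migrate_params` and all parameter names other than those listed in the thread (vmId, dst, dstparams, mode, method, miguri) are invented for the example.]

```python
# Hypothetical sketch: assembling the extended "migrate" verb arguments,
# assuming Engine already knows the destination host's IP on the
# migration network. Not actual VDSM code.

def build_migrate_params(vm_id, dst_mgmt_addr, migration_ip=None):
    """Build the argument dict for the extended migrate verb.

    When a migration network is defined, the new 'miguri' argument is
    added, in the URI form accepted by virDomainMigrateToURI2,
    e.g. tcp://<ip of migration network on remote node>.
    """
    params = {
        "vmId": vm_id,           # UUID of the migrating VM
        "dst": dst_mgmt_addr,    # management address of destination host
        "mode": "migration",     # migration/hibernation
        "method": "online",      # the "rotten legacy" field
    }
    if migration_ip is not None:
        # New argument: destination qemu listens on the migration
        # network's address instead of the management IP.
        params["miguri"] = "tcp://%s" % migration_ip
    return params
```

When no migration network is defined, `miguri` is simply omitted and the legacy "listen on the management IP" behavior applies, matching point 1.2 below.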
>
> To clarify what I meant in the previous response to Livnat - when I
> said "...if the customer, due to the unsymmetrical nature of most
> bonding modes, prefers to use multiple networks for migration and
> will ask us to optimize migration across these..."
>
> But the dynamic selection should be based on SLA, of which the above
> is just a part:
> 1. Need to consider tenant traffic segregation rules = security
> 2. SLA contracts
>
> If you keep #2, migration storm mitigation is granted. But you are
> right that another feature required for #2 above is to control the
> migration bandwidth (BW) per migration. We had a discussion in the
> past about VDSM doing a dynamic calculation based on f(Line Speed,
> Max Migration BW, Max allowed per VM, Free BW, number of migrating
> machines) when starting a migration. (I actually wanted to do so
> years ago, but never got to it - one of those things you always
> postpone until you find the time.) We did not think that the engine
> should provide some of these parameters, but coming to think of it,
> you are right and it makes sense. For SLA, Max per VM + Min
> guaranteed should be provided by the engine to maintain the SLA. And
> it's up to the engine to ensure that (Min guaranteed x number of
> concurrent migrations) does not exceed Max Migration BW.
>
> Dan, this is way too much for an initial implementation, but don't
> you think we should at least add placeholders in the migration API?
> Maybe Doron can assist with the required verbs.
>
> (P.S. I don't want to alarm anyone, but we may need SLA parameters
> for setupNetworks as well :) unless we want these as a separate API,
> though that means more calls during setup.)
>
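[Editor's note: the f(...) calculation and the engine-side SLA check described above can be sketched as below. This is a hypothetical illustration of the idea under discussion, not actual VDSM or Engine code; all function and parameter names are invented, and the units (e.g. Mbps) are assumed.]

```python
# Hypothetical sketch of the dynamic per-migration bandwidth function
# f(Line Speed, Max Migration BW, Max allowed per VM, Free BW, number
# of migrating machines) and the engine-side Min-guaranteed check.
# Illustrative only; names and units (Mbps) are assumptions.

def migration_bandwidth(line_speed, max_migration_bw, max_per_vm,
                        free_bw, n_migrating):
    """Return the bandwidth cap for one migration.

    Share the cluster-wide migration budget among concurrent
    migrations, never exceeding the per-VM cap, the currently free
    bandwidth, or the physical line speed.
    """
    n = max(n_migrating, 1)
    return min(max_per_vm, max_migration_bw / n, free_bw, line_speed)

def can_start_migration(min_guaranteed, n_concurrent, max_migration_bw):
    """Engine-side SLA check: concurrent migrations at their minimum
    guaranteed bandwidth must not exceed the total migration budget."""
    return min_guaranteed * n_concurrent <= max_migration_bw
```

For example, with a 4000 Mbps migration budget, a 1000 Mbps per-VM cap, and 8 concurrent migrations, each migration would be capped at 500 Mbps; a 9th migration with a 500 Mbps minimum guarantee would be refused.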
As with other resources, the bare minimum is usually MIN capacity and
MAX, to avoid choking other tenants / VMs. In this context we may need
to consider other QoS elements (delays, etc.), but indeed those can be
an additional limitation on top of the basic one.

> > >>
> > >>>> 1.2 If none is defined, the legacy "use ovirtmgmt for
> > >>>> migration" behavior would apply.
> > >>>> 1.3 A migration network is more likely to be a ''required''
> > >>>> network, but a user may opt for non-required. He may face
> > >>>> unpleasant surprises if he wants to migrate his machine, but
> > >>>> no candidate host has the network available.
> > >> I think the enforcement should be at least one migration network
> > >> per host -> in the case we support more than one.
> > >> Else, always required.
> > > Fine by me - if we keep the backward behavior of ovirtmgmt being
> > > a migration network by default. I think that the worst case is
> > > that the user finds out - at the least convenient moment - that
> > > oVirt 3.3 would not migrate his VMs without explicitly assigning
> > > the "migration" role.
> > >
> > >>>> 1.4 The "migration" role can be granted or taken on-the-fly,
> > >>>> when hosts are active, as long as there are no
> > >>>> currently-migrating VMs.
> > >>>>
> > >>>> 2. Scheduler
> > >>>> 2.1 When deciding which host should be used for automatic
> > >>>> migration, take into account the existence and availability
> > >>>> of the migration network on the destination host.
> > >>>> 2.2 For manual migration, let the user migrate a VM to a host
> > >>>> with no migration network - if the admin wants to keep
> > >>>> jamming the management network with migration traffic, let
> > >>>> her.
> > >> Since you send the migration network per migration command, why
> > >> not allow choosing any network on the host, same as you allow
> > >> choosing the host?
If
> > >> the host is not selected, then allow choosing from the cluster's
> > >> networks. The default should be the cluster's migration network.
> > > Cool. Added to the wiki page.
> > >
> > >> If you allow the above, we can waive the enforcement of a
> > >> migration network per host. No migration network == no automatic
> > >> migration to/from this host.
> > > Again, I'd prefer to keep the current default status of ovirtmgmt
> > > as a migration network. Besides that, +1.
> > >
> > >>
> > >>>> 3. VdsBroker migration verb.
> > >>>> 3.1 For a modern cluster level, with a migration network
> > >>>> defined on the destination host, an additional ''miguri''
> > >>>> parameter should be added to the "migrate" command.
> > >>>>
> > >>>> _______________________________________________
> > >>>> Arch mailing list
> > >>>> [email protected]
> > >>>> http://lists.ovirt.org/mailman/listinfo/arch
> > >>> How is the authentication of the peers handled? Do we need a
> > >>> cert per each source/destination logical interface?
> > > I hope Orit or Laine correct me, but I am not aware of any
> > > authentication scheme that protects a non-tunneled qemu
> > > destination from an evil process with network access to the host.
> > >
> > > Dan.
> >
>
_______________________________________________
Arch mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/arch
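[Editor's note: the scheduler behavior discussed in points 2.1 and 2.2 can be sketched as a simple host filter. This is a hypothetical illustration, not actual Engine code; the `Host` structure and function names are invented for the example.]

```python
# Hypothetical sketch of the scheduler changes (2.1/2.2): automatic
# migration only targets hosts where the migration network is
# available, while manual migration may pick any host. Illustrative
# only; Host and candidate_hosts are invented names.

from dataclasses import dataclass

@dataclass
class Host:
    name: str
    networks: frozenset  # networks attached and operational on this host

def candidate_hosts(hosts, migration_network, manual=False):
    """Return hosts eligible as a migration destination.

    2.1: automatic migration requires the migration network on the
         destination host.
    2.2: manual migration allows any host - if the admin wants to keep
         jamming the management network, let her.
    Legacy: if no migration network is defined, ovirtmgmt carries the
    traffic and every host qualifies.
    """
    if manual or migration_network is None:
        return list(hosts)
    return [h for h in hosts if migration_network in h.networks]
```

This also captures the "no migration network == no automatic migration to/from this host" rule proposed above: such a host simply never appears in the automatic candidate list.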
