On Thu, Jan 10, 2013 at 04:43:45AM -0500, Doron Fediuck wrote:
> ----- Original Message -----
> > From: "Simon Grinberg" <[email protected]>
> > To: "Mark Wu" <[email protected]>, "Doron Fediuck" <[email protected]>
> > Cc: "Orit Wasserman" <[email protected]>, "Laine Stump" <[email protected]>, "Yuval M" <[email protected]>, "Limor Gavish" <[email protected]>, [email protected], "Dan Kenigsberg" <[email protected]>
> > Sent: Thursday, January 10, 2013 10:38:56 AM
> > Subject: Re: feature suggestion: migration network
> >
> > ----- Original Message -----
> > > From: "Mark Wu" <[email protected]>
> > > To: "Dan Kenigsberg" <[email protected]>
> > > Cc: "Simon Grinberg" <[email protected]>, "Orit Wasserman" <[email protected]>, "Laine Stump" <[email protected]>, "Yuval M" <[email protected]>, "Limor Gavish" <[email protected]>, [email protected]
> > > Sent: Thursday, January 10, 2013 5:13:23 AM
> > > Subject: Re: feature suggestion: migration network
> > >
> > > On 01/09/2013 03:34 AM, Dan Kenigsberg wrote:
> > > > On Tue, Jan 08, 2013 at 01:23:02PM -0500, Simon Grinberg wrote:
> > > >>
> > > >> ----- Original Message -----
> > > >>> From: "Yaniv Kaul" <[email protected]>
> > > >>> To: "Dan Kenigsberg" <[email protected]>
> > > >>> Cc: "Limor Gavish" <[email protected]>, "Yuval M" <[email protected]>, [email protected], "Simon Grinberg" <[email protected]>
> > > >>> Sent: Tuesday, January 8, 2013 4:46:10 PM
> > > >>> Subject: Re: feature suggestion: migration network
> > > >>>
> > > >>> On 08/01/13 15:04, Dan Kenigsberg wrote:
> > > >>>> There's been talk about this for ages, so it's time to have a
> > > >>>> proper discussion and a feature page about it: let us have a
> > > >>>> "migration" network role, and use such networks to carry
> > > >>>> migration data.
> > > >>>>
> > > >>>> When Engine requests to migrate a VM from one node to another,
> > > >>>> the VM state (BIOS,
> > > >>>> IO devices, RAM) is transferred over a TCP/IP connection that
> > > >>>> is opened from the source qemu process to the destination qemu.
> > > >>>> Currently, the destination qemu listens for the incoming
> > > >>>> connection on the management IP address of the destination
> > > >>>> host. This has serious downsides: a "migration storm" may choke
> > > >>>> the destination's management interface; migration is plaintext,
> > > >>>> and ovirtmgmt includes Engine, which may sit in the node
> > > >>>> cluster.
> > > >>>>
> > > >>>> With this feature, a cluster administrator may grant the
> > > >>>> "migration" role to one of the cluster networks. Engine would
> > > >>>> use that network's IP address on the destination host when it
> > > >>>> requests a migration of a VM. With proper network setup,
> > > >>>> migration data would be separated onto that network.
> > > >>>>
> > > >>>> === Benefit to oVirt ===
> > > >>>> * Users would be able to define and dedicate a separate network
> > > >>>> for migration. Users that need quick migration would use nics
> > > >>>> with high bandwidth. Users who want to cap the bandwidth
> > > >>>> consumed by migration could define a migration network over
> > > >>>> nics with bandwidth limitation.
> > > >>>> * Migration data can be limited to a separate network that has
> > > >>>> no layer-2 access from Engine.
> > > >>>>
> > > >>>> === Vdsm ===
> > > >>>> The "migrate" verb should be extended with an additional
> > > >>>> parameter, specifying the address that the remote qemu process
> > > >>>> should listen on.
> > > >>>> A new argument is to be added to the currently-defined
> > > >>>> migration arguments:
> > > >>>> * vmId: UUID
> > > >>>> * dst: management address of destination host
> > > >>>> * dstparams: hibernation volumes definition
> > > >>>> * mode: migration/hibernation
> > > >>>> * method: rotten legacy
> > > >>>> * ''New'': migration uri, according to
> > > >>>> http://libvirt.org/html/libvirt-libvirt.html#virDomainMigrateToURI2
> > > >>>> such as tcp://<ip of migration network on remote node>
> > > >>>>
> > > >>>> === Engine ===
> > > >>>> As usual, complexity lies here, and several changes are
> > > >>>> required:
> > > >>>>
> > > >>>> 1. Network definition.
> > > >>>> 1.1 A new network role - not unlike "display network" - should
> > > >>>> be added. Only one migration network should be defined on a
> > > >>>> cluster.
> > > >> We are considering multiple display networks already, so why not
> > > >> the same for migration?
> > > > What is the motivation for having multiple migration networks?
> > > > Extending the bandwidth (and thus, any network can be taken when
> > > > needed), or data separation (and thus, a migration network should
> > > > be assigned to each VM in the cluster)? Or another motivation with
> > > > consequences?
> > > My suggestion is making the migration network role determined
> > > dynamically on each migration. If we only define one migration
> > > network per cluster, the migration storm could happen on that
> > > network. It could cause some bad impact on VM applications. So I
> > > think Engine could choose the network which has the lower traffic
> > > load for migration, or leave the choice to the user.
> >
> > Dynamic migration network selection is indeed desirable, but only
> > from among migration networks - migration traffic is insecure, so
> > it's undesirable to have it mixed with VM traffic unless permitted by
> > the admin by marking the network as a migration network.
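The extended "migrate" verb described above could be sketched roughly as
follows. This is only an illustration of the proposal, not the actual
Vdsm API: the helper name build_migrate_params and the "miguri" key are
assumptions; only the listed parameter names (vmId, dst, mode, method)
and the tcp:// URI form come from the thread.

```python
# Illustrative sketch of the proposed extended migrate parameters.
# build_migrate_params and the "miguri" key are hypothetical names.

def build_migrate_params(vm_id, dst_mgmt_addr, migration_net_ip=None,
                         mode="migration", method="online"):
    """Build a parameter dict for a migrate request.

    If the destination host has an IP on the migration network, pass it
    as a libvirt-style migration URI so the destination qemu listens
    there instead of on the management address.
    """
    params = {
        "vmId": vm_id,           # UUID of the VM to migrate
        "dst": dst_mgmt_addr,    # management address of destination host
        "mode": mode,            # migration/hibernation
        "method": method,        # legacy field kept for compatibility
    }
    if migration_net_ip is not None:
        # New argument: migration URI, per virDomainMigrateToURI2
        params["miguri"] = f"tcp://{migration_net_ip}"
    return params

params = build_migrate_params(
    "9b2f2c6e-0000-0000-0000-000000000000",
    "192.0.2.10",
    migration_net_ip="10.10.10.2")
print(params["miguri"])  # tcp://10.10.10.2
```

When no migration network is configured, the new key is simply omitted
and the destination falls back to its management address, preserving the
current behaviour.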
> >
> > To clarify what I meant in the previous response to Livnat - when I
> > said "...if the customer, due to the asymmetrical nature of most
> > bonding modes, prefers to use multiple networks for migration and
> > will ask us to optimize migration across these..."
> >
> > But the dynamic selection should be based on SLA, of which the above
> > is just one part:
> > 1. Need to consider tenant traffic segregation rules = security
> > 2. SLA contracts
We could devise complex logic that assigns each VM a pool of applicable
migration networks, from which Engine chooses one upon migration
startup. I am, however, not at all sure that extending the migration
bandwidth by means of multiple migration networks is worth the design
hassle and the GUI noise. A simpler solution would be to build a single
migration network on top of a fat bond, tweaked by a fine-tuned SLA.

> >
> > If you keep 2, migration storm mitigation is granted. But you are
> > right that another feature required for #2 above is to control the
> > migration bandwidth (BW) per migration. We had a discussion in the
> > past about VDSM doing a dynamic calculation based on f(Line Speed,
> > Max Migration BW, Max allowed per VM, Free BW, number of migrating
> > machines) when starting a migration. (I actually wanted to do this
> > years ago, but never got to it - one of those things you always
> > postpone until you find the time.) We did not think that the engine
> > should provide some of these parameters, but coming to think of it,
> > you are right and it makes sense. For SLA, Max per VM + Min
> > guaranteed should be provided by the engine to maintain SLA. And it's
> > up to the engine to ensure that the number of concurrent migrations
> > of VMs with Min-Guaranteed bandwidth does not exceed Max Migration
> > BW.
> >
> > Dan, this is way too much for an initial implementation, but don't
> > you think we should at least add placeholders in the migration API?

In my opinion this should wait for another feature. For each VM, I'd
like to see a means to define the SLA of each of its vNICs. When we have
that, we should similarly define how much bandwidth it has for
migration.

> > Maybe Doron can assist with the required verbs.
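The per-migration bandwidth calculation mentioned above - f(Line Speed,
Max Migration BW, Max allowed per VM, Free BW, number of migrating
machines) - could be sketched as below. The inputs are the ones named in
the thread; the concrete formula (take the tightest pool limit, split it
among concurrent migrations, cap per VM) is an assumption for
illustration, not an agreed design.

```python
# Hypothetical sketch of the dynamic bandwidth calculation discussed
# above. The combination rule is an assumption; only the input factors
# come from the thread. All values are in Mbps.

def migration_bandwidth_mbps(line_speed, max_migration_bw,
                             max_per_vm, free_bw, n_migrations):
    """Return the bandwidth to grant a single outgoing migration."""
    if n_migrations < 1:
        n_migrations = 1
    # Never exceed what the link, the global migration cap, or the
    # currently free bandwidth allow.
    pool = min(line_speed, max_migration_bw, free_bw)
    # Share the pool among concurrent migrations, respecting the
    # per-VM ceiling.
    return min(max_per_vm, pool / n_migrations)

# 10 Gb link, 4 Gb migration cap, 1 Gb per-VM cap, 6 Gb free,
# 5 concurrent migrations: each gets 4000/5 = 800 Mbps.
print(migration_bandwidth_mbps(10000, 4000, 1000, 6000, 5))  # 800.0
```

A Min-Guaranteed term, as suggested for SLA, would add a lower bound
here, and the engine would then have to refuse new migrations once the
sum of guarantees reaches Max Migration BW.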
> >
> > (P.S. I don't want to alarm anyone, but we may need SLA parameters
> > for setupNetworks as well :) unless we want these as a separate API,
> > though it means more calls during setup.)

Exactly - when we have a migration network concept, and when we have a
general network SLA definition, we could easily apply the latter to the
former.

> As with other resources, the bare minimum is usually MIN capacity and
> MAX, to avoid choking other tenants / VMs. In this context we may need
> to consider other QoS elements (delays, etc.), but indeed it can be an
> additional limitation on top of the basic one.

_______________________________________________
Arch mailing list
[email protected]
http://lists.ovirt.org/mailman/listinfo/arch
