On 21 July 2015 at 12:11, John Belamaric <jbelama...@infoblox.com> wrote:
> Wow, a lot to digest in these threads. If I can summarize my understanding of the two proposals. Let me know whether I get this right. There are a couple of problems that need to be solved:
>
> a. Scheduling based on host reachability to the segments

So actually this is something Assaf and I were debating on IRC, and I think it depends what you're aiming for.

Imagine you have connectivity for a 'network' to every host, but that connectivity only works if you get a host-specific address because the address range is different per host. This seems to be the use case we come back to. (There's a corner case of this where the network is not available on every host and that gets you different requirements, but for now, this.)

You *can* use the current mechanism: allocate address, schedule, run - providing your scheduler respects the address you've been allocated and puts you on a host that can reach this address. This is a silly approach. You can't tell when getting the address (for a port that is entirely disassociated from the VM it's going to be attached to, via Neutron when most of the scheduling constraints live in Nova) that the address is on a machine that can even run the VM.

You can delay address allocation - then the machine can be scheduled anywhere because the address it has is not a constraint. This saves any change to scheduling at all - normal scheduling rules apply, excepting the case where addresses are exhausted on that machine, and in that case we'd probably use the retry mechanism as a fallback to find a better place until someone works out it's not really just a *nova* scheduler.

> b. Floating IP functionality across the segments. I am not sure I am clear on this one but it sounds like you want the routers attached to the segments to advertise routes to the specific floating IPs. Presumably then they would do NAT or the instance would assign both the fixed IP and the floating IP to its interface?

That's the summary.
And I don't think anyone is clear on this and I also don't know that anyone has specifically requested this.

> In Proposal 1, (a) is solved by associating segments to the front network via a router - that association is used to provide a single hook into the existing API that limits the scope of segment selection to those associated with the front network. (b) is solved by tying the floating IP ranges to the same front network and managing the reachability with dynamic routing.
>
> In Proposal 2, (a) is solved by tagging each network with some meta-data that the IPAM system uses to make a selection.

The distinction is actually pretty small. The same backing data exists for the IPAM to use - the difference is only that in (1) it's there as a misuse of networks and in (2) it's not specified.

> This implies an IP allocation request that passes something other than a network/port to the IPAM subsystem.

This is where I started - there is nothing to pass when I run 'neutron port-create' except for a network and this is where address allocation happens today. We need a mechanism to defer address allocation and indicate that the port has no address right now.

> This is fine from the IPAM point of view but there is no corresponding API for this right now. To solve (b) either the IPAM system has to publish the routes

It needs to ensure there's enough information on the port that the network controller can push the routes, is the way I think of it.

> or the higher level management has to ALSO be aware of the mappings (rather than just IPAM).
>
> To throw some fuel on the fire, I would argue also that (a) is not sufficient and address availability needs to be considered as well (as described in [1]). Selecting a host based on reachability alone will fail when addresses are exhausted. Similarly, with (b) I think there needs to be consideration during association of a floating IP to the effect on routing.
> That is, rather than a huge number of host routes it would be ideal to allocate the floating IPs in blocks that can be associated with the backing networks (though we would want to be able to split these blocks as small as a /32 if necessary - but avoid it/optimize as much as possible).

Again - the scheduler is simplistic and nova-centric as things stand, and I think we all recognise this. The current fallbacks work, but they're fallbacks.

> In fact, I think that these proposals are more or less the same - it's just in #1 the meta-data used to tie the backing networks together is another network.

Yup.

> This allows it to fit in neatly with the existing APIs. You would still need to implement something prior to IPAM or within IPAM that would select the appropriate backing network.
>
> As a (gulp) third alternative, we should consider that the front network here is in essence a layer 3 domain, and we have modeled layer 3 domains as address scopes in Liberty. The user is essentially saying "give me an address that is routable in this scope" - they don't care which actual subnet it gets allocated on. This is conceptually more in-line with [2] - modeling L3 domain separately from the existing Neutron concept of a network being a broadcast domain.

Again, the issue is that when you ask for an address you tend to have quite a strong opinion of what that address should be if it's location-specific.

> Fundamentally, however we associate the segments together, this comes down to a scheduling problem.

It's not *solely* a scheduling problem, and that is my issue with this statement (Assaf has been saying the same).
You *can* solve this *exclusively* with scheduling (allocate the address up front, hope that the address has space for a VM with all its constraints met) - but that solution is horrible; or you can solve this largely with allocation where scheduling helps to deal with pool exhaustion, where it is mainly another sort of problem but scheduling plays a part.

> Nova needs to be able to incorporate data from Neutron in its scheduling decision. Rather than solving this with a single piece of meta-data like network_id as described in proposal 1, it probably makes more sense to build out the general concept of utilizing network data for nova scheduling. We could still model this as in #1, or using address scopes, or some arbitrary data as in #2. But the harder problem to solve is the scheduling, not how we tag these things to inform that scheduling.
>
> The optimization of routing for floating IPs is also a scheduling problem, though one that would require a lot more changes to how FIP are allocated and associated to solve.
>
> John
>
> [1] https://review.openstack.org/#/c/180803/
> [2] https://bugs.launchpad.net/neutron/+bug/1458890/comments/7
>
> On Jul 21, 2015, at 10:52 AM, Carl Baldwin <c...@ecbaldwin.net> wrote:
>
> On Jul 20, 2015 4:26 PM, "Ian Wells" <ijw.ubu...@cack.org.uk> wrote:
> >
> > There are two routed network models:
> >
> > - I give my VM an address that bears no relation to its location and ensure the routed fabric routes packets there - this is very much the routing protocol method for doing things where I have injected a route into the network and it needs to propagate. It's also pretty useless because there are too many host routes in any reasonable sized cloud.
> >
> > - I give my VM an address that is based on its location, which only becomes apparent at binding time.
> > This means that the semantics of a port changes - a port has no address of any meaning until binding, because its location is related to what it does - and it leaves open questions about what to do when you migrate.
> >
> > Now, you seem to generally be thinking in terms of the latter model, particularly since the provider network model you're talking about fits there. But then you say:
>
> Actually, both. For example, GoDaddy assigns each vm an ip from the location based address blocks and optionally one from the routed location agnostic ones. I would also like to assign router ports out of the location based blocks which could host floating ips from the other blocks.
>
> > On 20 July 2015 at 10:33, Carl Baldwin <c...@ecbaldwin.net> wrote:
> >>
> >> When creating a port, the binding information would be sent to the IPAM system and the system would choose an appropriate address block for the allocation.
>
> Implicit in both is a need to provide at least a hint at host binding. Or, delay address assignment until binding. I didn't mention it because my email was already long. This is something and discussed but applies equally to both proposals.
>
> > No, it wouldn't, because creating and binding a port are separate operations. I can't give the port a location-specific address on creation - not until it's bound, in fact, which happens much later.
> >
> > On proposal 1: consider the cost of adding a datamodel to Neutron. It has to be respected by all developers, it frequently has to be deployed by all operators, and every future change has to align with it. Plus either it has to be generic or optional, and if optional it's a burden to some proportion of Neutron developers and users. I accept proposal 1 is easy, but it's not universally applicable. It doesn't work with Neil Jerram's plans, it doesn't work with multiple interfaces per host, and it doesn't work with the IPv6 routed-network model I worked on.
>
> Please be more specific. I'm not following your argument here. My proposal doesn't really add much new data model.
>
> We've discussed this with Neil at length. I haven't been able to reconcile our respective approaches into one model that works for both of us and still provides value. The routed segments model needs to somehow handle the L2 details of the underlying network. Neil's model confines L2 to the port and routes to it. The two models can't just be squished together unless I'm missing something.
>
> Could you provide some links so that I can brush up on your ipv6 routed network model? I'd like to consider it but I don't know much about it.
>
> > Given that, I wonder whether proposal 2 could be rephrased.
> >
> > 1: some network types don't allow unbound ports to have addresses, they just get placeholder addresses for each subnet until they're bound
> > 2: 'subnets' on these networks are more special than subnets on other networks. (More accurately, they don't use subnets. It's a shame subnets are core Neutron, because they're pretty horrible and yet hard to replace.)
> > 3: there's an independent (in an extension? In another API endpoint?) datamodel that the network points to and that IPAM refers to to find a port an address. Bonus, people who aren't using funky network types can disable this extension.
> > 4: when the port is bound, the IPAM is referred to, and it's told the binding information of the port.
> > 5: when binding the port, once IPAM has returned its address, the network controller probably does stuff with that address when it completes the binding (like initialising routing).
> > 6: live migration either has to renumber a port or forward old traffic to the new address via route injection. This is an open question now, so I'm mentioning it rather than solving it.
>
> I also left out the migration issue from my email because it affects both proposals equally.
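A minimal sketch of steps 1, 4 and 5 in the list above, combined with the retry fallback mentioned earlier in this message: a port is created with no address, and the address is chosen only once the host is known. Everything here (BindingTimeIpam, Port, bind_port, the host names and the CIDRs) is hypothetical and for illustration only; it is not Neutron's or Nova's API.

```python
import ipaddress


class PoolExhausted(Exception):
    """Raised when a host's address block has no free addresses left."""


class BindingTimeIpam:
    """Toy IPAM that can only allocate once it is told the host."""

    def __init__(self, blocks_by_host):
        # blocks_by_host: hypothetical mapping of host name -> CIDR string.
        self._free = {
            host: list(ipaddress.ip_network(cidr).hosts())
            for host, cidr in blocks_by_host.items()
        }

    def allocate(self, host):
        free = self._free.get(host, [])
        if not free:
            raise PoolExhausted(host)
        return free.pop(0)


class Port:
    def __init__(self):
        self.address = None  # 'not set' placeholder until the port is bound
        self.host = None


def bind_port(port, candidate_hosts, ipam):
    """Try hosts in the scheduler's order; move on when a pool is exhausted."""
    for host in candidate_hosts:
        try:
            port.address = ipam.allocate(host)
        except PoolExhausted:
            continue  # the retry fallback: pick the next candidate host
        port.host = host
        return port
    raise PoolExhausted("no candidate host has a free address")


if __name__ == "__main__":
    ipam = BindingTimeIpam({"host-a": "10.0.1.0/30", "host-b": "10.0.2.0/24"})
    for _ in range(3):
        p = bind_port(Port(), ["host-a", "host-b"], ipam)
        print(p.host, p.address)
```

With a /30 on host-a only two addresses are usable, so the third port falls through to host-b. The point of the sketch is only that the address is a result of binding rather than an input to scheduling.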
> > In fact, adding that hook to IPAM at binding plus setting aside a 'not set' IP address might be all you need to do to make it possible. The IPAM needs data to work out what an address is, but that doesn't have to take the form of existing Neutron constructs.
>
> What about the L2 network for each segment? I suggested creating provider networks for these. Do you have a different suggestion?
>
> What about distinguishing the bound address blocks from the mobile address blocks? For example, the address blocks bound to the segments could be from a private space. A router port may get an address from this private space and be the next hop for public addresses. Or, GoDaddy's model where vms get an address from the segment network and optionally a floating ip which is routed.
>
> Carl
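To illustrate the concern raised earlier in the thread about host routes for floating IPs: if floating IPs are handed out in contiguous blocks per backing network, the routes to advertise collapse into a few prefixes, while scattered addresses each need their own /32. This is a small sketch using Python's standard ipaddress module; the function name routes_to_advertise and the addresses (taken from the 203.0.113.0/24 documentation range) are assumptions, not anything proposed in the thread.

```python
import ipaddress


def routes_to_advertise(floating_ips):
    """Collapse individual floating IPs into the fewest covering prefixes."""
    # A bare address parsed as a network becomes a /32 host route.
    host_routes = [ipaddress.ip_network(ip) for ip in floating_ips]
    return [str(net) for net in ipaddress.collapse_addresses(host_routes)]


if __name__ == "__main__":
    # Four floating IPs allocated from one aligned block on one segment.
    print(routes_to_advertise(
        ["203.0.113.8", "203.0.113.9", "203.0.113.10", "203.0.113.11"]))
    # Three floating IPs scattered across the range.
    print(routes_to_advertise(
        ["203.0.113.8", "203.0.113.21", "203.0.113.130"]))
```

The first call yields a single /30, the second yields three separate /32 routes, which is the case the block-based allocation is meant to avoid where possible.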
__________________________________________________________________________ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev