Re: [openstack-dev] [Neutron][Heat] The Neutron API and orchestration

Zane Bitter Tue, 08 Apr 2014 10:43:32 -0700

On 07/04/14 21:52, Kevin Benton wrote:

I will just provide a few quick points of clarity.


 >Instinctively, I want a Subnet to be something like a virtual VLAN

That's what a network is. The network is the broadcast domain. That's
why you attach ports to the network. The subnet is just blocks of IP
addresses to use on this network. If you have two subnets on the same
network, they will be sharing a broadcast domain. Since DHCP servers do
not answer anonymous queries, there are no conflicts, which is where the
subnet requirement comes in when creating a port.

Oh! Thank you, I don't think I would ever have guessed that from the
Network/Subnet terminology. Looking back at the documentation, I see
that this is stated in the Glossary, so we can put this down to a case
of Déformation professionnelle on my part.

To attach a port to a network and give it an IP from a specific subnet
on that network, you would use the *--fixed-ip subnet_id* option.
Otherwise, the create port request will use the first subnet it finds
attached to that network to allocate the port an IP address. This is why
you are encountering the port -> subnet -> network chain. Subnets provide
the addresses. Networks are the actual layer 2 boundaries.

It sounds like maybe Subnets need to be created independently of
Networks and then passed as a list to the Network when it is created. In
Heat there's no way to even predict which Subnet will be "first" unless
the user adds explicit "depends_on" annotations (and even then, a Subnet
could have been created outside of the template already).
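
To make the --fixed-ip subnet_id point concrete, here is a minimal
sketch of the request body for a "create port" call that names a subnet
without naming an address. It only builds the JSON document and sends
nothing; the helper name build_port_request and the placeholder UUIDs
are made up for illustration, and the body layout assumes the Networking
v2.0 "fixed_ips" list.

```python
import json
from typing import Optional


def build_port_request(network_id: str, subnet_id: Optional[str] = None) -> dict:
    """Build the JSON body for a 'create port' request (illustrative helper)."""
    port = {"network_id": network_id}
    if subnet_id is not None:
        # Ask for an address from this subnet without choosing the address.
        port["fixed_ips"] = [{"subnet_id": subnet_id}]
    return {"port": port}


if __name__ == "__main__":
    # NET_UUID and SUBNET_UUID are placeholders, not real identifiers.
    print(json.dumps(build_port_request("NET_UUID", "SUBNET_UUID"), indent=2))
```

Without subnet_id the body carries only the network, which is the case
where the "first" subnet is chosen for you.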

 >if I can do a create call immediately followed by an update call then
the Neutron API can certainly do this internally

Are you sure you can do that in an update_router call? There are
separate methods to add and remove router interfaces, none of which seem
to be referenced from the update_router method.

You're right, I misinterpreted the docs because there's a line break in
the middle of a URL fragment. That doesn't really change my argument though.

 >It's not exactly clear why the external gateway is special enough that
you can have only one interface of this type on a Router, but not so
special that it would be considered a separate thing.

This is because it currently doubles as an indicator of the default
route. If you had multiple external networks, you would need another
method of specifying which to use for outbound traffic. The reason it's
not a regular port is because the port created for the external gateway
cannot be owned by the tenant since it's attaching to a network that the
tenant does not own. A special port is created for this gateway which
the tenant does not have direct control over so they can't mess with the
external network.

This makes sense, thanks. How does the user find out the UUID of the
external network? Would it be just as easy to supply them with the UUID
of the special port that's created for them?

 >An extra route doesn't behave at all like a static RIB entry (with a
weight and an administrative distance)

You and I have discussed this at length, but I will document it here
for the mailing list. :-)

This allows you to create static routes, which most certainly may live
in the RIB with IP addresses as the next hop. It's up to the neighbor
(or adjacency) discovery components to translate this to an L2 address
(or interface) when it's installed in the RIB. It is very rare to find a

(ZB) I assume you meant FIB here            ^

modern router that doesn't let you configure static routes with IP
addresses.

Agree, but the reason the RIB needs the IP address is so that it can
select from multiple possible routes based on availability. So you can
add a bunch of static routes for the same prefix but with different
costs to the RIB, and the router will select the one with the lowest
cost that it can resolve and insert only that *MAC* address as the
nexthop in the FIB. (Of course the RIB also contains routes from other
sources like e.g. RIP or OSPF, and generally you can specify the
administrative distance of static routes to adjust their ranking with
respect to other sources of routes.)

ExtraRoutes doesn't do anything like that; there's no way to specify a
cost (or an administrative distance). If your route appears in
ExtraRoutes it _will_ be installed (if the nexthop is available,
presumably - or does this generate an error? Hopefully traffic gets
blackholed if the IP address isn't available... I wouldn't be a happy
camper if my VPN traffic got sent to the Internet just because the VPN
gateway was down), like a FIB but unlike a RIB.

So, given that ExtraRoutes is not designed to do anything useful with
the IP address (like select from multiple nexthops based on
availability), why does it need an IP address as an argument?
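
For readers less familiar with the RIB/FIB distinction, the selection
behaviour I'm describing can be sketched roughly as below. This is a toy
model, not how any particular router (or Neutron) implements it; the
class and function names are invented for the example, and "reachable"
stands in for whatever neighbor discovery reports.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class StaticRoute:
    prefix: str       # destination, e.g. "10.0.0.0/8"
    nexthop_ip: str   # next hop as configured by the operator
    distance: int     # administrative distance (lower wins)
    cost: int         # metric within the same distance (lower wins)


def select_active_routes(rib: List[StaticRoute],
                         reachable: Set[str]) -> Dict[str, StaticRoute]:
    """Pick, per prefix, the best route whose next hop currently resolves."""
    active: Dict[str, StaticRoute] = {}
    for route in rib:
        if route.nexthop_ip not in reachable:
            continue  # an unresolved next hop is never installed
        best = active.get(route.prefix)
        if best is None or (route.distance, route.cost) < (best.distance, best.cost):
            active[route.prefix] = route
    return active


# Two candidate routes for the same prefix: a preferred VPN gateway and a backup.
rib = [
    StaticRoute("10.0.0.0/8", "192.0.2.10", distance=1, cost=10),
    StaticRoute("10.0.0.0/8", "192.0.2.20", distance=1, cost=20),
]
```

With both next hops reachable the cost-10 route wins; if 192.0.2.10
disappears the backup takes over; if neither resolves the prefix simply
has no active route. ExtraRoutes has no equivalent of the distance and
cost fields, which is the point of the question above.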

Correct me if I'm wrong, but the fundamental contention is that you
think routes should never be allowed to point to anything not managed by
OpenStack.

It's a little more subtle than that; I don't mind what it points to, as
long as OpenStack knows how to reach it. So my position is that to the
extent that routes depend on things that *are* managed by OpenStack,
that dependency needs to be explicitly reflected in the API (or, failing
that, at least in the Heat resource).

I have to admit that I still don't really understand this concept of
"not managed by OpenStack". This is a network completely defined in
software... where even the Internet has its own UUID... how is it even
possible that there are devices attached that the controller knows
nothing about?

This constraint gives heat the ability to reference neutron
port objects as next hops, which is very useful for resolving
dependencies. However, this gain in dependency management comes at the
cost of tenant routers never being allowed to use devices outside of
neutron as next-hop devices. This may cover many of the use cases, but
it is a breaking change due to the loss of generality.

I'll give an example of why I think this is a big deal. Let's say that
you use some obscure VPN that doesn't have a client available as a
VPNaaS in Neutron. You want to spin up a Nova server to connect to your
corporate network. Let's also say that you now want to make a change to
that server (maybe a software upgrade), which will mean replacing the
Nova server with a newly-built one.

For starters, let's assume that the ExtraRoute takes as an argument
either a Port or a Server UUID, rather than an IP address, as a nexthop
(in AWS, routes take a port, server or gateway as the nexthop).


Router <---
           \
Server <---- Route

The good news is that just by changing the configuration of the existing
Server in your template, Heat will automatically:

1) Create a replacement
2) Switch all references over to the new server
3) Delete the original

It does this by traversing the dependencies of the new template in
create order, followed by the dependencies of the old template in the
reverse (i.e. delete) order to clean up.


Create        Update       Delete
Server' <---- Route  <---- Server
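
The traversal just described can be sketched in a few lines. This is a
simplified model and not Heat's actual code: the replacement server is
modelled as a separate name (Server'), whereas Heat replaces a resource
under the same name, and update_plan is a hypothetical helper. It uses
graphlib from the Python 3.9+ standard library.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

Deps = Dict[str, Set[str]]  # resource name -> names it depends on


def update_plan(old: Deps, new: Deps) -> List[Tuple[str, str]]:
    """Return (action, resource) steps for moving from old to new."""
    plan: List[Tuple[str, str]] = []
    # Walk the new template in create order (dependencies first).
    for name in TopologicalSorter(new).static_order():
        plan.append(("update" if name in old else "create", name))
    # Then walk the old template in reverse order to clean up leftovers.
    old_order = list(TopologicalSorter(old).static_order())
    for name in reversed(old_order):
        if name not in new:
            plan.append(("delete", name))
    return plan


old_template: Deps = {"Router": set(), "Server": set(),
                      "Route": {"Router", "Server"}}
new_template: Deps = {"Router": set(), "Server'": set(),
                      "Route": {"Router", "Server'"}}

if __name__ == "__main__":
    for step in update_plan(old_template, new_template):
        print(step)
```

Because the Route depends on the server in both templates, the
replacement is always created before the Route is repointed, and the
original is deleted only afterwards.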

So we just replaced our VPN gateway without interruption, and without
doing anything special in the template or needing in any way to know how
it works. We hooked everything up in the only way that we conceivably
could, and Heat just did the Right Thing.

If we look at the current plugin in /contrib, it takes an IP address and
there is no dependency relationship:


Router <---- Route

Server

Our first thought would probably be to give the server a static IP. We
won't be able to perform this update, however, because I assume that
Neutron won't allow us to assign the same static IP to two different
servers, and even if it did that would probably cause chaos in the
network. So we'll have to change the static IP, and therefore the Route,
during the update. Here's what happens:


Update
Route

Create        Delete
Server' <---- Server

So the Route gets updated before the new server has even been created. I
assume this doesn't cause an error, but it at least guarantees a
disruption of service.

There are ways around this, including using "get_attr" to retrieve the
IP address of the nexthop from the Server (which creates a dependency).
But they're not obvious, require a detailed level of knowledge of the
system, and you'll most likely only find out that you got it wrong when
you go to upgrade your VPN endpoint in production.
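
To show why get_attr helps, here is a rough sketch of how references
inside a resource's properties turn into dependencies. It is not Heat's
implementation; referenced_resources is an invented helper, and the
property names (router_id, destination, nexthop), the resource names and
the first_address attribute are assumptions chosen for the example.

```python
from typing import Any, Set


def referenced_resources(value: Any) -> Set[str]:
    """Collect resource names referenced via get_resource or get_attr."""
    found: Set[str] = set()
    if isinstance(value, dict):
        for key, arg in value.items():
            if key == "get_resource" and isinstance(arg, str):
                found.add(arg)
            elif (key == "get_attr" and isinstance(arg, list)
                  and arg and isinstance(arg[0], str)):
                found.add(arg[0])  # first element names the resource
                found |= referenced_resources(arg[1:])
            else:
                found |= referenced_resources(arg)
    elif isinstance(value, list):
        for item in value:
            found |= referenced_resources(item)
    return found


# Next hop given as a literal address: nothing ties the route to the server.
literal_route = {
    "router_id": {"get_resource": "router"},
    "destination": "10.0.0.0/8",
    "nexthop": "192.168.0.5",
}

# Next hop read from the server: the server becomes a dependency.
attr_route = {
    "router_id": {"get_resource": "router"},
    "destination": "10.0.0.0/8",
    "nexthop": {"get_attr": ["vpn_server", "first_address"]},
}
```

The first form depends only on the router; the second also depends on
vpn_server, which is what puts the Route after the Server during an
update.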

Looking back at the history of your patch, I see that the hidden
dependencies have been whittled down to two - the RouterGateway (which
we have fixed by deprecating it), and the RouterInterfaces which, I have
discovered in the process of researching this thread, suffer from
exactly the same design problem and need to be deprecated in exactly the
same way. So my objection to your patch on the grounds of hidden
dependencies on RouterInterfaces was misplaced. The race conditions
remain a concern, but I guess from a Heat perspective one that we can
claim is Not Our Fault. I would still prefer to see some sort of
interface that results in the dependencies mentioned above occurring
naturally in the template - the alternative is to heavily document how
to do this yourself using get_attr to get, e.g. the IP address of a
Server or the peer address of a VPNaaS gateway... but if it's possible
to do this manually it seems like it should also be possible to
automatically retrieve the IP addresses in Heat.


cheers,
Zane.

--
Kevin Benton


On Mon, Apr 7, 2014 at 5:28 PM, Zane Bitter <[email protected]> wrote:

    The Neutron API is a constant cause of pain for us as Heat
    developers, but afaik we've never attempted to bring up the issues
    we have found in a cross-project forum. I've recently been doing
    some more investigation and I want to document the exact ways in
    which the current Neutron API breaks orchestration, both in the hope
    that a future version of it might be better and as a guide for other
    API authors.

    BTW it's my contention that an API that is bad for orchestration is
    also hard to use for the ordinary user as well. When you're trying
    to figure out the order of operations you need to do, there are two
    times at which you could find out you've got it wrong:

    1) Before you run the command, when you realise you don't have all
    of the required data yet; or
    2) After you run the command, when you get a cryptic error message.

    Not only is (1) *mandatory* for a data-driven orchestration system
    like Heat, it offers orders-of-magnitude better user experience for
    everyone.

    I should say at the outset that I know next to nothing about
    Neutron, and one of the goals of this message is to find out which
    parts I am completely wrong about. I did know a little bit about
    traditional networking at one time, and even remember some of it ;)


    Neutron has a little documentation on workflow, so let's begin
    there:
    
    http://docs.openstack.org/api/openstack-network/2.0/content/Overview-d1e71.html#Theory

    (1) Create a network
    Instinctively, I want a Network to be something like a virtual VRF
    (VVRF?): a separate namespace with its own route table, within
    which subnet prefixes are not overlapping, but which is completely
    independent of other Networks that may contain overlapping subnets.
    As far as I can tell, this basically seems to be the case. The
    difference, of course, is that instead of having to configure a VRF
    on every switch/router and make sure they're all in sync and
    connected up in the right ways, I just define it in one place
    globally and Neutron does the rest. I call this #winning. Nice work,
    Neutron.

    (2) Associate a subnet with the network
    Slightly odd choice of words, because you're actually creating a new
    Subnet (there's no such thing as a Subnet not associated with a
    Network), but this is probably just a minor documentation nit.
    Instinctively, I want a Subnet to be something like a virtual VLAN
    (VVLAN?): at its most basic level, just a group of ports that share
    a broadcast domain, but also having other properties (e.g. if L3 is
    in use, all IP addresses in the subnet should be in the same CIDR).
    This doesn't seem to be the case, though; it's just a CIDR prefix,
    which leaves me wondering how L2 traffic will be treated, as well as
    how I would do things like use both IPv4 and IPv6 on a single port
    (by assigning a port to multiple Subnets?). Looking at the docs,
    there is a much bigger emphasis on DHCP client settings than I
    expected - surely I might want to give two sets of ports in
    the same Subnet different DHCP configs? Still, this is not bad - the
    DHCP configuration is done by the time the Subnet is created, so
    there's no problem in connecting stuff to it immediately after.

    (3) Boot a VM and attach it to the network
    Here's where you completely lost me. I just created a Subnet - maybe
    a bunch of Subnets. I don't want to attach my VM just anywhere in
    the *Network*, I want to attach it to a *particular* Subnet. It's
    not at all obvious where my instance will get attached (at random?),
    because this API just plain takes the Wrong data type. As a user,
    I'm irritated and confused.

    The situation for orchestration, though, is much, much worse.
    Because the server takes a reference to a network, the dependency
    graph generated from my template will look like this:

       Network <---------- Subnet
             ^
             \
              ------------ Server

    And yet if the Server is created before the Subnet (as will happen
    ~50% of the time), it will fail. And vice-versa on delete, where the
    server must be removed before the subnet. The dependency graph we
    needed to create was this:

       Network <---------- Subnet <---------- Server

    The solution used here was to jury-rig the resource types in Heat
    with a hidden dependency. We can't know which Subnet the server will
    end up attached to, so we create hidden dependencies on all of the
    ones defined in the same template. There's nothing we can do about
    Subnets defined in different templates (Heat allows a tree of
    templates to be instantiated with a single command) - I'm not sure,
    but it may be possible even now to create a tree of stacks that in
    practice could never be successfully deleted.
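
A rough sketch of the hidden-dependency workaround described above,
under heavy simplification: the flat resource dictionaries, the "type"
and "network" keys and the helper name are all invented for
illustration and do not match Heat's real resource schema or code.

```python
from typing import Dict, Set


def hidden_subnet_dependencies(resources: Dict[str, dict]) -> Dict[str, Set[str]]:
    """For each server, depend on every subnet in the template on its network."""
    subnets_by_network: Dict[str, Set[str]] = {}
    for name, res in resources.items():
        if res["type"] == "subnet":
            subnets_by_network.setdefault(res["network"], set()).add(name)

    hidden: Dict[str, Set[str]] = {}
    for name, res in resources.items():
        if res["type"] == "server":
            # We cannot tell which subnet the server will land on,
            # so wait for all of them.
            hidden[name] = set(subnets_by_network.get(res["network"], set()))
    return hidden


template = {
    "net": {"type": "network"},
    "subnet_a": {"type": "subnet", "network": "net"},
    "subnet_b": {"type": "subnet", "network": "net"},
    "server": {"type": "server", "network": "net"},
}
```

Here the server ends up waiting on both subnet_a and subnet_b. A subnet
defined in a different template never appears in the dictionary, so it
cannot be covered this way.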

    The Neutron models in Heat are so riddled with these kinds of
    invisible special-case hacks that all of our public documentation
    about how Heat can be expected to respond to a particular template
    is rendered effectively meaningless with respect to Neutron.

    I should add that we can't blame Nova here, because explicitly
    creating a Port doesn't help - it too takes only a network argument,
    despite _requiring_ a Subnet that it will be attached to, presumably
    at random. In fact using a Port makes things even worse, because
    although there is an API for it, Nova and Neutron seem to assume that
    nobody would ever use it, and therefore even if you create a port
    explicitly and pass it to Nova to connect a Server, when you
    disconnect the Server again the Port will be deleted at the same
    time as if you had let Nova create it implicitly for you. This issue
    is currently breaking stack updates because we tend to assume that
    once we've explicitly created something, it stays created.

    Evidently there is a mechanism for associating a Port with a Subnet,
    and that's by assigning a fixed IP - which is hardly ever what I
    want. There's no middle ground that I can find between specifying
    the exact, fixed IP for a port and just letting it end up somewhere
    - anywhere - on the network, entirely at random.


    Let's move on to the L3 extension, starting with Routers. There's
    kind of an inconsistency here, because Routers are virtual devices
    that I need to manage. Hitherto, the point of Neutron was to free me
    from managing individual devices and let me manage the network as a
    whole. Is there a reason I wouldn't want all of the Subnets in the
    Network to just do the Right Thing and make sure everywhere is
    reachable efficiently from everywhere else? If I want something
    separate, wouldn't I use a different Network? (It's not like I have
    any control over where in a Network ports get attached anyway.)

    Nonetheless, Routers exist and it appears I have to create one to
    route packets between Subnets. From an orchestration perspective,
    I'd like Router to take a list of Ports to attach to (and of course
    I'd like each Port to be explicitly associated with a Subnet!). I'd
    be out of luck though, because even though the Port list is a
    property of a Router, you can't set it at creation time, only
    through an update. This is by definition possible to do at creation
    time (if I can do a create call immediately followed by an update
    call then the Neutron API can certainly do this internally), so it's
    very strange to see it disallowed. Following this API led us to
    implement it wrong in Heat as well, leading to headaches with
    floating IPs, about which more later. We also mistakenly used a
    similar design for the Router's external gateway, but later
    corrected it by making it a property of the Router, as it is in the
    API (though we still have to live with a lengthy deprecation
    period). We'll probably end up doing the same with the interfaces.

    Of course it goes without saying that the router gateway is just a
    reference to another network and, once again, requires a hidden
    dependency on all of the Subnets in the hopes of picking up the
    right one. BTW I'm just assuming that the definition of the gateway
    is "interface to another Network over which I will do NAT"? I assume
    that because of the generic way in which Floating IPs are handled,
    with a reference to an external network (I guess the operator
    provides the user with the Network UUID for the Internet?) It's not
    exactly clear why the external gateway is special enough that you
    can have only one interface of this type on a Router, but not so
    special that it would be considered a separate thing. There is also
    a separate Network Gateway, and I have no idea what that is...

    The big problem with Floating IPs is that you can't create them
    until all the necessary hops in the internetwork have been set up.
    And, once again, there's nothing in the creation parameters that
    would naturally order them - you just pass a reference to the
    external network. We still have a bug open on this, but what we will
    have to do is create a hidden dependency on any RouterInterfaces
    that connect any Routers whose external gateway is the same network
    where the floating IP is allocated. That's about as horrible as it
    sounds. A Floating IP needs to take as an argument a reference to
    the Router/Gateway which does the NAT:

    External       External
    Network  <---- Subnet   <---- (gateway)
                                 \
                                Router <---- Floating IP
    Internal                     /               /
    Network  <---- Subnet <------<---- Port <----

    The bane of my existence during Icehouse development has been the
    ExtraRoutes table. First off, this is broken in a way completely
    unrelated to orchestration: you can't add, remove or change an entry
    in the table without rewriting the whole table, so the whole API is
    a giant race condition waiting to happen. (This can, and IMHO
    should, be fixed - at least for those using the official client -
    with an ETags header and the 409 return code.) Everything about this
    API, though, is strange. It's another one of those only-on-update
    properties of a Router, though in this case that's forced by the
    fact that you can't attach the Router to its Subnets during its
    creation. An extra route doesn't behave at all like a static RIB
    entry (with a weight and an administrative distance), but much like
    a FIB entry (i.e. it's for routes that have already been selected to
    be active). That part makes sense, but the next hop for a FIB entry
    is a layer 2 address and this takes an IP address. That makes no
    sense to me, since the IP address(es) assigned to the nexthop play
    no part in how packets are forwarded. And, of course, it creates
    massive dependency issues, because we don't know which ports are
    going to end up with the IP addresses required. This API should take
    a reference to a Port as the nexthop. I've been told we can't even
    simulate this in Heat at the moment because a VPN connection doesn't
    have a port associated with it. (If the API accepted _either_ a Port
    or a VPN connection, that would be fine by me though.) So far we've
    been unable to merge ExtraRoutes into Heat, except for a plugin in
    /contrib, for want of a way to make this reliably work in the
    correct dependency order without resorting to progressively worse hacks.
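
The ETag suggestion in the paragraph above amounts to optimistic
concurrency control on the whole route table. The sketch below is an
in-memory toy, not the Neutron API: the RouteTable class and its
get/put methods are hypothetical, and the 409 status follows the
suggestion in the text (HTTP itself defines 412 Precondition Failed for
a failed If-Match).

```python
import hashlib
import json
from typing import List, Tuple


class RouteTable:
    """Toy whole-table resource guarded by a content-derived ETag."""

    def __init__(self) -> None:
        self.routes: List[dict] = []

    def etag(self) -> str:
        payload = json.dumps(self.routes, sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

    def get(self) -> Tuple[List[dict], str]:
        # Return a copy of the table together with its current ETag.
        return list(self.routes), self.etag()

    def put(self, routes: List[dict], if_match: str) -> int:
        # Refuse the rewrite if the table changed since the caller read it.
        if if_match != self.etag():
            return 409
        self.routes = list(routes)
        return 200


table = RouteTable()
seen_a, tag_a = table.get()   # client A reads the table
seen_b, tag_b = table.get()   # client B reads the same version
status_a = table.put(seen_a + [{"destination": "10.0.0.0/8",
                                "nexthop": "192.0.2.1"}], tag_a)
status_b = table.put(seen_b + [{"destination": "172.16.0.0/12",
                                "nexthop": "192.0.2.2"}], tag_b)
```

Client A's write succeeds; client B's is rejected rather than silently
discarding A's route, and B has to re-read and retry.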

    I'm sure fresh horrors await in corners I have not yet dug into. I
    must say that the VPN Service, happily, is one that seems to have
    done things right. Firewall looks pretty good in itself, although
    the fact that it is completely disjoint from any other configuration
    - i.e. you can't even specify which network it applies to, let alone
    which gateway - is incomprehensible.


    Over the past couple of development cycles, we've seen a number of
    proposals to push orchestration-like features into Neutron itself.
    It is now clear to me why: because the Neutron API is illegible to
    external orchestration tools, this leads to people wanting to do an
    end run around it.

    I don't expect that the current API can be fixed without breaking
    backwards compatibility, but I hope that folks will take these
    concepts into account the next time the Neutron API gets revised. (I
    also hope we won't see any more proposals to effectively reimplement
    Heat behind the Neutron API ;) Please feel free to include [Heat] in
    any discussion along those lines, we'd be happy to give feedback on
    any given API designs. In exchange, if any Neutron folks are able to
    explain the exact ways in which my ideas about how the current
    Neutron API does and/or should work are wrong and/or crazy, I would
    be most appreciative :)

    cheers,
    Zane.

    _________________________________________________
    OpenStack-dev mailing list
    [email protected]
    http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




--
Kevin Benton

