Re: [openstack-dev] [neutron][heat] - making Neutron more friendly for orchestration

Monty Taylor Fri, 19 May 2017 14:15:48 -0700

On 05/19/2017 04:03 PM, Kevin Benton wrote:

I split this conversation off of the "Is the pendulum swinging on PaaS
layers?" thread [1] to discuss some improvements we can make to Neutron
to make orchestration easier.


There are some pain points that heat has when working with the Neutron
API. I would like to get them converted into requests for enhancements
in Neutron so the wider community is aware of them.

Starting with the port/subnet/network relationship - it's important to
understand that IP addresses are not required on a port.

So knowing now that a Network is a layer-2 network segment and a Subnet
is... effectively a glorified DHCP address pool

Yes, a Subnet controls IP address allocation as well as setting up
routing for routers, which is why routers reference subnets instead of
networks (different routers can route for different subnets on the same
network). It essentially dictates things related to L3 addressing and
provides information for L3 reachability.

But at the end of the day, I still can't create a Port until a Subnet exists


This is only true if you want an IP address on the port. This sounds
silly for most use cases, but there is a non-trivial portion of NFV
workloads that do not want IP addresses at all so they create a network
and just attach ports without creating any subnets.

I still don't know what Subnet a Port will be attached to (unless the
user specifies it explicitly using the --fixed-ip option... regardless
of whether they actually specify a fixed IP),

So what would you like Neutron to do differently here? Always force a
user to pick which subnet they want an allocation from if there are
multiple? If so, can't you just force that explicitness in Heat?
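
To make the two cases above concrete, here is a minimal sketch of the port-create request body, assuming the standard Neutron v2.0 shape ({"port": {...}}). The helper name port_create_body and the placeholder IDs are made up for illustration; nothing here talks to a real cloud.

```python
import json


def port_create_body(network_id, subnet_id=None):
    """Build a Neutron v2.0 port-create body (hypothetical helper).

    Without subnet_id, fixed_ips is left out and Neutron decides; on a
    network that has no subnets the port simply gets no IP address.
    """
    port = {"network_id": network_id}
    if subnet_id is not None:
        # Naming the subnet makes the port -> subnet dependency explicit.
        port["fixed_ips"] = [{"subnet_id": subnet_id}]
    return {"port": port}


# Placeholder IDs, not real UUIDs.
print(json.dumps(port_create_body("NETWORK-ID")))
print(json.dumps(port_create_body("NETWORK-ID", subnet_id="SUBNET-ID")))
```

Forcing the second form in the orchestration layer is one way to get the explicitness asked about above without changing Neutron.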

and I have no way in general of telling which Subnets can be deleted
before a given Port is and which will fail to delete until the Port
disappears.


A given port will only block subnet deletions from subnets it is
attached to. Conversely, you can see all ports with allocations from a
subnet with 'neutron port-list --fixed-ips subnet_id=<subnet-UUID>'.  So
is the issue here that the dependency wasn't made explicit in the heat
modeling (leading to the problem above and this one)?
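
The blocking rule can be sketched client-side as well. This is a minimal illustration, assuming port dicts shaped like the Neutron API's port representation (an "id" plus a "fixed_ips" list of {"subnet_id", "ip_address"} entries); the function name and sample data are hypothetical.

```python
def ports_blocking_subnet(ports, subnet_id):
    """Return the IDs of ports that hold an allocation from subnet_id."""
    return [
        port["id"]
        for port in ports
        if any(ip.get("subnet_id") == subnet_id
               for ip in port.get("fixed_ips", []))
    ]


# Made-up sample data in the shape of a port listing.
ports = [
    {"id": "port-a",
     "fixed_ips": [{"subnet_id": "subnet-1", "ip_address": "10.0.0.5"}]},
    {"id": "port-b", "fixed_ips": []},  # no IP address, so it blocks nothing
    {"id": "port-c",
     "fixed_ips": [{"subnet_id": "subnet-2", "ip_address": "10.0.1.7"}]},
]

print(ports_blocking_subnet(ports, "subnet-1"))  # ['port-a']
```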


For the individual bugs you highlighted, it would be good if you can
provide some details about what changes we could make to help.


https://bugs.launchpad.net/heat/+bug/1442121 - This looks like a result
of partially specified floating IPs (no fixed_ip). What can we
add/change here to help? Or can heat just always force the user to
specify a fixed IP for the case where disambiguation on multiple
fixed_ip ports is needed?

If the server has more than one fixed_ip port, it's possible that only
one of them will be able to receive a floating ip. The subnet a port
comes from must have gateway_ip set for a floating_ip to attach to it.
So if you have a server, you can poke and find the right fixed_ip in all
cases except when the server has more than one fixed_ip and each of them
is from a subnet with a gateway_ip. In that case, a user _must_ provide
a fixed_ip, because there is no way to know what they intend.
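
The selection rule described in the paragraph above could be sketched roughly as follows. This is only an illustration of that rule as stated, not Heat's or Shade's actual code; the function and exception names are hypothetical, and the dicts assume Neutron-style "fixed_ips" and "gateway_ip" fields.

```python
class AmbiguousFixedIP(Exception):
    """Raised when the caller must say which fixed IP they mean."""


def pick_fixed_ip(ports, subnets, requested=None):
    """Pick the fixed IP a floating IP should attach to.

    ports:   list of port dicts, each with a "fixed_ips" list.
    subnets: mapping of subnet_id -> subnet dict with "gateway_ip".
    """
    # Only fixed IPs on subnets with a gateway_ip are eligible.
    candidates = [
        ip["ip_address"]
        for port in ports
        for ip in port.get("fixed_ips", [])
        if subnets[ip["subnet_id"]].get("gateway_ip")
    ]
    if requested is not None:
        if requested not in candidates:
            raise ValueError(f"{requested} is not on a subnet with a gateway_ip")
        return requested
    if not candidates:
        raise ValueError("no fixed IP is on a subnet with a gateway_ip")
    if len(candidates) > 1:
        # More than one eligible address: intent cannot be inferred.
        raise AmbiguousFixedIP(f"specify one of: {candidates}")
    return candidates[0]


# Made-up sample data: only subnet s1 has a gateway, so the choice is clear.
subnets = {"s1": {"gateway_ip": "10.0.0.1"}, "s2": {"gateway_ip": None}}
ports = [
    {"fixed_ips": [{"subnet_id": "s1", "ip_address": "10.0.0.5"}]},
    {"fixed_ips": [{"subnet_id": "s2", "ip_address": "10.0.1.5"}]},
]
print(pick_fixed_ip(ports, subnets))  # 10.0.0.5
```

If both subnets had a gateway_ip, the same call would raise AmbiguousFixedIP until the caller passed requested explicitly.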

https://launchpad.net/bugs/1626607 - I see this is about a dependency
between RouterGateways and RouterInterfaces, but it's not clear to me
why that dependency exists. Is it to solve a lack of visibility into the
interfaces required for a floating IP?

https://bugs.launchpad.net/heat/+bug/1626619,
https://bugs.launchpad.net/heat/+bug/1626630, and
https://bugs.launchpad.net/heat/+bug/1626634 - These seem similar to
1626607. Can we just expose the interfaces/router a floating IP is
depending on explicitly in the API for you to fix these? If not, what
can we do to help here?


1. http://lists.openstack.org/pipermail/openstack-dev/2017-May/117106.html

Cheers,
Kevin Benton

On Fri, May 19, 2017 at 1:05 PM, Zane Bitter wrote:

    On 19/05/17 15:06, Kevin Benton wrote:

            Don't even get me started on Neutron.[2]


        It seems to me the conclusion to that thread was that the
        majority of your issues stemmed from the fact that we had poor
        documentation at the time.  A major component of the complaints
        resulted from you misunderstanding the difference between
        networks/subnets in Neutron.


    It's true that I was completely off base as to what the various
    primitives in Neutron actually do. (Thanks for educating me!) The
    implications for orchestration are largely unchanged though. It's a
    giant pain that we have to infer implicit dependencies between stuff
    to get them to create/delete in the right order, pretty much
    independently of what that stuff does.

    So knowing now that a Network is a layer-2 network segment and a
    Subnet is... effectively a glorified DHCP address pool, I understand
    better why it probably seemed like a good idea to hook stuff up
    magically. But at the end of the day, I still can't create a Port
    until a Subnet exists, I still don't know what Subnet a Port will be
    attached to (unless the user specifies it explicitly using the
    --fixed-ip option... regardless of whether they actually specify a
    fixed IP), and I have no way in general of telling which Subnets can
    be deleted before a given Port is and which will fail to delete
    until the Port disappears.
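
As an illustration of the ordering problem, here is a minimal sketch of deriving a safe delete order once the port-to-subnet links are known (for example from each port's fixed_ips). It assumes Python 3.9+ for graphlib; the function name and sample data are hypothetical, and this is not how Heat itself models dependencies.

```python
from graphlib import TopologicalSorter


def deletion_order(ports, subnet_ids):
    """Order resources so each port is deleted before the subnets it uses."""
    sorter = TopologicalSorter()
    for subnet_id in subnet_ids:
        sorter.add(subnet_id)
    for port in ports:
        sorter.add(port["id"])
        for ip in port.get("fixed_ips", []):
            # add(node, predecessor): the port must go before the subnet.
            sorter.add(ip["subnet_id"], port["id"])
    return list(sorter.static_order())


# Made-up sample data: port-a holds an address from subnet-1 only.
ports = [{"id": "port-a", "fixed_ips": [{"subnet_id": "subnet-1"}]}]
order = deletion_order(ports, ["subnet-1", "subnet-2"])
print(order)
```

Here subnet-2 has no ordering constraint, while subnet-1 always comes after port-a.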

        There are some legitimate issues in there about the extra routes
        extension being replace-only and the routers API not accepting a
        list of interfaces in POST.  However, it hardly seems that those
        are worthy of "Don't even get me started on Neutron."


    https://launchpad.net/bugs/1626607
    https://launchpad.net/bugs/1442121
    https://launchpad.net/bugs/1626619
    https://launchpad.net/bugs/1626630
    https://launchpad.net/bugs/1626634

        It would be nice if you could write up something about current
        gaps that would make Heat's life easier, because a large chunk
        of that initial email is incorrect and linking to it as a big
        list of "issues" is counter-productive.


    Yes, agreed. I wish I had a clean thread to link to. It's a huge
    amount of work to research it all though.

    cheers,
    Zane.

        On Fri, May 19, 2017 at 7:36 AM, Zane Bitter wrote:

            On 18/05/17 20:19, Matt Riedemann wrote:

                I just wanted to blurt this out since it hit me a few times at
                the summit, and see if I'm misreading the rooms.

                For the last few years, Nova has pushed back on adding
                orchestration to the compute API, and even defined a policy for
                it since it comes up so much [1]. The stance is that the compute
                API should expose capabilities that a higher-level orchestration
                service can stitch together for a more fluid end user
                experience.


            I think this is a wise policy.

                One simple example that comes up time and again is allowing a
                user to pass volume type to the compute API when booting from
                volume such that when nova creates the backing volume in Cinder,
                it passes through the volume type. If you need a non-default
                volume type for boot from volume, the way you do this today is
                first create the volume with said type in Cinder and then
                provide that volume to the compute API when creating the server.
                However, people claim that is bad UX or hard for users to
                understand, something like that (at least from a command line, I
                assume Horizon hides this, and basic users should probably be
                using Horizon anyway right?).


            As always, there's a trade-off between simplicity and
            flexibility. I can certainly understand the logic in wanting to
            make the simple stuff simple. But users also need to be able to
            progress from simple stuff to more complex stuff without having
            to give up and start over. There's a danger of leading them down
            the garden path.

                While talking about claims in the scheduler and a top-level
                conductor for cells v2 deployments, we've talked about the
                desire to eliminate "up-calls" from the compute service to the
                top-level controller services (nova-api, nova-conductor and
                nova-scheduler). Build retries is one such up-call. CERN
                disables build retries, but others rely on them, because of how
                racy claims in the computes are (that's another story and why
                we're working on fixing it). While talking about this, we asked,
                "why not just do away with build retries in nova altogether? If
                the scheduler picks a host and the build fails, it fails, and
                you have to retry/rebuild/delete/recreate from a top-level
                service."


            (FWIW Heat does this for you already.)

                But during several different Forum sessions, like user API
                improvements [2] but also the cells v2 and claims in the
                scheduler sessions, I was hearing about how operators only
                wanted to expose the base IaaS services and APIs and end API
                users wanted to only use those, which means any improvements in
                those APIs would have to be in the base APIs (nova, cinder,
                etc). To me, that generally means any orchestration would have
                to be baked into the compute API if you're not using Heat or
                something similar.


            The problem is that orchestration done inside APIs is very easy
            to do badly in ways that cause lots of downstream pain for users
            and external orchestrators. For example, Nova already does some
            orchestration: it creates a Neutron port for a server if you
            don't specify one. (And then promptly forgets that it has done
            so.) There is literally an entire inner platform, an
            orchestrator within an orchestrator, inside Heat to try to
            manage the fallout from this. And the inner platform shares none
            of the elegance, such as it is, of Heat itself, but is rather a
            collection of cobbled-together hacks to deal with the seemingly
            infinite explosion of edge cases that we kept running into over
            a period of at least 5 releases.

            The get-me-a-network thing is... better, but there's no
            provision for changes after the server is created, which means
            we have to copy-paste the Nova implementation into Heat to deal
            with update.[1] Which sounds like a maintenance nightmare in the
            making. That seems to be a common mistake: to assume that once
            users create something they'll never need to touch it again,
            except to delete it when they're done.

            Don't even get me started on Neutron.[2]

            Any orchestration that is done behind-the-scenes needs to be
            done superbly well, provide transparency for external
            orchestration tools that need to hook in to the data flow, and
            should be developed in consultation with potential consumers
            like Shade and Heat.

                Am I missing the point, or is the pendulum really swinging away
                from PaaS layer services which abstract the dirty details of the
                lower-level IaaS APIs? Or was this always something people
                wanted and I've just never made the connection until now?


            (Aside: can we stop using the term 'PaaS' to refer to
            "everything that Nova doesn't do"? This habit is not helping us
            to communicate clearly.)

            cheers,
            Zane.

            [1] https://review.openstack.org/#/c/407328/
            [2] http://lists.openstack.org/pipermail/openstack-dev/2014-April/032098.html



        
            __________________________________________________________________________
            OpenStack Development Mailing List (not for usage questions)
            Unsubscribe: [email protected]?subject:unsubscribe
            http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




        
