Matt- Yeah, clearly other projects have the same issue this blueprint is trying to solve in nova. I think the idea is that, once the infrastructure is in place and nova has demonstrated the concept, other projects can climb aboard.

It's conceivable that the new get_service_url() method could be moved to a more common lib (ksa or os-client-config perhaps) in the future to facilitate this.
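For illustration, a minimal sketch of what such a helper could look like using keystoneauth1's conf-loading helpers. The name get_service_url comes from the blueprint; the plumbing below is a guess, not the actual implementation:

    from keystoneauth1 import loading as ks_loading

    def get_service_url(conf, group):
        # Sketch only: load the auth, session, and adapter options
        # registered under `group` (e.g. [glance]) in nova's config.
        auth = ks_loading.load_auth_from_conf_options(conf, group)
        session = ks_loading.load_session_from_conf_options(
            conf, group, auth=auth)
        adapter = ks_loading.load_adapter_from_conf_options(
            conf, group, session=session, auth=auth)
        # get_endpoint() honors endpoint_override when set; otherwise it
        # discovers the one URL from the keystone service catalog.
        return adapter.get_endpoint()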
Eric (efried)

On 05/01/2017 09:17 AM, Matthew Treinish wrote:
> On Mon, May 01, 2017 at 05:00:17AM -0700, Flavio Percoco wrote:
>> On 28/04/17 11:19 -0500, Eric Fried wrote:
>>> If it's *just* glance we're making an exception for, I prefer #1 (don't deprecate/remove [glance]api_servers). It's way less code & infrastructure, and it discourages others from jumping on the multiple-endpoints bandwagon. If we provide endpoint_override_list (handwave), people will think it's okay to use it.
>>>
>>> Anyone aware of any other services that use multiple endpoints?
>>
>> Probably a bit late, but yeah, I think this makes sense. I'm not aware of other projects that have lists of api_servers.
>
> I thought it was just nova too, but it turns out cinder has the exact same option as nova (I hit this in my devstack patch trying to get glance deployed as a wsgi app):
>
> https://github.com/openstack/cinder/blob/d47eda3a3ba9971330b27beeeb471e2bc94575ca/cinder/common/config.py#L51-L55
>
> Although from what I can tell you don't have to set it, and it will fall back to using the catalog, assuming you configured the catalog info for cinder:
>
> https://github.com/openstack/cinder/blob/19d07a1f394c905c23f109c1888c019da830b49e/cinder/image/glance.py#L117-L129
>
> -Matt Treinish
>
>>> On 04/28/2017 10:46 AM, Mike Dorman wrote:
>>>> Maybe we are talking about two different things here? I'm a bit confused.
>>>>
>>>> Our Glance config in nova.conf on HVs looks like this:
>>>>
>>>> [glance]
>>>> api_servers=http://glance1:9292,http://glance2:9292,http://glance3:9292,http://glance4:9292
>>>> glance_api_insecure=True
>>>> glance_num_retries=4
>>>> glance_protocol=http
>>
>> FWIW, this feature is being used as intended. I'm sure there are ways to achieve this using external tools like haproxy/nginx, but that adds an extra burden to ops that is probably not necessary since this functionality is already there.
>>
>> Flavio
>>
>>>> So we do provide the full URLs, and there is SSL support. Right? I am fairly certain we tested this to ensure that if one URL fails, nova goes on to retry the next one. That failure does not get bubbled up to the user (which is ultimately the goal.)
>>>>
>>>> I don't disagree with you that the client-side choose-a-server-at-random is not a great load balancer. (But isn't this roughly the same thing that oslo.messaging does when we give it a list of RMQ servers?) For us it's more about the failure handling if one is down than it is about actually equally distributing the load.
>>>>
>>>> In my mind options One and Two are the same, since today we are already providing full URLs and not only server names. At the end of the day, I don't feel like there is a compelling argument here to remove this functionality (that people are actively making use of.)
>>>>
>>>> To be clear, I, and I think others, are fine with nova by default getting the Glance endpoint from Keystone. And that in Keystone there should exist only one Glance endpoint. What I'd like to see remain is the ability to override that for nova-compute, and to target more than one Glance URL for purposes of failover.
>>>>
>>>> Thanks,
>>>> Mike
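As a rough sketch of the fallback behavior Matt describes in cinder - the configured list wins when present, otherwise the catalog is consulted. All names here are illustrative stand-ins, not cinder's actual code:

    import random

    def pick_glance_url(api_servers, catalog):
        # `api_servers` stands in for the glance_api_servers conf option;
        # `catalog` for a keystone service-catalog lookup object.
        if api_servers:
            # Configured list wins; one entry is chosen at random per call.
            return random.choice(list(api_servers))
        # No list configured: fall back to the catalog's image endpoint.
        return catalog.url_for(service_type='image')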
>>>> On 4/28/17, 8:20 AM, "Monty Taylor" <mord...@inaugust.com> wrote:
>>>>
>>>> Thank you both for your feedback - that's really helpful.
>>>>
>>>> Let me say a few more words about what we're trying to accomplish here overall so that maybe we can figure out what the right way forward is. (It may be keeping the glance api_servers setting, but let me at least make the case real quick.)
>>>>
>>>> From a 10,000-foot view, the thing we're trying to do is to get nova's consumption of all of the OpenStack services it uses to be less special.
>>>>
>>>> The clouds have catalogs which list information about the services - public, admin and internal endpoints and whatnot - and then we're asking admins to not only register that information with the catalog, but to also put it into nova.conf. That means that any updating of that info needs to be an API call to keystone and also a change to nova.conf. If we, on the other hand, use the catalog, then nova can pick up changes in real time as they're rolled out to the cloud - and there is hopefully a sane set of defaults we could choose (based on operator feedback like what you've given) so that in most cases you don't have to tell nova where to find glance _at_all_, because the cloud already knows where it is. (Nova would know to look in the catalog for the internal interface of the image service, for instance - there's no need to ask an operator to add to the config "what is the service_type of the image service we should talk to" :) )
>>>>
>>>> Now - glance, and the thing you like that we don't, is especially hairy because of the api_servers list. The list, as you know, is just a list of servers, not even of URLs. This means it's not possible to configure nova to talk to glance over SSL (which I know you said works for you, but we'd like for people to be able to choose to SSL all their things). We could add that, but it would be an additional pile of special config. Because of all of that, we also have to attempt to make working URLs from what is usually a list of IP addresses. This is also clunky and prone to failure.
>>>>
>>>> The implementation on the underside of the api_servers code is the world's dumbest load balancer. It picks a server from the list at random and uses it. There is no facility for dealing with a server in the list that stops working, or for allowing rolling upgrades like there would be with a real load balancer across the set. If one of the API servers goes away, we have no context to know that, so just some of your internal calls to glance fail.
>>>>
>>>> Those are the issues - basically:
>>>> - current config is special and fragile
>>>> - impossible to SSL
>>>> - an inflexible, underpowered de-facto software load balancer
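A minimal sketch of the "world's dumbest load balancer" Monty describes (illustrative, not nova's actual code):

    import random

    # From [glance]api_servers; values shown are illustrative.
    API_SERVERS = ['http://glance1:9292', 'http://glance2:9292']

    def next_glance_url():
        # Random pick per API call, with no health checks and no memory
        # of failures: a dead server stays in the rotation, so roughly
        # 1/N of internal image calls fail until an operator edits
        # nova.conf and restarts the service.
        return random.choice(API_SERVERS)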
>>>> Now - as is often the case - it turns out the combo of those things is working very well for you, so we need to adjust our thinking on the topic a bit. Let me toss out some alternatives and see what you think:
>>>>
>>>> Alternative One - Do Both things
>>>>
>>>> We add the new "consume from catalog" behavior and make it the default (and make it default to consuming the internal interface). We have to do that in parallel with the current glance api_servers setting anyway, because of deprecation periods, so the code to support both approaches will exist. Instead of then deprecating the api_servers list, we keep it - but add a big doc warning listing the gotchas and limitations - so for those folks for whom they are not an issue, you've got an out.
>>>>
>>>> Alternative Two - Hybrid Approach - optional list of URLs
>>>>
>>>> We go ahead and move to service config being the standard way one lists how to consume a service from the catalog. One of the standard options for consuming services is "endpoint_override" - which is a way an API user can say "hi, please ignore the catalog and use this endpoint I've given you instead". The endpoint in question is a full URL, so https/http and ports and whatnot are all handled properly.
>>>>
>>>> We add, in addition, an option "endpoint_override_list" which allows you to provide a list of URLs (not API servers), and if you provide that option, we'll keep the logic of choosing one at random at API call time. It's still a poor load balancer, and we'll still put warnings in the docs about it not being a featureful load balancing solution, but again it would be available if needed. (See the config sketch below.)
>>>>
>>>> Alternative Three - We ignore you and give you docs
>>>>
>>>> I'm only including this in the name of completeness. We could write a bunch of docs about a recommended way of putting your internal endpoints behind a load balancer and registering that with the internal endpoint in keystone. (I would prefer to make the operators happy, so let's say whatever vote I have is not for this option.)
>>>>
>>>> Alternative Four - We update client libs to understand multiple values from keystone for endpoints
>>>>
>>>> I _really_ don't like this one, as I think us doing dumb software load balancing client-side is prone to a ton of failures. BUT - right now the assumption when consuming endpoints from the catalog is that one and only one endpoint will be returned for a given service_type/service_name/interface. Rather than special-casing the URL round-robin in nova, we could move that round-robin into the base client library, update API consumption docs with round-robin recommendations, and then have you register the list of endpoints with keystone.
>>>>
>>>> I know the keystone team has long been _very_ against using keystone as a list of all the endpoints, and I agree with them. Putting it here for the sake of argument.
>>>>
>>>> Alternative Five - We update keystone to round-robin lists of endpoints
>>>>
>>>> Potentially even worse than Four, and even more unlikely given the keystone team's feelings, but we could have keystone continue to return only one endpoint, and have it do the round-robin selection at catalog generation time.
>>>>
>>>> Sorry - you caught me in early morning brainstorm mode.
>>>>
>>>> I am neither nova core nor keystone core. BUT:
>>>>
>>>> I think honestly if adding a load balancer in front of your internal endpoints is an undue burden and/or the usefulness of the lists outweighs the limitations they have, we should go with One or Two. (I think three through five are all terrible.)
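For concreteness, a sketch of what Alternative Two might look like in nova.conf. endpoint_override is an existing keystoneauth option; endpoint_override_list is the hypothetical new option described above, and the hostnames are illustrative:

    [glance]
    # Standard single-URL override (a full URL, so https and ports work):
    #endpoint_override = https://glance-internal.example.com:9292

    # Hypothetical list form; one URL chosen at random per API call:
    endpoint_override_list = https://glance1:9292,https://glance2:9292,https://glance3:9292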
>>>> My personal preference would be for Two - the round-robin code winds up being the same logic in both cases, but at least in Two folks who want to SSL all the way _can_, and it shouldn't be an undue extra burden on those of you using api_servers now. We also don't have to do the funky things we currently have to do to turn the api_servers list into workable URLs.
>>>>
>>>> On 04/27/2017 11:50 PM, Blair Bethwaite wrote:
>>>> > We at Nectar are in the same boat as Mike. Our use-case is a little bit more about geo-distributed operations though - our Cells are in different States around the country, so the local glance-apis are particularly important for caching popular images close to the nova-computes. We consider these glance-apis as part of the underlying cloud infra rather than user-facing, so I think we'd prefer not to see them in the service catalog returned to users either... is there going to be a (standard) way to hide them?
>>>> >
>>>> > On 28 April 2017 at 09:15, Mike Dorman <mdor...@godaddy.com> wrote:
>>>> >> We make extensive use of the [glance]/api_servers list. We configure that on hypervisors to direct them to Glance servers which are more "local" network-wise (in order to reduce network traffic across security zones/firewalls/etc.). This way nova-compute can fail over in case one of the Glance servers in the list is down, without putting them behind a load balancer. We also don't run https for these "internal" Glance calls, to save the overhead when transferring images.
>>>> >>
>>>> >> End-user calls to Glance DO go through a real load balancer and then are distributed out to the Glance servers on the backend. From the end-user's perspective, I totally agree there should be one, and only one, URL.
>>>> >>
>>>> >> However, we would be disappointed to see the change you're suggesting implemented. We would lose the redundancy we get now by providing a list. Or we would have to shunt all the calls through the user-facing endpoint, which would generate a lot of extra traffic (in places where we don't want it) for image transfers.
>>>> >>
>>>> >> Thanks,
>>>> >> Mike
>>>> >>
>>>> >> On 4/27/17, 4:02 PM, "Matt Riedemann" <mriede...@gmail.com> wrote:
>>>> >>
>>>> >> On 4/27/2017 4:52 PM, Eric Fried wrote:
>>>> >> > Y'all-
>>>> >> >
>>>> >> > TL;DR: Does glance ever really need/use multiple endpoint URLs?
>>>> >> >
>>>> >> > I'm working on bp use-service-catalog-for-endpoints[1], which intends to deprecate disparate conf options in various groups, and centralize acquisition of service endpoint URLs. The idea is to introduce nova.utils.get_service_url(group) -- note singular 'url'.
>>>> >> >
>>>> >> > One affected conf option is [glance]api_servers[2], which currently accepts a *list* of endpoint URLs. The new API will only ever return *one*.
>>>> >> >
>>>> >> > Thus, as planned, this blueprint will have the side effect of deprecating support for multiple glance endpoint URLs in Pike, and removing said support in Queens.
>>>> >> >
>>>> >> > Some have asserted that there should only ever be one endpoint URL for a given service_type/interface combo[3].
>>>> >> > I'm fine with that - it simplifies things quite a bit for the bp impl - but wanted to make sure there were no loudly-dissenting opinions before we get too far down this path.
>>>> >> >
>>>> >> > [1] https://blueprints.launchpad.net/nova/+spec/use-service-catalog-for-endpoints
>>>> >> > [2] https://github.com/openstack/nova/blob/7e7bdb198ed6412273e22dea72e37a6371fce8bd/nova/conf/glance.py#L27-L37
>>>> >> > [3] http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2017-04-27.log.html#t2017-04-27T20:38:29
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Eric Fried (efried)
>>>> >>
>>>> >> +openstack-operators
>>>> >>
>>>> >> --
>>>> >>
>>>> >> Thanks,
>>>> >>
>>>> >> Matt
>>
>> --
>> @flaper87
>> Flavio Percoco

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev