On 06/14/2016 05:42 PM, Doug Hellmann wrote:
> Excerpts from Matthew Treinish's message of 2016-06-14 15:12:45 -0400:
>> On Tue, Jun 14, 2016 at 02:41:10PM -0400, Doug Hellmann wrote:
>>> Excerpts from Matthew Treinish's message of 2016-06-14 14:21:27 -0400:
>>>> On Tue, Jun 14, 2016 at 10:57:05AM -0700, Chris Hoge wrote:
>>>>> Last year, in response to Nova micro-versioning and extension updates[1],
>>>>> the QA team added strict API schema checking to Tempest to ensure that
>>>>> no additional properties were added to Nova API responses[2][3]. In the
>>>>> last year, at least three vendors participating in the OpenStack Powered
>>>>> Trademark program have been impacted by this change, two of which
>>>>> reported this to the DefCore Working Group mailing list earlier this
>>>>> year[4].
>>>>>
>>>>> The DefCore Working Group determines guidelines for the OpenStack Powered
>>>>> program, which includes capabilities with associated functional tests
>>>>> from Tempest that must be passed, and designated sections with associated
>>>>> upstream code [5][6]. In determining these guidelines, the working group
>>>>> attempts to balance the future direction of development with lagging
>>>>> indicators of deployments and user adoption.
>>>>>
>>>>> After a tremendous amount of consideration, I believe that the DefCore
>>>>> Working Group needs to implement a temporary waiver for the strict API
>>>>> checking requirements that were introduced last year, to give downstream
>>>>> deployers more time to catch up with the strict micro-versioning
>>>>> requirements determined by the Nova/Compute team and enforced by the
>>>>> Tempest/QA team.
>>>>
>>>> I'm very much opposed to this being done. If we're actually concerned with
>>>> interoperability and verifying that things behave in the same manner
>>>> between multiple clouds then doing this would be a big step backwards.
>>>> The fundamental disconnect here is that the vendors who have implemented
>>>> out of band extensions or were taking advantage of previously available
>>>> places to inject extra attributes believe that doing so means they're
>>>> interoperable, which is quite far from reality. **The API is not a place
>>>> for vendor differentiation.**
>>>
>>> This is a temporary measure to address the fact that a large number
>>> of existing tests changed their behavior, rather than having new
>>> tests added to enforce this new requirement. The result is deployments
>>> that previously passed these tests may no longer pass, and in fact
>>> we have several cases where that's true with deployers who are
>>> trying to maintain their own standard of backwards-compatibility
>>> for their end users.
>>
>> That's not what happened though. The API hasn't changed and the tests
>> haven't really changed either. We made our enforcement on Nova's APIs a bit
>> stricter to ensure nothing unexpected appeared. For the most part these
>> tests work on any version of OpenStack. (we only test it in the gate on
>> supported stable releases, but I don't expect things to have drastically
>> shifted on older releases) It also doesn't matter which version of the API
>> you run, v2.0 or v2.1. Literally, the only case it ever fails is when you
>> run something extra, not from the community, either as an extension (which
>> themselves are going away [1]) or another service that wraps nova or
>> imitates nova. I'm personally not comfortable saying those extras are ever
>> part of the OpenStack APIs.
>>
>>> We have basically three options.
>>>
>>> 1. Tell deployers who are trying to do the right thing for their immediate
>>>    users that they can't use the trademark.
>>>
>>> 2. Flag the related tests or remove them from the DefCore enforcement
>>>    suite entirely.
>>>
>>> 3. Be flexible about giving consumers of Tempest time to meet the
>>>    new requirement by providing a way to disable the checks.
>>>
>>> Option 1 goes against our own backwards compatibility policies.
>>
>> I don't think backwards compatibility policies really apply to what we
>> define as the set of tests that as a community we are saying a vendor has
>> to pass to say they're OpenStack. From my perspective as a community we
>> either take a hard stance on this and say to be considered an interoperable
>> cloud (and to get the trademark) you have to actually have an interoperable
>> product. We slowly ratchet up the requirements every 6 months, there isn't
>> any implied backwards compatibility in doing that. You passed in the past
>> but not in the newer stricter guidelines.
>>
>> Also, even if I did think it applied, we're not talking about a change which
>> would fall into breaking that. The change was introduced a year and a half
>> ago during kilo and landed a year ago during liberty:
>>
>> https://review.openstack.org/#/c/156130/
>>
>> That's way longer than our normal deprecation period of 3 months and a
>> release boundary.
>>
>>>
>>> Option 2 gives us no winners and actually reduces the interoperability
>>> guarantees we already have in place.
>>>
>>> Option 3 applies our usual community standard of slowly rolling
>>> forward while maintaining compatibility as broadly as possible.
>>
>> Except in this case there isn't actually any compatibility being maintained.
>> We're saying that we can't make the requirements for interoperability
>> testing stricter until all the vendors who were passing in the past are able
>> to pass the stricter version.
>>
>>>
>>> No one is suggesting that a permanent, or even open-ended, exception
>>> be granted.
>>
>> Sure, I agree a permanent or open-ended exception would be even worse. But,
>> I still think as a community we need to draw a hard line in the sand here.
>> Just because this measure is temporary doesn't make it any more palatable.
>>
>> By doing this, even as a temporary measure, we're saying it's ok to call
>> things an OpenStack API when you add random gorp to the responses. Which is
>> something we've very clearly said as a community is the exact opposite of
>> the case, which the testing reflects. I still contend just because some
>> vendors were running old versions of tempest and old versions of openstack
>> where their incompatible API changes weren't caught doesn't mean they should
>> be given a pass now.
>
> Nobody is saying random gorp is OK, and I'm not sure "line in the
> sand" rhetoric is really constructive. The issue is not with the
> nature of the API policies, it's with the implementation of those
> policies and how they were rolled out.
>
> DefCore defines its rules using named tests in Tempest. If these
> new enforcement policies had been applied by adding new tests to
> Tempest, then DefCore could have added them using its processes
> over a period of time and we wouldn't have had any issues. That's
> not what happened. Instead, the behavior of a bunch of *existing*
> tests changed. As a result, deployments that have not changed fail
> tests that they used to pass, without any action being taken on the
> deployer's part. We've moved the goal posts on our users in a way
> that was not easily discoverable, because it couldn't be tracked
> through the (admittedly limited) process we have in place for doing
> that tracking.
>
> So, we want a way to get the test results back to their existing
> status, which will then let us roll adoption forward smoothly instead
> of lurching from "pass" to "fail" to "pass".

I think this is the most important thing to me as it relates to this.
I'm obviously a huge proponent of clouds behaving more samely. But I
also think that, as Doug nicely describes above, we've sort of backed
into removing something without a deprecation window ... largely
because of the complexities involved with the system here - and I'd like
to make sure that when we are being clear about behavior changes we
give the warning period so that people can adapt.

> We should, separately, address the process issues and the limitations
> this situation has exposed. That may mean changing the way DefCore
> defines its policies, or tracks things, or uses Tempest. For
> example, in the future, we may want to tie versions of Tempest to
> versions of the trademark more closely, so that it's possible for
> someone running the Mitaka version of OpenStack to continue to use
> the Mitaka version of Tempest and not have to upgrade Tempest in
> order to retain their trademark (maybe that's how it already works?).
> We may also need to consider that test implementation details may
> change, and have a review process within DefCore to help expose
> those changes to make them clearer to deployers.
>
> Fixing the process issue may also mean changing the way we implement
> things in Tempest. In this case, adding a flag helps move ahead
> more smoothly. Perhaps we adopt that as a general policy in the
> future when we make underlying behavioral changes like this to
> existing tests. Perhaps instead we have a policy that we do not
> change the behavior of existing tests in such significant ways, at
> least if they're tagged as being used by DefCore. I don't know --
> those are things we need to discuss.

++

>>
>> -Matt Treinish
>>
>> [1] http://lists.openstack.org/pipermail/openstack-dev/2016-June/097285.html
>>>
>>> Doug
>>>
>>>>
>>>> As a user of several clouds myself I can say that having random gorp in a
>>>> response makes it much more difficult to use my code against multiple
>>>> clouds. I have to determine which properties being returned are specific
>>>> to that vendor's cloud and if I actually need to depend on them for
>>>> anything it makes whatever code I'm writing incompatible for using against
>>>> any other cloud. (unless I special case that block for each cloud) Sean
>>>> Dague wrote a good post where a lot of this was covered a year ago when
>>>> microversions were starting to pick up steam:
>>>>
>>>> https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2
>>>>
>>>> I'd recommend giving it a read, he explains the user first perspective
>>>> more clearly there.
>>>>
>>>> I believe Tempest in this case is doing the right thing from an
>>>> interoperability perspective and ensuring that the API is actually the
>>>> API. Not an API with extra bits a vendor decided to add. I don't think a
>>>> cloud or product that does this to the api should be considered an
>>>> interoperable OpenStack cloud and failing the tests is the correct
>>>> behavior.
>>>>
>>>> -Matt Treinish
>>>>
>>>>>
>>>>> My reasoning behind this is that while the change that enabled strict
>>>>> checking was discussed publicly in the developer community and took
>>>>> some time to be implemented, it still landed quickly and broke several
>>>>> existing deployments overnight. As Tempest has moved forward with
>>>>> bug and UX fixes (some in part to support the interoperability testing
>>>>> efforts of the DefCore Working Group), using an older version of Tempest
>>>>> where this strict checking is not enforced is no longer a viable solution
>>>>> for downstream deployers.
>>>>> The TC has passed a resolution to advise DefCore to use Tempest as the
>>>>> single source of capability testing[7], but this naturally introduces
>>>>> tension between the competing goals of maintaining upstream functional
>>>>> testing and also tracking lagging indicators.
>>>>>
>>>>> My proposal for addressing this problem approaches it at two levels:
>>>>>
>>>>> * For the short term, I will submit a blueprint and patch to tempest that
>>>>>   allows configuration of a grey-list of Nova APIs where strict response
>>>>>   checking on additional properties will be disabled. So, for example,
>>>>>   if the 'create servers' API call returned extra properties on that
>>>>>   call, the strict checking on this line[8] would be disabled at runtime.
>>>>>   Use of this code path will emit a deprecation warning, and the
>>>>>   code will be scheduled for removal in 2017 directly after the release
>>>>>   of the 2017.01 guideline. Vendors would be required to submit the
>>>>>   grey-list of APIs with additional response data that would be
>>>>>   published to their marketplace entry.
>>>>>
>>>>> * Longer term, vendors will be expected to work with upstream to update
>>>>>   the API for returning additional data that is compatible with
>>>>>   API micro-versioning as defined by the Nova team, and the waiver would
>>>>>   no longer be allowed after the release of the 2017.01 guideline.
>>>>>
>>>>> For the next half-year, I feel that this approach strengthens
>>>>> interoperability by accurately capturing the current state of OpenStack
>>>>> deployments and client tools. Before this change, additional properties
>>>>> on responses weren't explicitly disallowed, and vendors and deployers took
>>>>> advantage of this in production. While this is behavior that the Nova and
>>>>> QA teams want to stop, it will take a bit more time to reach downstream.
>>>>> Also, as of right now, as far as I know the only client that does strict
>>>>> response checking for Nova responses is the Tempest client. Currently,
>>>>> additional properties in responses are ignored and do not break existing
>>>>> client functionality. There is currently little to no harm done to
>>>>> downstream users by temporarily allowing additional data to be returned
>>>>> in responses.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Chris Hoge
>>>>> Interop Engineer
>>>>> OpenStack Foundation
>>>>>
>>>>> [1] https://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/api-microversions.html
>>>>> [2] http://lists.openstack.org/pipermail/openstack-dev/2015-February/057613.html
>>>>> [3] https://review.openstack.org/#/c/156130
>>>>> [4] http://lists.openstack.org/pipermail/defcore-committee/2016-January/000986.html
>>>>> [5] http://git.openstack.org/cgit/openstack/defcore/tree/2015.07.json
>>>>> [6] http://git.openstack.org/cgit/openstack/defcore/tree/2016.01.json
>>>>> [7] http://git.openstack.org/cgit/openstack/governance/tree/resolutions/20160504-defcore-test-location.rst
>>>>> [8] http://git.openstack.org/cgit/openstack/tempest-lib/tree/tempest_lib/api_schema/response/compute/v2_1/servers.py#n39
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
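
For anyone trying to picture the short-term proposal quoted above, here is
a rough sketch of how a grey-list could relax the "no additional
properties" check for selected API calls while still emitting a
deprecation warning. This is not Tempest code: the function name
check_response, the SERVER_SCHEMA stand-in, the GREY_LIST contents and the
"vendor:rack" property are all hypothetical, and the schema is a heavily
simplified placeholder rather than the real one referenced in [8].

```python
import warnings

# Hypothetical, simplified stand-in for a response schema that forbids
# properties beyond the ones it lists. Not the real Tempest schema.
SERVER_SCHEMA = {
    "properties": {"id": {}, "links": {}, "adminPass": {}},
    "additionalProperties": False,
}

# Hypothetical grey-list: API names whose extra properties are tolerated.
GREY_LIST = {"create_server"}


def check_response(api_name, body, schema, grey_list=frozenset()):
    """Return the set of unexpected top-level properties in body.

    Raises ValueError when extras are present and the API is not
    grey-listed. Emits a DeprecationWarning when extras are tolerated
    because the API is grey-listed.
    """
    # A schema that allows additional properties has nothing to enforce.
    if schema.get("additionalProperties", True):
        return set()

    extras = set(body) - set(schema["properties"])
    if not extras:
        return extras

    if api_name in grey_list:
        warnings.warn(
            f"{api_name}: ignoring extra properties {sorted(extras)}; "
            "this waiver is temporary",
            DeprecationWarning,
            stacklevel=2,
        )
        return extras

    raise ValueError(f"{api_name}: unexpected properties {sorted(extras)}")
```

With this sketch, a 'create_server' response carrying a vendor-specific
property is accepted with a warning, while the same extra property on a
call that is not grey-listed still fails. The returned set of extras is the
kind of data a vendor could report alongside their grey-list.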