2014-11-20 20:05, Neil Horman:
> On Thu, Nov 20, 2014 at 10:08:25PM +0100, Thomas Monjalon wrote:
> > 2014-11-20 13:25, Neil Horman:
> > > On Thu, Nov 20, 2014 at 06:09:10PM +0100, Thomas Monjalon wrote:
> > > > 2014-11-19 10:13, Neil Horman:
> > > > > On Wed, Nov 19, 2014 at 11:35:08AM +0000, Bruce Richardson wrote:
> > > > > > On Wed, Nov 19, 2014 at 12:22:14PM +0100, Thomas Monjalon wrote:
> > > > > > > Following the discussion we had with Neil during the conference
> > > > > > > call, I suggest this plan, starting with the next release (2.0):
> > > > > > > - add version numbers to libraries
> > > > > > > - add version numbers to functions inside .map files
> > > > > > > - create a git tree dedicated to maintenance and API
> > > > > > >   compatibility
> > > > > > >
> > > > > > > It means these version numbers must be incremented when breaking
> > > > > > > the API. The old code paths, though, would be maintained and
> > > > > > > tested separately by volunteers.
> > > > > > > A mailing list for maintenance purposes could be created if
> > > > > > > needed.
> > > > > > >
> > > > > > Hi Thomas,
> > > > > >
> > > > > > I really think that the versioning is best handled inside the
> > > > > > main repository itself. Given that the proposed deprecation policy
> > > > > > spans two releases, i.e. an API is marked deprecated in release X
> > > > > > and then removed in X+1, I don't see maintaining the old code
> > > > > > paths as particularly onerous.
> > > > > >
> > > > > > /Bruce
> > > > >
> > > > > I agree with Bruce. Even if it is on occasion an added workload,
> > > > > it's not the sort of thing that can or should be placed on an
> > > > > alternate developer. Backwards compatibility is the sort of thing
> > > > > that has to be on the mind of the developer when modifying an API,
> > > > > and on the mind of the reviewer when reviewing code. To shunt that
> > > > > responsibility elsewhere invites backwards compatibility to become
> > > > > a second-class citizen whose goal will never be reached, because
> > > > > developers instituting ABI changes will never care about the
> > > > > consequences, and anyone worrying about backwards compatibility
> > > > > will always be playing catch-up, possibly allowing ABI breaks to
> > > > > slip through.
> > > > >
> > > > > Neil
> > > >
> > > > Before taking a decision, we should detail every concern.
> > > >
> > > > 1/
> > > > Currently there is not a lot of API refactoring because DPDK is well
> > > > tailored for x86 and Intel NICs. But we are seeing that supporting
> > > > new CPUs and new NICs would require some adaptations.
> > > >
> > > Yes, you're absolutely right here. As I noted during my presentation,
> > > I had hoped that this would happen occasionally and that we would need
> > > to deal with it. What I think you are implying here (correct me if I'm
> > > wrong) is that you would advocate waiting to introduce ABI versioning
> > > until such refactoring is, for lack of a better term, "complete". The
> > > problem here is that software with a growing user base is never
> > > "complete". What you are effectively saying is that you want to wait
> > > until the API is in a state in which no (or almost no) more changes
> > > are required, and then freeze it. That's quite simply never going to
> > > happen, and if it did, it would obviate the need for versioning at
> > > all.
> >
> > I agree, Neil. This point is not about how long we should wait but how
> > the overhead could be estimated for coming releases.
> >
> Well, I understand the desire, but I'm not sure how it can be
> accomplished. For a given release, the overhead will depend on two
> factors:
>
> 1) the number of ABI changes in a given release;
>
> 2) the extent of the ABI changes that were made.
>
> If we have a way to predict those, then we can estimate the overhead, but
> without that information, you're kind of stuck. That said, if we all
> concur that this is a necessary effort to undertake, then the overhead is
> not overly important. What's more important is allotting enough time to
> do the work for a given project. That is to say, when undertaking a large
> refactoring, or another project that promises to make significant ABI
> changes, the developer needs to factor in time to design and implement
> backwards compatibility. Put another way, if the developer does their job
> right and takes backwards compatibility seriously, the overhead to you as
> a maintainer is nil. The onus to handle this extra effort needs to be on
> the developer.
>
> > > > 2/
> > > > I'm curious to know how you would handle a big change like the recent
> > > > mbuf rework.
> > > > Should we duplicate the structure and all the functions using mbuf?
> > >
> > > Several ways. What you suggest above is one way, although that is
> > > what I would consider the worst case. Ideally such large changes are
> > > extremely rare (a search of the git history, I think, confirms this).
> > > Much more common are small, limited changes to various APIs, for which
> > > providing multiple versions of a function is a much more reasonable
> > > approach.
> > >
> > > In the event that we do decide to do a refactor so far-reaching that
> > > we simply don't feel multi-versioning is feasible, the recourse is to
> > > deprecate the old API, publish that information in the deprecation
> > > schedule, wait for a release, then replace it wholesale. When the new
> > > API is released, we bump the DSO version number. Note that the
> > > versioning policy never guarantees that backwards compatibility will
> > > always be available, nor does it stipulate that a newer version of the
> > > API is available prior to removing the old one. The goal here is to
> > > give distributors and application vendors advance notice of
> > > ABI-breaking changes so that they can adapt appropriately before they
> > > are caught off guard. If the new ABI can't be packaged alongside the
> > > old, then so be it; downstream vendors will have to use the upstream
> > > git head to test and validate, rather than a newer distribution
> > > release.
> >
> > Seems reasonable.
> >
> > > Ideally, though, that shouldn't happen, because it causes downstream
> > > headaches, and we would really like to avoid that. That's why I feel
> > > it's so important to keep this work in the main tree. If we segregate
> > > it to a separate location, it will make it all too easy for developers
> > > to ignore these needs and just assume we constantly drop old ABI
> > > versions without providing backwards compatibility.
> > >
> > > > 3/
> > > > Should we add new fields at the end of a structure to avoid ABI
> > > > breakage?
> > > >
> > > In the common case yes, this usually avoids ABI breakage, though it
> > > can't always be relied upon (e.g. in cases where structures are
> > > statically allocated by an application). And then there are patches
> > > that attempt to reduce memory usage and increase performance by
> > > re-arranging structures. In those cases we need to do ABI versioning,
> > > or announce/delay/release as noted above, though again, that should
> > > really be avoided if possible.
> >
> > So there is no hope of keeping fields logically sorted.
> > Not a major problem, but we have to be aware of it. And it should
> > probably be documented if we choose this way.
> >
> Sure, though I'm not sure I agree with the statement above. Having
> fields logically sorted seems like it should be a foregone conclusion,
> in that the developer should have laid those fields out in some
> semblance of order in the first place. If a large data-structure
> re-ordering is taking place, such that structure fields are getting
> rearranged, that in my mind is part of a large refactoring, for which
> the entire API affected by those data structures must get a new version
> to provide backward compatibility; or, in the extreme case, we may need
> to perform a warn-and-deprecate/exchange operation as noted previously,
> though again, that is the nuclear option.
Just to illustrate my thought, let's imagine this struct (types added only
to make the example concrete):

struct dish {
	char fish_name[16];
	char fish_taste[16];
	char vegetables_name[16];
};

When adding the new field "fish_cooking", we'll append it at the end to
avoid an ABI break:

struct dish {
	char fish_name[16];
	char fish_taste[16];
	char vegetables_name[16];
	char fish_cooking[16];
};

So the "fish_*" fields won't be grouped.
It's mostly an aesthetic/readability consequence.
Now I'm hungry ;)
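
To make the ABI consequence concrete, here is a minimal standalone sketch
(my own illustration, not code from the tree; the struct and field names
are hypothetical). Appending keeps the offsets of the pre-existing fields
stable, which is why objects allocated by the library keep working for old
binaries, but sizeof still grows, which is Neil's caveat about statically
allocated structures:

#include <stddef.h>
#include <stdio.h>

struct dish_v1 {                /* layout before the change */
	char fish_name[16];
	char fish_taste[16];
	char vegetables_name[16];
};

struct dish_v2 {                /* v1 plus one field appended */
	char fish_name[16];
	char fish_taste[16];
	char vegetables_name[16];
	char fish_cooking[16];
};

int main(void)
{
	/* Old-field offsets are unchanged, so code compiled against v1
	 * still reads the right bytes from a v2 object allocated by the
	 * library. */
	printf("vegetables_name offset: v1=%zu v2=%zu\n",
	       offsetof(struct dish_v1, vegetables_name),
	       offsetof(struct dish_v2, vegetables_name));

	/* But the size grows, so an application that statically allocated
	 * v1 objects (or arrays of them) is under-sized from the new
	 * library's point of view. */
	printf("size: v1=%zu v2=%zu\n",
	       sizeof(struct dish_v1), sizeof(struct dish_v2));
	return 0;
}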
> > > > 4/
> > > > Developers contribute because they need some changes. So when
> > > > breaking an API, their application is already ready for the new
> > > > version. I mean that the author of such a patch is probably not
> > > > really motivated to keep ABI compatibility and duplicate the code
> > > > path.
> > > >
> > > What? That doesn't make any sense. It's our job to enforce this
> > > requirement on developers during the review cycle. If you don't feel
> > > like we can enforce coding requirements on the project, we've already
> > > lost. I agree that an application developer submitting a patch for
> > > DPDK might not care about ABI compatibility because they've already
> > > modified their application, but they (and we) need to recognize that
> > > there are more than just a handful of users of the DPDK, some of whom
> > > don't participate in this community (i.e. are simply end users). We
> > > need to make sure all users' needs are met. That's the entire point
> > > of this patch series: to make DPDK available to a wider range of
> > > users.
> >
> > Exactly. To make it simple, you care about end users and I have to care
> > about developers' motivation. But I perfectly understand the end users'
> > needs. I don't say we cannot enforce coding requirements. I just think
> > it will be less pleasant.
> >
> I disagree with the assertion that you will lose developers because they
> don't care about compatibility. Your developer base may change. This is
> no different from any other requirement that you place on a developer.
> You make all sorts of mandates regarding development (they can't break
> other, older supported CPU architectures; their code has to compile in
> all configurations; etc.). This is no different.
>
> > > > 5/
> > > > Instead of simply modifying an API function, the change would
> > > > appear as a whole new function with some differences compared to
> > > > the old one. Such a change is really not convenient to review.
> > >
> > > Um, yes. Versioning is the process of creating an additional function
> > > that closely resembles an older version of the same function, but
> > > with different arguments and a newer version number. That's what it
> > > is by definition, and yes, it's additional work. All you're saying
> > > here is that it's extra work and we shouldn't do it. I thought I made
> > > this clear on the call: it's been done in thousands of other
> > > libraries. If you just don't want to do it, then you should abandon
> > > distributions as a way to reach a larger community; but if you want
> > > to see the DPDK reach a larger community, then this is something that
> > > has to happen, hard or not.
> >
> > The goal of this discussion is to establish all the implications of
> > this decision. We are exposing the facts; no conclusions yet.
> >
> You haven't exposed a fact, you've asserted an opinion. There is no
> notion of something being convenient or inconvenient to review in any
> quantitative way. If facts are your goal, you missed the mark here.
Maybe you use a tool that I don't know about.
My main material for review is the patch. And I think it's simpler to check
a one-line change than a duplicated code path. But instead of giving my
opinion, I should show what it looks like with a simple example:
- void cook_fish()
+ void cook_fish(oil_bottle)
  {
+     use_oil(oil_bottle);
      start_fire();
      put_fish();
      wait();
      stop_fire();
  }

vs

- void cook_fish()
+ void __vsym cook_fish_v1()
  {
      start_fire();
      put_fish();
      wait();
      stop_fire();
  }
+ VERSION_SYMBOL(cook_fish, _v1, 1);
+
+ void cook_fish(oil_bottle)
+ {
+     use_oil(oil_bottle);
+     start_fire();
+     put_fish();
+     wait();
+     stop_fire();
+ }
+ BIND_DEFAULT_SYMBOL(cook_fish, 2);
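
For reference, my understanding of why the duplicated path stays cheap at
run time (a hedged sketch: the macro bodies below are my paraphrase of the
proposed rte_compat.h, and the DPDK_1/DPDK_2 version-node names are
assumptions, not something already in the tree):

/*
 * Paraphrased sketch of the versioning macros: they reduce to GNU
 * .symver directives, so both code paths live in the same .so and the
 * dynamic linker picks one of them at bind time.
 */
#define VERSION_SYMBOL(b, e, n) \
	__asm__(".symver " #b #e ", " #b "@DPDK_" #n)
#define BIND_DEFAULT_SYMBOL(b, n) \
	__asm__(".symver " #b ", " #b "@@DPDK_" #n)

/*
 * With the example above, binaries linked against the old library keep
 * resolving cook_fish@DPDK_1 to cook_fish_v1(), while newly linked
 * applications bind to the default cook_fish@@DPDK_2. The shared
 * library's .map file then needs matching version nodes, roughly:
 *
 *   DPDK_1 { global: cook_fish; local: *; };
 *   DPDK_2 { global: cook_fish; } DPDK_1;
 */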
> > > > 6/
> > > > Testing ABI compatibility could be tricky. We would need a tool to
> > > > check that it's mostly OK. The right place for such a tool is
> > > > app/test, which was designed to hold the unit tests of the API.
> > >
> > > That seems like a reasonable idea, but I'm not sure what the concern
> > > is. Are you saying that you need to test every old version of the
> > > ABI? That's fine. I really don't think it has to be as stringent as
> > > the latest-version testing, but if you want to, it should be as easy
> > > as building the latest release of the DPDK libraries and the previous
> > > version of the test application. That will force the previous-version
> > > code paths in the new library to be used by the test app and, if the
> > > tests fully exercise the API, you should get pretty good coverage.
> >
> > Yes, it will provide a unit test to developers.
> >
> > > > 7/
> > > > This system would allow application developers to upgrade DPDK to
> > > > n+1 without rebuilding. But when upgrading to n+2, they should have
> > > > adapted their application to comply with the n+1 API (because n
> > > > will be removed).
> > >
> > > Only assuming that the old ABI facet was deprecated at the same time
> > > the new ABI was introduced. There's nothing that says we have to do
> > > that, but I digress.
> > >
> > > > So this solution offers a delay between the upgrade decision and
> > > > the application work. Note that they could prepare their
> > > > application before upgrading. In any case, an upgrade should be
> > > > tested before being deployed effectively. The behaviour of the
> > > > application could change and require some adaptations.
> > > >
> > > Um, yes. What's the concern here?
> >
> > I'm just trying to figure out which workflows are eased by progressive
> > ABI deprecation.
> >
> The workflow for end users: they are given an alert prior to a breaking
> change, and the time to fix it, in a way that distributions can manage
> without having to undertake that effort individually (as distributions)
> on their own, and in a way that might one day provide for multi-version
> compatibility.
>
> > > Downstream application developers need two things:
> > >
> > > A) the ability to note that ABI changes are coming, so that they can
> > > adapt to the new version;
> > >
> > > B) time to do so.
> > >
> > > The deprecation policy, if properly relayed by distributions,
> > > provides (A), and the ABI versioning provides (B). I.e., they can get
> > > all the latest bug fixes and enhancements while in parallel adapting
> > > to the coming new version. Note that ideally this will happen rarely,
> > > as having to constantly rebuild/adapt does not sit well with
> > > application vendors who choose to go through distributions, but we'll
> > > do the best we can.
> >
> > It's an interesting point. In a long-term distribution model like
> > RHEL, do you plan to upgrade DPDK at each new release?
> >
> Given that you intermix hardware support with bug fixes and new features
> (which, granted, is not uncommon), yes, I don't see any way to avoid
> doing so. We could of course cherry-pick bug fixes and non-ABI-breaking
> features to preserve compatibility, but doing so diverges from upstream
> quickly, to the point that it becomes extremely difficult to maintain.
> As an example, the one project that Red Hat does this on routinely is
> the kernel, and to do so it employs a staff of hundreds of engineers. No
> distribution wants to do that for every user-space library that they
> support. They/we are willing to do minor fixes in a given release with
> the foreknowledge that we can drop them when the next release comes out,
> but beyond that, the logistics just don't scale.
>
> > > > 8/
> > > > How to handle a change in the implementation of a PMD which
> > > > severely impacts the application? Example: an ol_flag was misused
> > > > and the application has a workaround to interpret this flag, but
> > > > the workaround is now incompatible with the fix.
> > > >
> > > We run into this sometimes in Fedora and RHEL, and it doesn't require
> > > versioning. The problem you describe is one in which an application
> > > has come to rely on something internal to the library. Fixing the bug
> > > isn't typically considered within the purview of versioning, because
> > > you're not changing the ABI; you're just correcting a bug in the
> > > PMD's behavior. Customers who ask for the behavior to remain
> > > unchanged are asking for what's commonly referred to as "bug-for-bug
> > > compatibility", and in those cases the application vendor needs to
> > > release a corresponding fix. Developers can't be required to preserve
> > > buggy behavior.
> > >
> > > It should also be noted that in this case the ABI never changed. All
> > > the data types/sizes/locations/etc. have remained unchanged. It's
> > > just a bug in the interpretation of data passed across the ABI. As
> > > such, there's nothing for ABI versioning to do here.
> >
> > OK, that's what I thought.
> >
> > > > 9/
> > > > When we don't want to adapt an application, it means the
> > > > development is finished and we don't care about the new features of
> > > > the library. So I wonder if it wouldn't be more appropriate to
> > > > provide stable releases with true maintenance to such users. I
> > > > understood that is what Red Hat provides to its customers.
> > > >
> > > No, that's incorrect; we frequently update packages to the latest
> > > upstream version when at all possible. We are able to do this
> > > specifically because upstream library releases provide ABI
> > > versioning, so that we can update with confidence. If they don't do
> > > that, then yes, we are often restricted to selecting a release and
> > > maintaining it for the duration of a major RHEL release, which
> > > implies that security and feature updates are extremely limited.
> > >
> > > That said, if you wanted to do ongoing maintenance on each release, I
> > > suppose you could. In fact it's somewhat similar to the -stable
> > > series that the kernel uses, except that the kernel enjoys an
> > > extremely stable user-space ABI, and even then the kernel -stable
> > > series doesn't take internal-ABI-changing patches, so there's a lot
> > > of divergence. You don't currently have that stable ABI interface,
> > > and so I think you'll find that doing this is way more work than just
> > > supporting versioning.
> > >
> > > To illustrate, let's say you want to support the latest three
> > > releases of the DPDK with maintenance patches. To do this, for every
> > > patch that is posted to the DPDK that is a bug fix, you will have to
> > > apply it four times: once to the git head, and again to each of the
> > > three releases that you are maintaining. The patch will of course
> > > apply cleanly to the git head, as that's what the developer wrote it
> > > against, but the other three releases have every opportunity to
> > > conflict with code introduced in the git head that couldn't be taken
> > > into the maintenance releases. Fixing those up is work that you will
> > > either have to do yourself or request that the patch author do. And
> > > for this work you will provide distributions with about two years of
> > > ABI stability (presuming an ~8-month release cycle), after which they
> > > are back to just living with whatever they stabilized on until the
> > > next major release (note that a single RHEL major release has a
> > > 10+ year life cycle). I would personally rather avoid that work and
> > > just do the ABI compatibility, as those patches are far fewer in
> > > number and buy more for the effort.
> >
> > Interesting point of view.
> > Note that there is no plan to maintain stable versions on dpdk.org.
> > But if some volunteers absolutely want to do it (even after reading
> > your comment), we cannot forbid it.
> >
> Certainly, and as I noted, the kernel does that. But given the rate of
> change that the DPDK undergoes, and the current size of the community, I
> don't think anyone is going to step up to do that work. That's really
> the underlying problem here: you can solve this problem lots of ways if
> you have enough manpower, but given the resources at hand, doing
> versioning in the master tree is really the only viable solution.
>
> > > > Hope this discussion will bring a clear idea of what should be done
> > > > and with which implications.
> > > > Thanks
> > Thanks again
I think my concerns are now well explained.
It was important to expose clearly what such an ABI policy means.
If nobody disagrees with your approach, it should be accepted.
Thanks
--
Thomas