Very valid point. Endianness is only aspect. Accidental size changes on the wire is different.
But this drifted very far astray from my original question: if we keep shuffling the bits on wire regardless, how much of a value does a meticulous CRC tracking process provides ? i see it as actively harmful in that case, since it creates false expectations. --a > On 16 May 2020, at 14:02, Christian Hopps <cho...@chopps.org> wrote: > > As a user of these APIs I would hugely appreciate work be done on the API > framework to make it more usable (don't require callbacks) fix the automatic > endian conversion it already generates, improve support for argument > additions (these are easy to support in a backward compatible way), and maybe > arg size changes, into non-issues, rather than punting on this particular > endian aspect and thinking everything is all-good. > > Thanks, > Chris. > >> On May 16, 2020, at 7:49 AM, Andrew Yourtchenko <ayour...@gmail.com> wrote: >> >> I was hesitating to write exactly that - so at least I am not alone in >> thinking this... :-) >> >> With my RelMgr hat on I would be seeing this as to have a -2’d patch in >> gerrit whose existence I can announce in 20.05 release notes, which would >> apply to both stable/2005 branch and to master branch, with an optional >> add-on patch that was master specific - so that the change can be >> concentrated. Probably even a separate make verify job(s) that would >> cherrypick this on top of whatever patch is being verified - such that we >> have some assurance that things don’t break. >> >> This patch would add a single boolean “api is big endian”, and create >> api_htonX/api_ntohX versions of the endianness functions which will be >> no-ops if that boolean is false. >> >> This patch would also add a new api call “is_vpp_api_is_big_endian”, which >> would return a the value of the above Boolean. >> >> Its presence means that the whatever abstraction layer is there needs to >> query it and to if it returns true, to make htonX()/ntohX calls in the APIs >> to be a a conditional no-op in the same fashion as in VPP. >> >> This will make for a backwards compatible change that can even be even >> backported to earlier branches if needs to... >> >> Sometime in July this could be going into master. >> >> Then that change goes into 2009, with the default still being big-endian, as >> well as is committed to 2005 (it’s an api addition that doesn’t break >> anything else by default, and by then the folks must have seen it on master). >> >> Then somewhere before 2101, say in November, we flip the default. >> >> This gives two releases to iron out the kinks, and 2109 LTS ships with the >> default little endian API. >> >> 2201 then removes the wrappers and makes is_vpp_api_is_big_endian always >> return 0. >> >> Granted, it’s a long roadmap, but it’s not exactly a simple thing that we’d >> be doing and doing so in a graceful fashion will allow to avoid annoying the >> downstream users.. >> >> Also of course this implies that we MUST start having a longer term road map >> for strategic stuff like this, that should be published and vetted by the >> community. >> >> Thoughts ? >> >> --a >> >>>> On 16 May 2020, at 13:22, Damjan Marion <dmar...@me.com> wrote: >>> >>> >>> Knowing that even hard-core big-endian systems like PowerPC decided to >>> switch to Little endian and that situation where binary api will be >>> exchanged between BE and LE systems is extremely small, maybe we should >>> consider removing endian conversion from APIs and simply represent data >>> in the native form. Looks like this is fertile ground for new bugs… >>> >>> Thoughts? >>> >>> — >>> Damjan >>> >>> >>>> On 15 May 2020, at 16:20, Andrew Yourtchenko <ayour...@gmail.com> wrote: >>>> >>>> There's a very interesting couple of gerrit changes that just came in >>>> today that is worth discussing, >>>> and they are a perfect case study to further clarify the process - so >>>> I tweaked the subject accordingly.. >>>> The API message itself is relatively minor, but regardless of what is >>>> agreed that makes a good case study. >>>> >>>> Backstory: >>>> >>>> Once upon a time on Aug 20 2019, commit 053204ab changed >>>> sw_interface_set_rx_mode.mode from u8 to an enum, but an htonl >>>> conversion >>>> function didn't make it there (enums are u32 by default as far as I can >>>> see). >>>> >>>> This was after the 19.08 branch pull, and it wasn't ever tackled, so >>>> this (buggy) behavior ended being in 20.01, 20.01.1, and in the >>>> current 20.05-rc1. >>>> >>>> Fast forward a bit, today I was pinged about the two changes - one for >>>> master, one for stable/2001: >>>> >>>> https://gerrit.fd.io/r/c/vpp/+/26879 - in master - forces the enum to be a >>>> u8 >>>> >>>> https://gerrit.fd.io/r/c/vpp/+/26915 - in stable/2001 - adds the >>>> htonl() call and changes the existing ("buggy") >>>> behavior in the 20.01.2 - thus would silently break any API consumers >>>> that coded against the previous "buggy" behavior. >>>> >>>> And then we have a question open about stable/2005, which "by the >>>> letter" can potentially accept only the second approach, since it is >>>> past the API freeze. >>>> >>>> Additional bit: this API is not touched in make test, so this bug had >>>> slipped through. >>>> >>>> So there are the following options: >>>> >>>> 1) Merge both patches, and treat the 20.05 similar to 20.01, thus >>>> making a "silent change" in both, but making the master look closer to >>>> a 19.08 format. >>>> >>>> 2) Leave the 20.05 and 20.01 alone with the "buggy" behavior, and >>>> merge the master patch that shrinks the enum down to 1 byte >>>> >>>> 3) Merge the 20.01 and cherry-pick it to master and 2005 - fixing the >>>> endianness of the u32 enum everywhere, but making an effective "silent >>>> change" in 20.01&20.05 >>>> >>>> 4) merge the patch in master that shrinks the enum down to one byte, >>>> and cherry-pick it to 20.01 and 20.05 - thus breaking the contract of >>>> "no api changes" but at least this gets to be explicitly visible early >>>> on. >>>> >>>> 5) under the current proposal, if the API message is defined as >>>> "production" then there would need to be a new "in-progress" message >>>> in master with either of the two fixes, the buggy message would need >>>> to be marked as "deprecated". And under the current proposal by the >>>> 20.09 the "in-progress" message would become "production", the current >>>> message would be shown as "deprecated", to be deleted in 21.01. >>>> >>>> So, the feedback that I would appreciate on the above: >>>> >>>> 1) the least worst course of action "right now", for this particular >>>> issue. I discussed this with Ole and we have different opinions, so I >>>> would like the actual API users to chime in. Please tell me which is >>>> the least worst option from your point of view :-) >>>> >>>> 2) What is the best course of action in the future. Note, this is also >>>> the simpler case in that there is a way to trigger a CRC-affecting >>>> change by forcing the enum to be a u8. What would have been the best >>>> course of action if it was simply a missing ntohl() for a field that >>>> noone complained about for 1.5 releases. Can we assume that no-one >>>> used that API and "fix the representation" ? >>>> >>>> 3) would there be an interest in having a sort of registry of "who >>>> wants to know about things related to each API" - so that if a bug >>>> like this arises that requires a behavior change to fix, I could poll >>>> the affected folks and maybe be able to get away with (1) or (3) from >>>> above ? >>>> >>>> And a tweak to the process (and potentially tooling) with regards to >>>> going to "production API status": >>>> >>>> An API that is not touched during a "make test" can not be moved >>>> beyond the "in-progress" status. Adding the unit test implies the >>>> commitment from the API contributor to follow the option (5) above >>>> even for "trivial endianness bugs". >>>> >>>> Thoughts ? >>>> >>>> --a >>>> >>>> >>>> >>>> >>>>> On 5/14/20, Christian Hopps <cho...@chopps.org> wrote: >>>>> API stability is borderline critical for a project like VPP I think. Yes, >>>>> it >>>>> can be used stand-alone, but real value is added by building products on >>>>> top >>>>> of it. >>>>> >>>>> Also important is having an API framework that allows for >>>>> backward-compatible changes to the API for making improvements. >>>>> >>>>> I'm not sure if VPP has the latter, but if there's too much pain with your >>>>> proposed rules I think the latter should addressed rather than sacrificing >>>>> the former. So I support your rules. >>>>> >>>>> Thanks, >>>>> Chris. >>>>> >>>>>> On May 14, 2020, at 8:28 AM, Ole Troan <otr...@employees.org> wrote: >>>>>> >>>>>> Andrew and I have discussed a process around API changes with the goal to >>>>>> limit the impact on VPP consumers. >>>>>> One big painpoint in tracking VPP releases for users of the VPP binary >>>>>> has >>>>>> been API changes. >>>>>> >>>>>> We want to ensure: >>>>>> - A production API never changes. >>>>>> >>>>>> - A production API can be deprecated with one release notice. >>>>>> E.g. show_version_2 is introduced in 20.01 and show_version_1 is marked >>>>>> as >>>>>> deprecated. >>>>>> show_version_1 is marked with "option delete_by = "v20.05"; >>>>>> >>>>>> - APIs marked as experimental / not released can change with no notice. >>>>>> Added support for option status = "in_progress"; >>>>>> >>>>>> This patch: >>>>>> https://gerrit.fd.io/r/c/vpp/+/26881 >>>>>> >>>>>> has a tool "crcchecker" that we intend to run as part of the verify job. >>>>>> It will block modifications to existing production APIs. >>>>>> make checkstyle-api >>>>>> It also supports dumping API manifests and make API comparisions across >>>>>> releases/revisions/commits. >>>>>> >>>>>> A production API is defined by having .api semantic major version > 0. >>>>>> New messages can override that by setting status="in_progress". >>>>>> >>>>>> The consequence of this is that developers must maintain two (or more) >>>>>> versions of an API for some time. >>>>>> This is obviously painful, but at least it aligns cost and benefit better >>>>>> than when we forced the API users to do it. >>>>>> >>>>>> Looking back from 17.01 onwards, it looks like we on average have done >>>>>> about 50 backwards incompatible changes per release. >>>>>> That's ignoring the work on API typing, which have resulted in >>>>>> significantly more changes. >>>>>> >>>>>> The hope is that we can start v20.09 development with this process in >>>>>> place. >>>>>> >>>>>> What do people think? >>>>>> >>>>>> Best regards, >>>>>> Ole >>>>> >>>>> >>>> >>> >> >
-=-=-=-=-=-=-=-=-=-=-=- Links: You receive all messages sent to this group. View/Reply Online (#16429): https://lists.fd.io/g/vpp-dev/message/16429 Mute This Topic: https://lists.fd.io/mt/74228149/21656 Group Owner: vpp-dev+ow...@lists.fd.io Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com] -=-=-=-=-=-=-=-=-=-=-=-