Re: [DISCUSS - NuttX Workflow]

Xiang Xiao Fri, 20 Dec 2019 04:09:20 -0800

I think we can learn good practices from PX4, but we shouldn't adopt
everything without justification.
PX4 has a mature workflow; maybe we can extract the core OS part as
our base to speed up the initial setup.
The important thing now is to define the high-level workflow and vote
on it in the community.
Then we may adapt the workflow from PX4, and even reuse the test
infrastructure if possible.


Thanks
Xiang

On Fri, Dec 20, 2019 at 7:45 PM Alan Carvalho de Assis
<[email protected]> wrote:
>
> Hi David,
>
> On 12/20/19, David Sidrane <[email protected]> wrote:
> > Hi Nathan,
> >
> > On 2019/12/20 02:51:56, Nathan Hartman <[email protected]> wrote:
> >> On Thu, Dec 19, 2019 at 6:24 PM Gregory Nutt <[email protected]> wrote:
> >> > >> ] A bad build system change can cause serious problems for a lot of
> >> > >> people around the world.  A bad change in the core OS can destroy
> >> > >> the good reputation of the OS.
> >> > > Why is this the case? Users should not be using unreleased code or be
> >> > > encouraged to use it. If they are, one solution is to make more
> >> > > frequent releases.
> >> > I don't think that the number of releases is the factor.  It is time in
> >> > people's hands.  Subtle corruption of OS real-time behavior is not
> >> > easily tested.  You normally have to specially instrument the software
> >> > and set up a special test environment, perhaps with a logic analyzer, to
> >> > detect these errors.  Errors in the core OS can persist for months and,
> >> > in at least one case I am aware of, years, until someone sets up the
> >> > correct instrumented test.
> >>
> >> And:
> >>
> >> On Thu, Dec 19, 2019 at 4:20 PM Justin Mclean <[email protected]>
> >> wrote:
> >> > > ] A bad build system change can cause serious problems for a lot of
> >> > > people around the world.  A bad change in the core OS can destroy the
> >> > > good reputation of the OS.
> >> >
> >> > Why is this the case? Users should not be using unreleased code or be
> >> > encouraged to use it. If they are, one solution is to make more
> >> > frequent releases.
> >>
> >> Many users are only using released code. However, whatever is in "master"
> >> eventually gets released. So if problems creep in unnoticed, downstream
> >> users will be affected. It is only delayed.
> >>
> >> I can personally attest that those kinds of errors are extremely
> >> difficult to detect and trace. It does require a special setup with a
> >> logic analyzer or oscilloscope, and sometimes other tools, not to mention
> >> a whole setup to produce the right stimuli, and several pieces of
> >> software that may have to be written specifically for the test....
> >>
> >> I have been wracking my brain on and off thinking about how we could set
> >> up an automated test system to find errors related to timing etc.
> >> Unfortunately, unlike ordinary software for which you can write an
> >> automated test suite, this sort of embedded RTOS will need specialized
> >> hardware to conduct the tests. That's a subject for another thread and I
> >> don't know if now is the time, but I will post my thoughts eventually.
> >>
> >> Nathan
> >>
> >
> > From the proposal
> >
> > "Community
> >
> > NuttX has a large, active community.  Communication is via a Google group at
> > https://groups.google.com/forum/#!forum/nuttx where there are 395 members as
> > of this writing.  Code is currently maintained at Bitbucket.org at
> > https://bitbucket.org/nuttx/.  Other communications are through Bitbucket
> > issues and also via Slack for focused, interactive discussions."
> >
> >
> >> Many users are only using released code.
> >
> > Can we ask the 395 members?
> >
> > I can only share my experience with NuttX since I began working on the
> > project in 2012 for multiple companies.
> >
> > Historically (based on my time on the project) releases were build tested.
> > By this I mean that the configurations were updated and thus created a
> > set of "Build Test Vectors" (BTV). Consider the number of permutations,
> > judging solely by the load time of
> > (http://nuttx.org/doku.php?id=documentation:configvars), which has 95,338
> > CONFIG_* hits. Yes, there are duplicates on the page, and dependencies.
> > This is just meant to give a number of bits....
> >
> > The total space is very large.
> >
> > The BTV coverage of that space was very sparse.
> >
> > IIRC Greg gave the build testing task a day of time. It was repeated after
> > errors were found.  I am not aware of any other testing. Are you?
> >
> > There was no Release Candidate (rc), alpha, or beta test that ran this
> > code on real systems, and very few, if any, Run Test Vectors (RTV) - I
> > have never seen a test report - has anyone?
> >
> > One way to look at this is as Sporadic Integration (SI) with limited BTV
> > and minimal RTV.  Total Test Vector Coverage: TTVC = BTV + RTV.  The ROI
> > of this way of working, from a reliability perspective, was and is very
> > small.
> >
> > A herculean effort on Greg's part with little return: we released code
> > with many significant and critical errors in it. See the ReleaseNotes and
> > the commit log.
> >
> > Over the years Greg referred to TRUNK (yes, it was on SVN) and master as his
> > "own sandbox", stating it should not be considered stable or build-able. This
> > is evident in the commit log.
> >
>
> Please stop focusing on the people (Greg) and let's talk about the workflow.
> We are here to discuss how we can improve the process; we are not
> talking about throwing away the NuttX build system and moving to PX4.
>
> You are picturing something that is not true.
>
> We have issues, as FreeRTOS, Mbed and Zephyr also have. But that is not
> Greg's or the build system's fault.
>
> Please, stop! It is disgusting!
>
> > I have personally never used a release from a tarball. Given the above, why
> > would I? It is less stable than master at TC = N
> > (https://www.electronics-tutorials.ws/rc/rc_1.html), where N is some number
> > of days after a release. Unfortunately, based on the current practices (a
> > very unprofessional workflow), N is also dictated by when apps and nuttx
> > are actually building for a given target's set of BTV.
> >
>
> It is not "unprofessional"; it was what we could do based on our
> hardware limitations.
>
> > With the tools and resources that exist in our work today, quite frankly:
> > this is unacceptable and is an embarrassment.
> >
>
> Oh my Gosh! Please don't do it.
>
>
> > I suspect this is why there is a Tizen. The modern era - gets it.
> > (Disclaimer I am an old dog - I am learning to get it)
> >
>
> Tizen exists because companies want to have control.
> This is the same logic why Redhat and others maintain their own Linux
> kernel by themselves.
>
> > --- Disclaimer ---
> >
> > In the following, I am not bragging about PX4 or selling tools; I am
> > merely trying to share our experiences for the betterment of NuttX.
> >
> > From what I understand PX4 has the most instances of NuttX running on real
> > HW in the world. Over 300K. (I welcome other users to share their numbers)
> >
> > PX4's Total TTVC is still limited, but much, much greater than NuttX.
> >
> > We use continuous integration (CI) on NuttX on PX4 on every commit on PRs.
> >
> >       C/C++ CI / build (push) Successful in 3m
> >       Compile MacOS Pending — This commit is being built
> >       Compile All Boards — This commit looks good
> >       Hardware Test — This commit looks good
> >       SITL Tests — This commit looks good
> >       SITL Tests (code coverage) — This commit looks good
> >       ci/circleci — Your tests passed on CircleCI!
> >       continuous-integration/appveyor/pr — AppVeyor build succeeded
> >       continuous-integration/jenkins/pr-head — This commit looks good
> >
> >
> > We run tests on HW.
> >
> > http://ci.px4.io:8080/blue/organizations/jenkins/PX4_misc%2FFirmware-hardware/detail/pr-mag-str-preflt/1/pipeline
> >
> > I say limited because of the set of arch we use and the way we configure the
> > OS.
> >
> > I believe this to be true of all users.
> >
> > The benefit of a community is that the sum of all TTVC finds the
> > problems and fixes them.
> >
> > Why not maximize TTVC - if it will have a huge ROI and it is free:
> >
> > PX4 will contribute all that we have. We just need to build temporally
> > consistent builds. Yeah, he is on the submodule thing AGAIN :)
> >
>
> Just to make a long story short: we already have solutions for SW and HW CI.
>
> Besides the buildbot (https://buildbot.net) that was implemented and
> tested by Fabio Balzano, Xiaomi also has a build test for NuttX.
>
> At the end of the day, it is not only Greg testing the system; we all are
> testing it as well.
>
> Don't try to push PX4 down our throats; it will not work this way.
> Let's keep to the Apache way: it is a democracy!
>
> BR,
>
> Alan
