Hi Robert, also many thanks from my side for writing all this down! There are in fact some integration tests shipping with CouchDB that use the HTTP API (they can be run via `make javascript`) which need to be transferred to run against the cluster port (instead of the node-local one). I started working on this on my fork, but there's quite some work left.
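To make that a bit more concrete, here is a minimal sketch of the kind of round-trip such a test does, written against the clustered HTTP interface with plain Node.js instead of the existing couch.js harness; the port, database name and document below are assumptions for illustration only:

    // Minimal sketch: create a database and a document over the clustered
    // HTTP interface, then read the document back and check it.
    // Port and names are assumptions; the real suite has its own helpers.
    var http = require('http');

    var CLUSTER_PORT = 5984;        // clustered interface (assumed)
    var DB = 'test_suite_db';

    function request(method, path, body, callback) {
      var req = http.request({
        host: '127.0.0.1',
        port: CLUSTER_PORT,
        method: method,
        path: path,
        headers: { 'Content-Type': 'application/json' }
      }, function (res) {
        var data = '';
        res.on('data', function (chunk) { data += chunk; });
        res.on('end', function () {
          callback(res.statusCode, JSON.parse(data || '{}'));
        });
      });
      if (body) req.write(JSON.stringify(body));
      req.end();
    }

    // create db -> create doc -> read doc back
    request('PUT', '/' + DB, null, function (status) {
      if (status !== 201 && status !== 412) throw new Error('create db failed: ' + status);
      request('PUT', '/' + DB + '/doc1', { value: 42 }, function (status) {
        if (status !== 201) throw new Error('create doc failed: ' + status);
        request('GET', '/' + DB + '/doc1', null, function (status, doc) {
          if (status !== 200 || doc.value !== 42) throw new Error('doc round-trip failed');
          console.log('ok: doc round-trip against the cluster port worked');
        });
      });
    });

The only real difference to today's tests would be the target port: the clustered interface (typically 5984) instead of the node-local backdoor (typically 5986).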
Maybe that's also a part of the puzzle, besides having more unit tests or leveraging PouchDB et al.

Best
Sebastian

On Tue, Sep 8, 2015 at 11:38 AM, Jason Smith <jason.h.sm...@gmail.com> wrote:

> Hi, Robert. Well, that was very long and I read every word! Thank you for your thoughtful assessment and feedback!
>
> You make many points across several different areas of concern. It sounds to me like the primary thrust of your argument is about continuous integration--somehow making it easier and more transparent for developers (seasoned and novice) to see the effects of their change. Is that right?
>
> So, what is the immediate next step that you propose to fix or improve? What is the most squeaky wheel, or the most modest gain that we can make?
>
> Thanks again!
>
> On Mon, Sep 7, 2015 at 9:49 PM, Robert Kowalski <r...@kowalski.gd> wrote:
>
> > Hi list,
> >
> > I had my first real programming job in a web agency. We were building mobile websites for our external customers. The development process followed a strict waterfall model: after you finished your project it was thrown over the fence to the QA folks, who tested the website manually. At some point, when all found bugs were fixed, the product got released. Sometimes a bugfix took several iterations of going back and forth between QA and dev. Sometimes the QA folks forgot to test whether the bug was really fixed after the third iteration, as humans make errors, especially when it is a boring task.
> >
> > After a lifetime of 2 years, the customer usually wanted a redesign, and the next developer threw everything of the old website away, as not much of it was reusable. When something broke during the life of such an application/website, the company and developers noticed it because the customer called them.
> >
> > These days I often think about my time in my first job and the customers calling the company about bugs. Right now the Erlang core of CouchDB reminds me quite often of how our customers called us when something broke. For CouchDB 2 it is not the users that call us right now, but PouchDB and Fauxton when their test suites get red or incredibly slow.
> >
> > When I started to work on Fauxton we were in a somewhat similar situation: a merge usually had 2-3 fixes on the same or the next day, because something somehow broke. A proper review for a change took an incredible amount of time and every pull request needed a lot of iterations. The most evil changes were the ones where you fixed a bug in one area, and a completely different area broke. These bugs were so evil because they were surprising and unexpected.
> >
> > After a really big issue we tried out, for several weeks, that every change would need unit and/or integration tests.
> >
> > Writing unit tests was very hard in those days; most of the code was not designed to be tested afterwards. So most of the tests written in that time were Selenium-based integration tests. This was better than nothing, but they were slow. Nevertheless, they caught a lot of bugs, even before code was sent to review. When we made the switch to React and a different architecture for our components and their communication, better and easier unit/functional testing was one of the many design goals.
> >
> > Writing good Selenium tests is incredibly hard, as you basically face the same problems you have in the distributed systems world in the world of modern UIs. State and data change asynchronously and you never know when the result of your request, which changes the UI, is returned. It even gets harder by putting this on top of an eventually consistent system :)
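As an inline aside: the one pattern that helps most with that asynchrony is to wait on an explicit condition instead of asserting right away or sleeping for a fixed time. A rough sketch using the selenium-webdriver Node package, where the Fauxton URL and the CSS selector are made up for illustration:

    // Rough sketch: wait until the UI actually reflects the change instead
    // of sleeping for a fixed amount of time. URL and selector are invented.
    var webdriver = require('selenium-webdriver');
    var By = webdriver.By;
    var until = webdriver.until;

    var driver = new webdriver.Builder().forBrowser('firefox').build();

    driver.get('http://localhost:8000/#/_all_dbs');
    // Explicit wait: block until the row for the new database shows up,
    // or fail after 10 seconds, instead of asserting immediately.
    driver.wait(until.elementLocated(By.css('.databases .database-row')), 10000);
    driver.findElement(By.css('.databases .database-row')).getText().then(function (text) {
      console.log('first row:', text);
      return driver.quit();
    });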
> > Sometimes we also pair-programmed in a video call in front of a test when coming up with a good test was the problem, as we also had to share and gain knowledge in the team about how to write tests. One mistake we made was not fixing flaky tests immediately, adding more and more flaky tests and just restarting the CI for a few weeks. At some point our CI system just collapsed and it took the team a long time to fix those tests, partly also because tests were a new topic and it is hard to understand why Selenium tests fail in an async JS application with a cluster beneath it which is eventually consistent. Additionally, it decreased the confidence that a red test signals a bug.
> >
> > After fixing the test suite in this early crisis we've got pretty stable tests. Additionally, reviews take much less time. Reviews got even faster when we added a pre-check for our coding guidelines to the test suite. If you take a look at a day in the time when we began writing tests, we sometimes lost an hour writing those tests, because it was hard. Looking just at this short timeframe it felt like we were losing time. But if we take a look at a longer timeframe we did not slow down in delivering features, for several reasons: fewer bugfixes needed after code got on master, fewer tickets and less communication because of fewer bugs, faster reviews because many bugs are caught before the reviewer takes a look, and so on. Some people on the team still hate writing tests, but everyone on the team has also realized, and mentions, that we have no other option.
> >
> > Right now I am seeing the same pattern we observed some time ago in Fauxton for CouchDB's Erlang core:
> >
> > - not noticing something broke (if you break it, will you even notice?)
> > - eventually a user in the broader sense notices something broke
> > - a lot of patching to fix patches where stuff broke in another place
> > - regressions
> > - it takes time to review code properly because of a large API surface and a lot of functionality
> >
> > I would like to propose that every PR needs unit, functional AND/OR integration tests for CouchDB's Erlang core, too - given it is not already covered by existing tests that run. Maybe we could try the new approach for 3 months and then decide if we keep it.
> >
> > Most of CouchDB's bugs are noticed by interacting with the HTTP API. As a minimum, a simple integration test showing that CouchDB compiles and that the bug got fixed / the feature works would be enough.
> >
> > Let's take a look at what we have for testing now:
> >
> > - Erlang unit tests
> > - Erlang functional tests
> > - integration tests written in Erlang, which use ibrowse [1]
> > - integration tests in JavaScript which test the backdoor ports
> > - Fauxton's integration tests simulating a user and their browser
> > - PouchDB's integration tests
> >
> > So there are a lot of different ways to write tests. One problem for testing is our multi-repository approach. The multiple PRs over different repos must somehow get united again. Here is how we could solve it as a first step [2]. The test suite is kicked off once all PRs are opened, by putting the branch names for the repositories that should not be master into the config fields.
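As I read it, on the CI side that config-field idea could boil down to something like the sketch below: every repo defaults to master, and a per-repo field (shown here as an environment variable) overrides the branch for a cross-repo change. The repository list, variable names and clone URLs are assumptions for illustration:

    // Hypothetical sketch of the "branch name per repo" config idea on CI:
    // check out master everywhere unless a branch was given for a repo.
    // Repo list, env var names and URLs are invented for illustration.
    var execSync = require('child_process').execSync;

    var repos = ['couchdb', 'couchdb-chttpd', 'couchdb-couch', 'couchdb-fauxton'];

    repos.forEach(function (repo) {
      // e.g. COUCHDB_CHTTPD_BRANCH=my-fix-branch overrides the default
      var envName = repo.toUpperCase().replace(/-/g, '_') + '_BRANCH';
      var branch = process.env[envName] || 'master';
      execSync('git clone --branch ' + branch +
               ' https://github.com/apache/' + repo + '.git',
               { stdio: 'inherit' });
    });

    // ...then build the combined checkout and run the full test suite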
> > When the test suite is green, another button could appear to automate the merge of all repos involved in the change, but that would be an additional feature [3].
> >
> > The reviewer and the submitter of the PR would both be responsible for making sure that a merge is covered by tests and the test suite is green.
> >
> > The important thing is that this is a first step, and depending on how creative we get it can keep improving indefinitely, like with Fauxton, where a lot has changed in the very short timeframe since we started requiring tests. The important step is that we somehow start at some point, and everybody is welcome and invited (and should feel empowered) to make the processes of how we write software and tests better.
> >
> > Flaky tests are a problem and should get fixed immediately, as an unreliable test suite does not make much sense (remember the story of the boy who cried wolf). We learned that quite painfully in Fauxton.
> >
> > The change will need some discipline and maybe also stepping temporarily out of our comfort zone, but I am certain we will gain speed, improve our daily workflows and improve CouchDB's quality over the long term. I think a 3 month timeframe to try the new approach would be a good time to find out whether we are improving or things are getting worse.
> >
> > Best,
> > Robert :)
> >
> > [1] https://github.com/apache/couchdb-chttpd/blob/master/test/chttpd_db_test.erl#L67
> > [2] https://cloudup.com/cOgxRPbt9aP
> > [3] https://cloudup.com/c0VJDIoYqmI