Hi Robert,

Also, many thanks from my side for writing all this down! There are in
fact some integration tests shipped with CouchDB that use the HTTP API
(they can be run via `make javascript`) and need to be ported to run
against the cluster port (instead of the node-local one). I started
working on this on my fork, but there's quite some work left.
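
For illustration, a minimal Node.js sketch of what "against the cluster
port" means; the port numbers are the usual defaults (5984 for the
clustered interface, 5986 for the node-local backdoor) and the snippet
only hits the welcome handler:

    // sketch: target the clustered interface instead of the
    // node-local backdoor port
    var http = require('http');
    var clusterUrl = 'http://127.0.0.1:5984/';  // clustered interface
    // 'http://127.0.0.1:5986/' would bypass clustering (node-local)
    http.get(clusterUrl, function (res) {
        console.log('cluster port answered with status', res.statusCode);
    }).on('error', function (err) {
        console.error('no CouchDB on the cluster port?', err.message);
    });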

Maybe that's another piece of the puzzle, besides having more unit
tests or leveraging PouchDB et al...

Best
    Sebastian

On Tue, Sep 8, 2015 at 11:38 AM, Jason Smith <jason.h.sm...@gmail.com>
wrote:

> Hi, Robert. Well, that was very long and I read every word! Thank you for
> your thoughtful assessment and feedback!
>
> You make many points across several different areas of concern. It sounds
> to me like the primary thrust of your argument is about continuous
> integration: making it easier and more transparent for developers
> (seasoned and novice) to see the effects of their changes. Is that right?
>
> So, what is the immediate next step that you propose to fix or improve?
> What is the squeakiest wheel, or the most modest gain that we can make?
>
> Thanks again!
>
> On Mon, Sep 7, 2015 at 9:49 PM, Robert Kowalski <r...@kowalski.gd> wrote:
>
> > Hi list,
> >
> > I had my first real programming job at a web agency. We were building
> > mobile websites for our external customers. The development process
> > followed a strict waterfall model: after you finished your project, it
> > was thrown over the fence to the QA folks, who tested the website
> > manually. At some point, when all the bugs that had been found were
> > fixed, the product was released. Sometimes a bugfix took several
> > iterations of going back and forth between QA and dev. Sometimes the
> > QA folks forgot to test whether a bug was really fixed after the third
> > iteration; humans make errors, especially on boring tasks.
> >
> > After a lifetime of two years, the customer usually wanted a redesign,
> > and the next developer threw away everything from the old website, as
> > not much of it was reusable. When something broke during the life of
> > such an application/website, the company and the developers noticed it
> > because the customer called them.
> >
> > These days I often think about my time at that first job and the
> > customers calling the company about bugs. Right now the Erlang core of
> > CouchDB quite often reminds me of how our customers called us when
> > something broke. For CouchDB 2 it is not the users that call us right
> > now, but PouchDB and Fauxton, when their test suites turn red or get
> > incredibly slow.
> >
> > When I started working on Fauxton we were in a somewhat similar
> > situation: a merge was usually followed by 2-3 fixes on the same or
> > the next day, because something somehow broke. A proper review of a
> > change took an incredible amount of time, and every pull request
> > needed a lot of iterations. The worst changes were the ones where you
> > fixed a bug in one area and a completely different area broke. These
> > bugs were so nasty because they were surprising and unexpected.
> >
> > After a really big issue, we tried requiring, for several weeks, that
> > every change come with unit and/or integration tests.
> >
> > Writing unit tests was very hard in those days; most of the code was
> > not designed to be tested after the fact. So most of the tests written
> > at that time were Selenium-based integration tests. This was better
> > than nothing, but they were slow. Nevertheless, they caught a lot of
> > bugs, even before code was sent to review. When we switched to React
> > and a different architecture for our components and their
> > communication, better and easier unit/functional testing was one of
> > the many design goals.
> >
> > Writing good Selenium tests is incredibly hard, as you basically face
> > the same problems in the world of modern UIs that you have in the
> > distributed-systems world. State and data change asynchronously, and
> > you never know when the result of a request that changes the UI will
> > be returned. It gets even harder when you put all this on top of an
> > eventually consistent system :)
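> >
> > To make that concrete, here is a generic sketch using explicit waits
> > from selenium-webdriver (not necessarily the tool we use in Fauxton,
> > and the selector is made up): instead of sleeping for a fixed time,
> > you wait until the UI reflects the async result.
> >
> >     // generic sketch: block until the element the async response
> >     // creates shows up, or fail after 10 seconds
> >     var webdriver = require('selenium-webdriver');
> >     var By = webdriver.By;
> >     var until = webdriver.until;
> >
> >     var driver = new webdriver.Builder().forBrowser('firefox').build();
> >     driver.get('http://127.0.0.1:5984/_utils/');
> >     // '.database-row' is a hypothetical selector for illustration
> >     driver.wait(until.elementLocated(By.css('.database-row')), 10000);
> >     driver.quit();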
> >
> > Sometimes we also pair-programmed in a video call in front of a test
> > when coming up with a good test was the hard part, since we also had
> > to share and build up knowledge in the team about how to write tests.
> > One mistake we made was not fixing flaky tests immediately: for a few
> > weeks we kept adding more and more flaky tests and just restarted the
> > CI. At some point our CI system simply collapsed, and it took the team
> > a long time to fix those tests, partly because testing was a new topic
> > and it is hard to understand why Selenium tests fail in an async JS
> > application sitting on top of an eventually consistent cluster.
> > Additionally, it decreased our confidence that a red test signals a
> > bug.
> >
> > After fixing the test suite in this early crisis we got pretty stable
> > tests, and reviews now take much less time. Reviews got even faster
> > when we added a pre-check for our coding guidelines to the test suite.
> > If you look at a single day back when we began writing tests, we
> > sometimes lost an hour writing those tests, because it was hard.
> > Looking only at that short timeframe, it felt like we were losing
> > time. But over a longer timeframe we did not slow down in delivering
> > features, for several reasons: fewer bugfixes needed after code landed
> > on master, fewer tickets and less communication because of fewer bugs,
> > faster reviews because many bugs are caught before the reviewer takes
> > a look, and so on. Some people on the team still hate writing tests,
> > but everyone on the team has realized, and says, that we have no other
> > option.
> >
> > Right now I am seeing the same pattern for CouchDB's Erlang core that
> > we observed some time ago in Fauxton:
> >
> >  - not noticing something broke (if you break it, will you even notice?)
> >  - eventually a user in the broader sense notices something broke
> >  - a lot of patching to fix patches where stuff broke in another place
> >  - regressions
> >  - it takes time to review code properly because of a large API
> > surface and a lot of functionality
> >
> > I would like to propose that every PR needs unit, functional AND/OR
> > integration tests for CouchDB's Erlang core, too - provided it is not
> > already covered by an existing test. Maybe we could try the new
> > approach for 3 months and then decide whether to keep it.
> >
> > Most of CouchDB's bugs are noticed by interacting with the HTTP API.
> > As a minimum, a simple integration test verifying that CouchDB
> > compiles and that the bug is fixed / the feature works would be
> > enough.
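> >
> > As an illustration, here is a minimal sketch of such a test in
> > Node.js against the clustered HTTP interface; the database name and
> > the helper are made up for illustration, not taken from our actual
> > suite:
> >
> >     // hypothetical minimal HTTP integration test: create a database,
> >     // read it back, clean up
> >     var assert = require('assert');
> >     var http = require('http');
> >
> >     function request(method, path, cb) {
> >         var req = http.request(
> >             {host: '127.0.0.1', port: 5984, method: method, path: path},
> >             function (res) {
> >                 var body = '';
> >                 res.on('data', function (c) { body += c; });
> >                 res.on('end', function () { cb(res.statusCode, body); });
> >             });
> >         req.on('error', function (err) { throw err; });
> >         req.end();
> >     }
> >
> >     request('PUT', '/test_suite_db', function (status) {
> >         // 201 on a full quorum, 202 if the write was only accepted
> >         assert.ok(status === 201 || status === 202);
> >         request('GET', '/test_suite_db', function (status, body) {
> >             assert.equal(status, 200);
> >             assert.equal(JSON.parse(body).db_name, 'test_suite_db');
> >             request('DELETE', '/test_suite_db', function () {});
> >         });
> >     });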
> >
> > Let's take a look at what we have for testing now:
> >
> > - Erlang unit tests
> > - Erlang functional tests
> > - integration tests written in Erlang, which use ibrowse [1]
> >
> > - integration tests in JavaScript that test the backdoor ports
> >
> > - Fauxton's integration tests, simulating a user and their browser
> > - PouchDB's integration tests
> >
> > So there are a lot of different ways to write tests. One problem for
> > testing is our multi-repository approach: the multiple PRs across
> > different repos must somehow be brought together again. Here is how we
> > could solve it as a first step [2]: once all PRs are opened, the test
> > suite is kicked off by putting the branch names of the repositories
> > that should deviate from master into the config fields.
> >
> > When the test suite is green, another button could appear to automate
> > the merge of all repos involved in the change, but that would be an
> > additional feature [3].
> >
> > The reviewer and the submitter of the PR would both be responsible
> > for ensuring that a merge is covered by tests and that the test suite
> > is green.
> >
> > What matters is that this is a first step; depending on how creative
> > we get, it can improve indefinitely, as with Fauxton, where a lot has
> > changed in the very short timeframe since we started requiring tests.
> > The important thing is that we start somewhere, and everybody is
> > welcome and invited (and should feel empowered) to improve the
> > processes by which we write software and tests.
> >
> > Flaky tests are a problem and should be fixed immediately, as an
> > unreliable test suite does not make much sense (remember the story of
> > the boy who cried wolf). We learned that quite painfully in Fauxton.
> >
> > The change will require some discipline and maybe also temporarily
> > stepping out of our comfort zone, but I am certain we will gain speed,
> > improve our daily workflows, and improve CouchDB's quality over the
> > long term. I think a 3-month trial of the new approach would be a good
> > way to find out whether we are improving or things are getting worse.
> >
> >
> > Best,
> > Robert :)
> >
> >
> > [1] https://github.com/apache/couchdb-chttpd/blob/master/test/chttpd_db_test.erl#L67
> > [2] https://cloudup.com/cOgxRPbt9aP
> > [3] https://cloudup.com/c0VJDIoYqmI
> >
>
