Hi Adam,

As long as someone is working on fixing the Elixir tests, fine. They *are* failing significantly more often on Jenkins than on Travis, for what it's worth.

All our work to build a much better setup for Jenkins may be lost if people don't fix these tests promptly.

Would a suitable workaround be having the CI suites run those tests in a "doesn't block the build" way? Or do you want to put that hurdle in the way of people merging to master/release branches?

Here are the tests that have failed on me during my CI work just today:

Elixir.DesignDocsTest: 4) test POST with empty body (DesignDocsTest)
test/elixir/test/design_docs_test.exs:55
Assertion with == failed

Elixir.PartitionMangoTest: 16) test explain works with non partitioned db (PartitionMangoTest)
test/elixir/test/partition_mango_test.exs:438
Assertion with == failed

(2x) Elixir.AllDocsTest: 5) test POST with empty body (AllDocsTest)
test/elixir/test/all_docs_test.exs:223
Assertion with == failed
code: assert length(Map.get(resp, :body)["rows"]) == 3

JS design_docs.js: doc insertion should have succeeded
402: test/javascript/tests/design_docs.js

Elixir harness setup failure:
Failed to start all the nodes. Check the dev/logs/*.log for errors.

-Joan

On 2019-12-12 5:31 p.m., Adam Kocoloski wrote:
Hi Joan,

I’ve seen the Elixir suite implicated more frequently as well. I haven’t done 
the analysis to see if the failures are concentrated in one or two flakes or if 
they’re more evenly distributed. If it’s a small number of flaky tests I think 
we have time to fix/disable them rather than knocking out the whole suite.

I agree that we need `make check` to be trustworthy in 3.0 release candidates. 
I would like to keep running the Elixir tests in the CI regardless of whether 
they’re in the `check` suite.

Cheers,

Adam

On Dec 12, 2019, at 4:39 PM, Joan Touzet <woh...@apache.org> wrote:

Hi again,

As I've been looking more closely at the CI suite for the Jenkins transition, 
I've noticed that our Elixir test cases are actually the most likely to fail. 
In 6 consecutive CI runs, 5 runs failed due to failures in the Elixir suite. 
(The 6th failed due to a JS test failure.)

We started the Elixir effort to retire the JS suite. We reached a decision some 
months ago to put it into `make check` so that people would pay attention to 
its output, and work to fix those tests, accelerating our chances to get rid of 
the JS suite.

Unfortunately, that hasn't materialised. Our Elixir test porters seem to have 
stopped their work for a while now, and no one is systematically addressing the 
failures in that suite. I've also heard other developers mention (via IRC) that 
some of the test cases hold invalid assumptions about how CouchDB works, 
especially with the Erlang-based clustering code. It sounds to me like the 
effort needs a full code review.

With 3.0 around the corner, I want people to be able to trust the output of 
`make check` when downloading the tarball. If there is no objection, when I 
merge the Erlang version / CI changes on Monday, I will also comment out the 
call to `make elixir` as part of `make check`.

When the Elixir porting team is more confident in the reliability and 
completeness of their work, and we can successfully retire the JS suite, we can 
reconsider.

-Joan "really wanting to see green, but only seeing red" Touzet
