Hi Adam,
As long as someone is working on fixing the Elixir tests, fine. They
*are* failing significantly more often on Jenkins than on Travis, for
what it's worth.
All our work to build a much better setup for Jenkins may be lost if
people don't fix these tests promptly.
Would a suitable workaround be having the CI suites run those tests in a
"doesn't block the build" way? Or do you want to put that hurdle in the
way of people merging to master/release branches?
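To make the "doesn't block the build" idea concrete, here is a minimal sketch of a wrapper a CI job could call. It is not how our Jenkins or Travis configuration works today. The function name `run_suites`, the `runner` parameter, and the blocking/advisory split are all hypothetical, and it assumes `make elixir` has already been taken out of `make check` so the two can be run separately.

```python
import subprocess


def run_suites(blocking, advisory, runner=subprocess.call):
    """Run each command and return 0 unless a blocking command fails.

    Failures in advisory commands are reported but never change the
    returned status, so they cannot fail the build on their own.
    `runner` is any callable that takes an argument list and returns an
    exit code; it is injectable only to make the sketch easy to test.
    """
    status = 0
    for cmd in blocking:
        if runner(cmd) != 0:
            print("BLOCKING failure: " + " ".join(cmd))
            status = 1
    for cmd in advisory:
        if runner(cmd) != 0:
            print("advisory failure (not blocking): " + " ".join(cmd))
    return status
```

A CI step could then exit with `run_suites([["make", "check"]], [["make", "elixir"]])`. The Elixir results would still be visible in the log, but only a `make check` failure would turn the build red.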
Here are the tests that have failed on me during my CI work just today:
Elixir.DesignDocsTest: 4) test POST with empty body (DesignDocsTest)
  test/elixir/test/design_docs_test.exs:55
  Assertion with == failed

Elixir.PartitionMangoTest: 16) test explain works with non partitioned db (PartitionMangoTest)
  test/elixir/test/partition_mango_test.exs:438
  Assertion with == failed

(2x) Elixir.AllDocsTest: 5) test POST with empty body (AllDocsTest)
  test/elixir/test/all_docs_test.exs:223
  Assertion with == failed
  code: assert length(Map.get(resp, :body)["rows"]) == 3

JS design_docs.js: doc insertion should have succeeded
  402: test/javascript/tests/design_docs.js

Elixir harness setup failure:
  Failed to start all the nodes. Check the dev/logs/*.log for errors.
-Joan
On 2019-12-12 5:31 p.m., Adam Kocoloski wrote:
Hi Joan,
I’ve seen the Elixir suite implicated more frequently as well. I haven’t done
the analysis to see if the failures are concentrated in one or two flakes or if
they’re more evenly distributed. If it’s a small number of flaky tests I think
we have time to fix/disable them rather than knocking out the whole suite.
I agree that we need `make check` to be trustworthy in 3.0 release candidates.
I would like to keep running the elixir tests in the CI regardless of whether
they’re in the `check` suite.
Cheers,
Adam
On Dec 12, 2019, at 4:39 PM, Joan Touzet <woh...@apache.org> wrote:
Hi again,
As I've been looking more closely at the CI suite for the Jenkins transition,
I've noticed that our Elixir test cases are actually the most likely to fail.
In 6 consecutive CI runs, 5 runs failed due to failures in the Elixir suite.
(The 6th failed due to a JS test failure.)
We started the Elixir effort to retire the JS suite. We reached a decision some
months ago to put it into `make check` so that people would pay attention to
its output, and work to fix those tests, accelerating our chances to get rid of
the JS suite.
Unfortunately, that hasn't materialised. Our Elixir test porters seem to have
stopped their work for a while now, and no one is systematically addressing the
failures in that suite. I've also heard other developers mention (via IRC) that
some of the test cases hold invalid assumptions about how CouchDB works,
especially with the Erlang-based clustering code. It sounds to me like the
effort needs a full code review.
With 3.0 around the corner, I want people to be able to trust the output of
`make check` when downloading the tarball. If there is no objection, when I
merge the Erlang version / CI changes on Monday, I will also comment out the
call to `make elixir` as part of `make check`.
When the Elixir porting team is more confident in the reliability and
completeness of their work, and we can successfully retire the JS suite, we can
reconsider.
-Joan "really wanting to see green, but only seeing red" Touzet