I'm soliciting ideas on how to improve the reliability and usability
(creating new tests in particular) of our current test harness.
Hudson and some users are seeing intermittent failures, primarily due to
timing-related issues in test setup: each test starts a server, runs some
tests, shuts down the server, then loops to the next test. There is some
code in ClientBase that's supposed to provide a latch for the server
startup, but we also have a number of "sleeps" in the test setup,
without which the tests fail more frequently (so something is still
busted). In particular I want to make it easier to write server tests
and to remove the need for sleeps, as they cause the unit tests to run
slowly.
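
To illustrate the kind of thing that could replace the fixed sleeps, here
is a minimal sketch of a bounded readiness check: it polls the server port
until a connection is accepted or a deadline passes. The class and method
names (ServerStartupWait, waitForServerUp) and the 250 ms / 50 ms intervals
are hypothetical and chosen for illustration; this is not the existing
ClientBase code.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the current test harness.
final class ServerStartupWait {
    private ServerStartupWait() {}

    /**
     * Polls until a TCP connection to host:port succeeds or timeoutMs elapses.
     * Returns true if the port accepted a connection before the deadline.
     */
    static boolean waitForServerUp(String host, int port, long timeoutMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            try (Socket s = new Socket()) {
                // Assumed per-attempt connect timeout of 250 ms.
                s.connect(new InetSocketAddress(host, port), 250);
                return true;
            } catch (IOException e) {
                // Server is not accepting connections yet; fall through and retry.
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            // Short poll interval, so a test waits only as long as it has to.
            Thread.sleep(50);
        }
    }
}
```

A test setup would call waitForServerUp right after starting the server and
fail fast if it returns false. Note that an accepted connection only shows
that the port is open; a stricter check would also send a request and wait
for a valid reply.
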
Additionally we are seeing a need to test clients in addition to the
server (much of our current testing is related to verifying the correct
function of the zk server). In this case we are not currently able to
test any client failure handling cases (such as disconnect handling) as
we are running against a fully functional zk server.
I'm thinking we should do two things:
1) create a better test harness for the server
2) implement a mock ZooKeeper.java that has semantics similar to the zk
server proper but has the capability to inject/reproduce/verify various
error scenarios for client testing.
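
As a rough sketch of the fault-injection idea in item 2, the following
in-memory mock makes the next N calls fail with a chosen error before
resuming normal behavior. Everything here is hypothetical (SimpleStore,
StoreException, MockStore, failNext, and the error codes); it deliberately
does not mirror the real ZooKeeper.java API and only shows the shape such a
mock might take.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the slice of the client API under test.
interface SimpleStore {
    void create(String path, byte[] data) throws StoreException;
    byte[] getData(String path) throws StoreException;
}

// Hypothetical error type; the codes are illustrative only.
class StoreException extends Exception {
    enum Code { CONNECTION_LOSS, NO_NODE, NODE_EXISTS }

    final Code code;

    StoreException(Code code) {
        super(code.name());
        this.code = code;
    }
}

// In-memory mock that can be told to fail upcoming calls.
final class MockStore implements SimpleStore {
    private final Map<String, byte[]> nodes = new HashMap<>();
    private int failuresToInject = 0;
    private StoreException.Code injectedCode;

    /** Makes the next `count` calls fail with `code`. */
    synchronized void failNext(int count, StoreException.Code code) {
        failuresToInject = count;
        injectedCode = code;
    }

    // Throws the injected error while any injected failures remain.
    private void maybeFail() throws StoreException {
        if (failuresToInject > 0) {
            failuresToInject--;
            throw new StoreException(injectedCode);
        }
    }

    @Override
    public synchronized void create(String path, byte[] data) throws StoreException {
        maybeFail();
        if (nodes.containsKey(path)) {
            throw new StoreException(StoreException.Code.NODE_EXISTS);
        }
        nodes.put(path, data.clone());
    }

    @Override
    public synchronized byte[] getData(String path) throws StoreException {
        maybeFail();
        byte[] data = nodes.get(path);
        if (data == null) {
            throw new StoreException(StoreException.Code.NO_NODE);
        }
        return data.clone();
    }
}
```

A client test could call failNext(1, StoreException.Code.CONNECTION_LOSS)
and then assert that the client code under test retries or surfaces the
error as expected, with no real server involved.
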
If you have any ideas/suggestions/comments/etc., or would like to work
on this, please let me know.
Patrick