continuous random walk execution
--------------------------------
Key: ACCUMULO-400
URL: https://issues.apache.org/jira/browse/ACCUMULO-400
Project: Accumulo
Issue Type: Improvement
Reporter: Adam Fuchs
Random walk is finding bugs like a boss, but we can anticipate future usage in
which the current setup will be limiting. In particular, with a larger
development team knocking off bugs and writing new tests we might get to the
point where the most obvious bug is the only one that we find in a given run of
all of the random walkers. Consider hundreds of random walkers walking over all
of the tests. Many of these tests will find bugs non-deterministically. If we
add one test that finds one bug with high probability, all of the walkers will
find that bug and halt. None of the other bugs will be found until the one bug
is fixed or the test is removed.
Here are some things we could do to improve this situation and migrate to more
of a continuous random walk setup:
1. Stop executing a test after some number of walkers have found a bug when
running it.
2. Store the random walk graph in a database and have the walkers re-query it
with some regularity. This will let us add new tests to running walkers.
3. Have the walkers snapshot the relevant parts of the overall system when they
find a bug. We currently rely on the walkers halting to preserve the state of
the system so that we can manually extract all of the relevant details that may
have led to the bug. Dynamically snapshotting the system makes it possible to
continue running tests without losing logs and forensic information to rollover.
Exactly what information needs to be kept is TBD.
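
As a rough illustration of items 1 and 2, here is a minimal Python sketch. It is
not the existing random walk framework: WalkRegistry, walk, run_test and
max_failures are hypothetical names, and an in-memory object stands in for the
database. Walkers re-query the registry before every step, so newly added tests
become visible to running walkers, and a test is skipped once enough distinct
walkers have reported a bug in it.

```python
import random


class WalkRegistry:
    """Hypothetical shared registry of tests; stands in for the database in item 2."""

    def __init__(self, tests, max_failures):
        self.tests = set(tests)
        self.max_failures = max_failures
        # test name -> set of walker ids that reported a bug while running it
        self.failures = {}

    def add_test(self, name):
        """Register a new test; running walkers see it on their next query."""
        self.tests.add(name)

    def report_bug(self, test, walker_id):
        """Record that a walker found a bug; each walker counts once per test."""
        self.failures.setdefault(test, set()).add(walker_id)

    def enabled_tests(self):
        """Tests that fewer than max_failures distinct walkers have failed (item 1)."""
        return sorted(
            t for t in self.tests
            if len(self.failures.get(t, ())) < self.max_failures
        )


def walk(registry, walker_id, steps, run_test, rng):
    """Run up to `steps` tests, re-querying the registry before each one."""
    for _ in range(steps):
        candidates = registry.enabled_tests()
        if not candidates:
            break  # every test has been disabled
        test = rng.choice(candidates)
        if not run_test(test):  # assumption: False means the test found a bug
            registry.report_bug(test, walker_id)
            # A real walker would snapshot system state here (item 3)
            # and keep going instead of halting.
```

A walker would be started with something like
walk(registry, walker_id, 1000, run_test, random.Random()), where run_test is
whatever actually executes a test. Counting distinct walkers rather than raw
failures is one possible choice; it keeps a single noisy walker from disabling
a test on its own.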