> That sounds great. Using Whirr to bring up a system is a good use of
> it; have you seen the (fledgling) benchmark framework introduced in
> https://issues.apache.org/jira/browse/WHIRR-92? It's a demonstration
> of how to run the full testing lifecycle on a cluster. You might
> extend or generalize this for black box testing.

I will. Running a benchmark to check that the system works as expected sounds like a great idea.

>> What else should I test? Could you point me to some important open JIRAs?
>
> I think Herriot may be relevant here too. Have you looked at that?
> http://wiki.apache.org/hadoop/HowToUseSystemTestFramework

I believe it needs too much visibility inside the system. I want to keep the framework as system-agnostic as possible.

> It provides ways to inject known faults into Hadoop and run tests to
> check that they are handled appropriately. Using Whirr to run Herriot
> tests would be a good approach for running Hadoop system tests. It
> might be generalizable to other systems too.

Running Herriot tests using Whirr sounds like a great idea. It's not really what I'm trying to accomplish, but I will give it a try if I have enough time.

> I think Herriot's complementary to Gremlins, which is for
> non-deterministic faults. (As I understand it, Cos and Todd can
> correct me on this.)

Indeed. I want the framework to be able to inject non-deterministic faults while running a system check or a benchmark. My objective is not to have repeatable test scenarios; I just want to be able to inject a large number of typical faults in a short interval of time on a small cluster and hopefully find strange bugs. For diagnostics I'm planning to rely only on log files, machine-specific status information and maybe memory dumps, or to let the user define custom collectors.

> Cheers,
> Tom

Best,

-- Andrei Savu -- http://www.andreisavu.ro/
