Recently I've noticed that Mahout's unit tests take a considerably long time to run, generally longer than the times reported in the individual test output. I looked into why this was the case and found a couple of things:
Mahout does per-test forking, which means we fork a new JVM for each unit test execution. This adds overhead to tests that take 0.2s to complete. Is per-test forking strictly needed?

I captured the command line used to execute one of the forked tests (InMemInputSplitTest) by running mvn -X, then executed it from the shell repeatedly under time to see what was going on. In one of every few invocations, the test in question would report completion in 3s, but time reported a wall time of 30s (!) or so. Running it through strace showed that something was attempting to read from /dev/random. Sometimes it ran fine, but in at least 25-30% of runs it ended up blocking until the entropy pool was refilled. To test, I moved /dev/random aside and created a link named /dev/random pointing to /dev/urandom (which doesn't block, but isn't cryptographically secure). It looks as if this could be related to the loading of the SecureRandomSeedGenerator class.

I'm running on Ubuntu 9.04, kernel 2.6.28-17-server with the latest patches. Is anyone else experiencing similar slowness?

Drew
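
P.S. For anyone who wants to check whether their machine shows the same blocking, here is a minimal sketch that times a small read from each device. The class name EntropyReadTimer and the 16-byte read size are my own placeholders, and this only reads the devices directly; it does not reproduce whatever SecureRandomSeedGenerator actually does internally. Whether /dev/random blocks at all will depend on the kernel and on how much entropy is available at the time.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class EntropyReadTimer {

    // Reads exactly numBytes from source and returns the elapsed wall time
    // in nanoseconds. Blocks for as long as the underlying device blocks.
    static long timeReadNanos(Path source, int numBytes) throws IOException {
        byte[] buffer = new byte[numBytes];
        long start = System.nanoTime();
        try (InputStream in = Files.newInputStream(source)) {
            int offset = 0;
            while (offset < numBytes) {
                int n = in.read(buffer, offset, numBytes - offset);
                if (n < 0) {
                    throw new EOFException("source ended after " + offset + " bytes");
                }
                offset += n;
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int numBytes = 16; // assumed read size, purely illustrative
        for (String device : new String[] {"/dev/urandom", "/dev/random"}) {
            long nanos = timeReadNanos(Paths.get(device), numBytes);
            System.out.printf("%s: %d bytes in %.1f ms%n", device, numBytes, nanos / 1e6);
        }
    }
}
```

Running it several times in a row should show whether the /dev/random read occasionally stalls while the /dev/urandom read stays fast.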