Recently I've noticed that Mahout's unit tests take a long time to
run, considerably longer than the times reported in the individual
test output. I looked into why this was the case and found a couple
of things:

Mahout does per-test forking, which means we fork a new JVM for each
unit test execution. This adds overhead to tests that take only
0.2s to complete. Is per-test forking strictly needed?

I captured the command line used to execute one of the forked tests
(InMemInputSplitTest) by running mvn -X, then executed it from the
shell repeatedly under time to see what was going on. In one of every
few invocations, the test would report completion in 3s, but time
reported a wall time of 30s (!) or so. Running it through strace showed
that something was attempting to read from /dev/random. Sometimes
it ran fine, but at least 25-30% of the time it ended up blocking
until the entropy pool was refilled. To test this, I moved /dev/random
aside and put a link to /dev/urandom in its place (/dev/urandom doesn't
block, but isn't cryptographically secure). It looks as if this could
be related to the loading of the SecureRandomSeedGenerator class.
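
One way to check the seed-generation theory outside of Maven is to time
seed generation directly. The following is only a minimal sketch, and
it assumes that SecureRandomSeedGenerator ends up calling
java.security.SecureRandom.generateSeed(), which I have not confirmed.
SeedTimer and timeSeedMillis are made-up names for illustration.

```java
import java.security.SecureRandom;

// Hypothetical helper, not part of Mahout: times how long the JVM takes
// to produce seed bytes. On some Linux/JVM combinations generateSeed()
// reads from /dev/random and can block while the entropy pool refills.
class SeedTimer {

    // Returns the wall-clock milliseconds spent generating numBytes of seed.
    static long timeSeedMillis(SecureRandom source, int numBytes) {
        long start = System.nanoTime();
        source.generateSeed(numBytes);
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        SecureRandom source = new SecureRandom();
        // Repeat a few times: if the theory holds, some runs should be
        // much slower than others rather than uniformly slow.
        for (int i = 0; i < 5; i++) {
            System.out.println("run " + i + ": "
                + timeSeedMillis(source, 16) + " ms");
        }
    }
}
```

If some runs take seconds rather than milliseconds, that would be
consistent with the strace output. Running the same class with
-Djava.security.egd=file:/dev/./urandom may make a useful comparison,
since that property asks the JVM to seed from /dev/urandom instead.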

I'm running on Ubuntu 9.04, kernel 2.6.28-17-server with the latest patches.

Is anyone else experiencing similar slowness?

Drew
