I completed a successful 24-hour run of the Fluo stress test on a 10-node EC2 cluster. For the test, 1 billion random integers were loaded via MapReduce and then 370 million were loaded by Fluo. This resulted in ~1.3 billion transactions executing and ~13 million collisions. Fluo commit dbad51d was used for the test. Below is the final output from the test.

*****Verifying Fluo & MapReduce results match*****
Success! Fluo & MapReduce both calculated 1369064132 unique integers

During the test, CPU utilization was not uniformly high. Looking at the Accumulo monitor, some nodes had many queued scans. Running jstack on those nodes showed many threads trying to reserve open files, but only a few threads actually running scans. This seemed very odd, and I plan to investigate further. I had set the max open files to 1000, and all tablets had only 3 to 4 files. If 1000 files had been reserved, I would have expected to see lots of scans running, but that is not what I saw.

Below is a gist with info about the config used for the test.

https://gist.github.com/keith-turner/e28ee6cd4941210f34e5cd0e6a6b3106
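
As a rough back-of-the-envelope check on the open-file reasoning above, the sketch below estimates how many scans a 1000-file limit could support at once. It assumes each running scan reserves every file in its tablet, which is a simplification and not something verified against Accumulo's file manager; the function name `scans_supported` is made up for illustration.

```python
def scans_supported(max_open_files: int, files_per_tablet: int) -> int:
    """Estimate concurrent scans, assuming one scan reserves all files of one tablet.

    This is an illustrative assumption, not Accumulo's actual reservation logic.
    """
    return max_open_files // files_per_tablet


MAX_OPEN_FILES = 1000  # limit used in the test

# Tablets had 3 to 4 files each during the test.
low = scans_supported(MAX_OPEN_FILES, 4)
high = scans_supported(MAX_OPEN_FILES, 3)

print(f"Roughly {low} to {high} scans could hold files at the same time")
```

Under that assumption, a fully reserved limit would correspond to a few hundred scans holding files, which is why seeing only a handful of running scans looked suspicious.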