Patricia Shanahan wrote:
My last job (I was a graduate student from 2002 to late last year) was
as a large SPARC server platform architect. To improve prototype system
testing, I wrote an extremely silly but extremely useful program called
"parstore". It just block stores the floating point registers on a
specified number of processors to memory, repeatedly, as fast as it can.
The effect is to fill queues, and generally disturb and stress the
interconnect. It never detected any errors, but prototypes were more
likely to crash and operating system stress tests were more likely to
fail while it was running.
If one of the River developers has an intranet test environment, it may
be possible to simulate the effect of running over the Internet with a
similar trick: create some workload that keeps the network very busy,
and run it in parallel with a quality assurance test.
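
As a rough illustration of that kind of background workload, here is a minimal
sketch in Java. It is not parstore and not part of River; the class name
NetworkLoadGenerator and its methods are made up for this example. It opens a
few TCP connections and writes fixed-size buffers as fast as it can for a set
time, while a sink thread reads and discards the data. The loopback demo only
shows that it runs; to actually load a test network, the sink would have to
run on a different host than the senders.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of River: keeps a TCP path busy for a while.
public class NetworkLoadGenerator {

    /** Accepts connections on the given socket and discards whatever arrives. */
    static void startSink(ServerSocket server) {
        Thread acceptor = new Thread(() -> {
            try {
                while (true) {
                    Socket s = server.accept();
                    Thread reader = new Thread(() -> drain(s));
                    reader.setDaemon(true);
                    reader.start();
                }
            } catch (IOException e) {
                // Server socket was closed: stop accepting.
            }
        });
        acceptor.setDaemon(true);
        acceptor.start();
    }

    /** Reads until end of stream, throwing the data away. */
    static void drain(Socket s) {
        byte[] buf = new byte[64 * 1024];
        try (Socket sock = s; InputStream in = sock.getInputStream()) {
            while (in.read(buf) != -1) {
                // discard
            }
        } catch (IOException ignored) {
            // Connection dropped: nothing more to read.
        }
    }

    /** Writes to host:port from several threads until the time is up; returns bytes written. */
    static long flood(String host, int port, int senders, long durationMillis) {
        AtomicLong total = new AtomicLong();
        long deadline = System.nanoTime() + durationMillis * 1_000_000L;
        Thread[] threads = new Thread[senders];
        for (int i = 0; i < senders; i++) {
            threads[i] = new Thread(() -> {
                byte[] payload = new byte[64 * 1024];
                try (Socket sock = new Socket(host, port);
                     OutputStream out = sock.getOutputStream()) {
                    while (deadline - System.nanoTime() > 0) {
                        out.write(payload);
                        total.addAndGet(payload.length);
                    }
                } catch (IOException e) {
                    // This sender gives up on an I/O error; the others keep going.
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return total.get();
    }

    /** Self-contained demo: sink and senders on the loopback interface. */
    static long runLoopback(int senders, long durationMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            startSink(server);
            return flood(loopback.getHostAddress(), server.getLocalPort(),
                         senders, durationMillis);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bytes sent: " + runLoopback(4, 2000));
    }
}
```

The idea would be to start something like this on a couple of machines,
pointing flood() at a sink on another host, and then kick off the QA run while
it is still going. The sender count and buffer size are arbitrary starting
points and would need tuning for the network being tested.
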
In some cases it may not matter which of two transactions is done first,
but it is important to make sure there is a consistent order between them.
More recently, one of my favorite test environments is to bring up OpenSolaris
on an i7-based machine with a reasonable amount of memory (8GB or more), put
8 or more instances of Linux on it, all running the same build, and then test
there with appropriate loading. You'll get latency injection because of machine
resource contention, but you'll also get 8 independent OS and Java VM layers
that will readily provide just about any unexplainable behavior you need to
test with :-)
Gregg Wonderly