Vinayak,

Any ideas on how to better utilize the disk and CPU? The system does not scale
to four threads when the data exceeds local memory: query performance
is the same for two and four threads, and the results are the same when
using one or two disks.

We are using a system that has one drive and four physical cores. The
specs on the drive show an average read/write speed of 156 MB/s. I set up
a few tests to show how different processes make use of the drive.

Base test case using Linux's dd command to find the read speed:
 - 160 MB/s with 20% cpu utilization
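
For a comparable baseline from inside the JVM, something like the sketch
below could be run against the same file. It is only an illustration: the
class name ReadBaseline, the 1 MB buffer, and the file path argument are
assumptions and were not part of the tests reported here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReadBaseline {
    // Reads the whole file sequentially, discarding the data,
    // and returns the number of bytes read.
    static long readAll(Path file, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // Converts a byte count and elapsed nanoseconds to MB/s (1 MB = 10^6 bytes).
    static double megabytesPerSecond(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // hypothetical: path to a large test file
        long start = System.nanoTime();
        long bytes = readAll(file, 1 << 20); // 1 MB buffer, an arbitrary choice
        long elapsed = System.nanoTime() - start;
        System.out.printf("%d bytes, %.1f MB/s%n",
                bytes, megabytesPerSecond(bytes, elapsed));
    }
}
```

The file should be larger than memory (or the page cache dropped first),
otherwise the number reflects cached reads rather than the drive.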

Next, I wrote a slimmed-down version of our XML parser that reads the file
and parses the XML without saving the output:
 - 35 MB/s disk average, 122 MB/s disk max, 5 % cpu utilization, 1 thread
 - 30 MB/s disk average, 107 MB/s disk max, 7 % cpu utilization, 2 threads
 - 31 MB/s disk average, 102 MB/s disk max, 7 % cpu utilization, 4 threads

These are for an XQuery that parses and produces the XDM instance, but does
nothing with the result:
 - 9 MB/s disk average, 35 MB/s disk max, 5 % cpu utilization, 1 thread
 - 10 MB/s disk average, 34 MB/s disk max, 7 % cpu utilization, 2 threads
 - 10 MB/s disk average, 35 MB/s disk max, 5 % cpu utilization, 4 threads

Finally, the numbers for a full query processed through VXQuery:
 - 8 MB/s disk average, 29 MB/s disk max, 5 % cpu utilization, 1 thread
 - 9 MB/s disk average, 29 MB/s disk max, 7 % cpu utilization, 2 threads
 - 9 MB/s disk average, 29 MB/s disk max, 7 % cpu utilization, 4 threads

I did notice a slight improvement for 2 and 4 threads when adding a character
buffer of 32M. The parser already has a character buffer of 8000 by
default. Any ideas on how to get better utilization of the disk and CPU? Would
Java NIO be a better option than the standard IO library? It seems we could be
getting about 5 times more out of the system, given that dd reads at 160 MB/s
while the parser alone averages about 31 MB/s.
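
To make the NIO question concrete, here is a rough sketch of how the two
read paths could be timed side by side without any parsing. The class name
IoVersusNio, the method names, and the 32 MB buffer size passed in main are
assumptions for illustration only; this is not how VXQuery currently reads
its input.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class IoVersusNio {
    // Standard IO: a buffered stream with a configurable buffer size.
    static long readWithStream(Path file, int bufferSize) throws IOException {
        byte[] chunk = new byte[8192];
        long total = 0;
        try (InputStream in = new BufferedInputStream(
                new FileInputStream(file.toFile()), bufferSize)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // NIO: a FileChannel reading into a direct ByteBuffer of the same size.
    static long readWithChannel(Path file, int bufferSize) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferSize);
        long total = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            int n;
            while ((n = channel.read(buffer)) != -1) {
                total += n;
                buffer.clear(); // discard the data; a parser would consume it here
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // hypothetical: path to a large XML file
        int bufferSize = 32 * 1024 * 1024; // 32 MB, an assumed size for the comparison

        long t0 = System.nanoTime();
        long streamBytes = readWithStream(file, bufferSize);
        long t1 = System.nanoTime();
        long channelBytes = readWithChannel(file, bufferSize);
        long t2 = System.nanoTime();

        System.out.printf("java.io: %d bytes in %d ms%n", streamBytes, (t1 - t0) / 1_000_000);
        System.out.printf("NIO:     %d bytes in %d ms%n", channelBytes, (t2 - t1) / 1_000_000);
    }
}
```

Running the second method right after the first will favor it if the file
fits in the page cache, so each path would need its own cold run on a file
larger than memory for the comparison to mean anything.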

Thanks,
Preston
