Hi Preston,

I am assuming that for each of the readings below, the disk rate and CPU usage are for the entire system rather than per thread. If that is the case, adding threads does not seem to help at all.

How long does the test shown below run for? Are the disk throughput and CPU utilization averaged over a sizable portion of the time the test runs?

Finally, given that you have a good micro-benchmark, using YourKit will show you where time is being spent in the various cases. From the timings, it appears that all computation is happening with the same effective concurrency regardless of the number of threads used. Can you check if there is contention on any monitors? YourKit should readily show you that information.
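
If attaching a profiler is inconvenient, a rough first check for monitor contention can also be done from inside the JVM with the standard java.lang.management API. The sketch below is only an illustration: the class name ContentionReport, the worker threads, and the shared lock are hypothetical stand-ins for the real benchmark, not VXQuery code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ContentionReport {
    /** One line per live thread: how often it blocked on a monitor and for how long. */
    static String report(ThreadMXBean mx) {
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            if (info == null) {
                continue; // thread exited between the two calls
            }
            // getBlockedTime() returns -1 when contention monitoring is disabled
            sb.append(String.format("%-30s blocked=%d blockedMs=%d waited=%d%n",
                    info.getThreadName(), info.getBlockedCount(),
                    info.getBlockedTime(), info.getWaitedCount()));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        if (mx.isThreadContentionMonitoringSupported()) {
            mx.setThreadContentionMonitoringEnabled(true);
        }

        // Hypothetical workload: four threads fighting over a single lock.
        final Object lock = new Object();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < 100; k++) {
                    synchronized (lock) {
                        try {
                            Thread.sleep(1);
                        } catch (InterruptedException e) {
                            return;
                        }
                    }
                }
            }, "worker-" + i);
            workers[i].start();
        }

        Thread.sleep(200);             // let contention build up
        System.out.print(report(mx));  // report while the workers are still alive
        for (Thread t : workers) {
            t.join();
        }
    }
}
```

A high blocked count or blocked time on the worker threads would suggest that they are serializing on a shared monitor, which would be consistent with throughput staying flat as threads are added.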

Vinayak


On 7/16/14, 2:51 PM, Eldon Carman wrote:
Vinayak,

Any ideas how to better utilize the disk and cpu? The system does not scale
to four threads when the data exceeds local memory. The query performance
is the same for both two and four threads. The results are the same when
using one or two disks.

We are utilizing a system that has one drive and four physical cores. The
specs on the drive show it has an average read/write of 156 MB/s. I set up
a few tests to show how different processes make use of the drive.

Base test case with Linux's dd command to find the read speed.
  - 160 MB/s with 20% cpu utilization

Next, I wrote a slimmed down version of our XML parser that reads the file
and parses the XML without saving the output.
  - 35 MB/s disk average, 122 MB/s disk max, 5 % cpu utilization, 1 thread
  - 30 MB/s disk average, 107 MB/s disk max, 7 % cpu utilization, 2 threads
  - 31 MB/s disk average, 102 MB/s disk max, 7 % cpu utilization, 4 threads

These are for an XQuery that parses and produces the XDM instance, but does
nothing with the result:
  - 9 MB/s disk average, 35 MB/s disk max, 5 % cpu utilization, 1 thread
  - 10 MB/s disk average, 34 MB/s disk max, 7 % cpu utilization, 2 threads
  - 10 MB/s disk average, 35 MB/s disk max, 5 % cpu utilization, 4 threads

Finally, the numbers for a full query processed through VXQuery:
  - 8 MB/s disk average, 29 MB/s disk max, 5 % cpu utilization, 1 thread
  - 9 MB/s disk average, 29 MB/s disk max, 7 % cpu utilization, 2 threads
  - 9 MB/s disk average, 29 MB/s disk max, 7 % cpu utilization, 4 threads

I did notice slight improvement for 2 and 4 threads when adding a character
buffer of 32M. The parser already has a character buffer of 8000 by
default. Any ideas how to get better utilization of the disk and cpu? Would
Java NIO be a better option than the standard IO library? It seems we could be
getting 5 times more out of the system.
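
For the NIO question, a minimal sketch of how the two read paths could be
compared in isolation, with no parsing at all. The class name ReadThroughput,
the buffer sizes, and the command-line argument are assumptions made for
illustration; this is not the parser code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ReadThroughput {
    /** Standard IO: reads the whole file through a BufferedReader; returns chars read. */
    static long readBuffered(Path file, int bufferChars) throws IOException {
        long total = 0;
        char[] chunk = new char[8192];
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8),
                bufferChars)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                total += n;
            }
        }
        return total;
    }

    /** NIO: reads the whole file through a FileChannel into a direct buffer; returns bytes read. */
    static long readChannel(Path file, int bufferBytes) throws IOException {
        long total = 0;
        ByteBuffer buf = ByteBuffer.allocateDirect(bufferBytes);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            int n;
            while ((n = ch.read(buf)) != -1) {
                total += n;
                buf.clear(); // discard the data, we only measure read speed
            }
        }
        return total;
    }

    /** Converts a byte count and an elapsed time in nanoseconds to MB/s. */
    static double mbPerSec(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]);
        long size = Files.size(file);

        // Caveat: after the first pass the OS page cache may serve the file,
        // so use a file larger than memory or drop caches between runs.
        long t0 = System.nanoTime();
        readBuffered(file, 8192);
        long t1 = System.nanoTime();
        readBuffered(file, 32 * 1024 * 1024);
        long t2 = System.nanoTime();
        readChannel(file, 1024 * 1024);
        long t3 = System.nanoTime();

        System.out.printf("BufferedReader 8K : %.1f MB/s%n", mbPerSec(size, t1 - t0));
        System.out.printf("BufferedReader 32M: %.1f MB/s%n", mbPerSec(size, t2 - t1));
        System.out.printf("FileChannel 1M    : %.1f MB/s%n", mbPerSec(size, t3 - t2));
    }
}
```

If all three variants land near the dd figure, the read path itself is
probably not the limit and the time is going elsewhere.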

Thanks,
Preston

