I found that I posted bad numbers for the average values. They should be 3x the values I have in the e-mail. The correction has helped me identify the next test, and I will post more about it after the test completes.
On Wed, Jul 16, 2014 at 8:45 PM, Vinayak Borkar <[email protected]> wrote:

> Hi Preston,
>
> I am assuming that for each of the readings below, the disk rate and CPU
> usage is for the entire system rather than per thread. If that is the case,
> adding threads does not seem to help at all.
>
> How long does the test shown below run for? Is the disk throughput and CPU
> utilization being reported over a sizable length of time over which the
> test runs?

Each test is a single run after clearing the file system cache. The numbers are gathered with the dstat command, which generates CSV output from its polling results during the test execution.

> Finally, given that you have a good micro-benchmark, using YourKit will
> show you where time is being spent for the various cases. From the times, it
> appears that all computation is happening with equal concurrency regardless
> of the number of threads used. Can you check if there is contention on any
> monitors? YourKit should readily show you that information.

I will also look at the monitors in YourKit.

> Vinayak
>
> On 7/16/14, 2:51 PM, Eldon Carman wrote:
>
>> Vinayak,
>>
>> Any ideas how to better utilize the disk and CPU? The system does not
>> scale to four threads when the data exceeds local memory. The query
>> performance is the same for both two and four threads. The results are
>> the same when using one or two disks.
>>
>> We are utilizing a system that has one drive and four physical cores. The
>> specs on the drive show it has an average read/write of 156 MB/s. I set up
>> a few tests to show how different processes make use of the drive.
>>
>> Base test case with Linux's dd command to find the read speed:
>> - 160 MB/s with 20% CPU utilization
>>
>> Next, I wrote a slimmed-down version of our XML parser that reads the file
>> and parses the XML without saving the output:
>> - 35 MB/s disk average, 122 MB/s disk max, 5% CPU utilization, 1 thread
>> - 30 MB/s disk average, 107 MB/s disk max, 7% CPU utilization, 2 threads
>> - 31 MB/s disk average, 102 MB/s disk max, 7% CPU utilization, 4 threads
>>
>> These are for an XQuery that parses and produces the XDM instance, but
>> does nothing with the result:
>> - 9 MB/s disk average, 35 MB/s disk max, 5% CPU utilization, 1 thread
>> - 10 MB/s disk average, 34 MB/s disk max, 7% CPU utilization, 2 threads
>> - 10 MB/s disk average, 35 MB/s disk max, 5% CPU utilization, 4 threads
>>
>> Finally, the numbers for a full query processed through VXQuery:
>> - 8 MB/s disk average, 29 MB/s disk max, 5% CPU utilization, 1 thread
>> - 9 MB/s disk average, 29 MB/s disk max, 7% CPU utilization, 2 threads
>> - 9 MB/s disk average, 29 MB/s disk max, 7% CPU utilization, 4 threads
>>
>> I did notice a slight improvement for 2 and 4 threads when adding a
>> character buffer of 32M. The parser already has a character buffer of
>> 8000 by default. Any ideas how to get better utilization of the disk and
>> CPU? Would Java NIO be a better option than the standard IO library? It
>> seems we could be getting 5 times more out of the system.
>>
>> Thanks,
>> Preston
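
The standard IO versus Java NIO question raised above could be checked in isolation with something like the following. This is only a minimal sketch, not VXQuery code: the class `ReadThroughput`, its method names, and the buffer sizes in `main` are hypothetical, chosen to roughly mirror the default and 32M character buffers mentioned in the thread. It reads a file without parsing it, so it only shows the raw read rate each API reaches, and the file system cache would still need to be cleared between runs as in the tests above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical micro-benchmark; names are illustrative, not part of VXQuery.
class ReadThroughput {

    // Standard IO: reads the whole file through a BufferedReader whose
    // internal buffer holds bufferChars characters. Returns chars read.
    static long readWithBufferedReader(Path file, int bufferChars) throws IOException {
        long total = 0;
        char[] chunk = new char[8192];
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8),
                bufferChars)) {
            int n;
            while ((n = reader.read(chunk, 0, chunk.length)) != -1) {
                total += n;
            }
        }
        return total;
    }

    // NIO: reads the whole file through a FileChannel into a direct
    // ByteBuffer of bufferBytes bytes. Returns bytes read.
    static long readWithFileChannel(Path file, int bufferBytes) throws IOException {
        long total = 0;
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferBytes);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            int n;
            while ((n = channel.read(buffer)) != -1) {
                total += n;
                buffer.clear(); // discard the data; only the read rate matters here
            }
        }
        return total;
    }

    // Converts a byte count and an elapsed time in nanoseconds to MB/s (1 MB = 10^6 bytes).
    static double megabytesPerSecond(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args[0]); // path to a large test file
        long fileBytes = Files.size(file);
        int[] sizes = {8192, 32 * 1024 * 1024}; // assumed sizes, see note above
        for (int size : sizes) {
            long start = System.nanoTime();
            readWithBufferedReader(file, size);
            long elapsed = System.nanoTime() - start;
            System.out.printf("BufferedReader, %d chars: %.1f MB/s%n",
                    size, megabytesPerSecond(fileBytes, elapsed));

            start = System.nanoTime();
            readWithFileChannel(file, size);
            elapsed = System.nanoTime() - start;
            System.out.printf("FileChannel, %d bytes: %.1f MB/s%n",
                    size, megabytesPerSecond(fileBytes, elapsed));
        }
    }
}
```

If both variants come out close to the dd figure, the gap reported above is more likely in parsing or thread coordination than in the choice of IO library. That would be consistent with the suggestion to look at monitor contention in a profiler.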
