Last time we discussed running two types of parallel tests: speed-up and batch scale-out.
For review:

* Speed up: the total data size is constant.
* Batch scale-out: the data size per machine is constant.

In both cases the tests are run with an increasing number of machines. Depending on the test, the execution time will decrease logarithmically or stay the same. (A sketch of how the two metrics could be computed follows at the end of this note.)

Since our last conversation we have added a feature that allows local partitioning of data, meaning each node can have more than one partition. I wanted to see how this affects our testing. We have several options:

* single partition, many machines (previously discussed)
* many partitions, single machine
* many partitions, many machines

Both test types could be run for each of these scenarios. I think there is value in testing both "single partition, many machines" and "many partitions, single machine". Should we also test a combination? And what about the number of disks?

Cluster background: 5 machines, each with 4 cores and 2 drives. Preliminary results show we benefit from using all the available cores, with a slight additional boost from using multiple drives.

Here is a suggested starting list of tests (also expressed as a test matrix in the second sketch below):

Single Machine - Multiple Partitions

- Run these tests for 1, 2, 4, and 8 partitions on a machine with 4 cores and one drive.
- Run these tests for 2, 4, and 8 partitions on a machine with 4 cores and two drives.
- Each partition will have a separate worker thread.
  * Speed Up
  * Batch Scale-Out

Cluster - Single Partition

- Run these tests for 1, 2, 3, 4, and 5 machines, each with 4 cores, one drive, and one partition.
- Each machine will be a separate node controller.
  * Speed Up
  * Batch Scale-Out

Should we create cluster tests matching the single-machine tests? How about cluster tests that use multiple partitions?
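To make the two test types concrete, here is a minimal Python sketch of how the metrics could be computed from measured runtimes. The function names and all timing numbers are hypothetical illustrations, not output from our harness.

```python
# Minimal sketch: evaluating speed-up and batch scale-out runs.
# All names and timings here are hypothetical examples, not real results.

def speedup(t_one: float, t_n: float) -> float:
    """Speed-up test: total data size is constant.
    t_one = runtime on 1 machine, t_n = runtime on n machines.
    Ideal result is n (linear speed-up)."""
    return t_one / t_n

def scaleout_efficiency(t_one: float, t_n: float) -> float:
    """Batch scale-out test: data size per machine is constant,
    so the ideal runtime is flat. Ideal result is 1.0."""
    return t_one / t_n

# Hypothetical runtimes in seconds, keyed by machine count.
speedup_times = {1: 100.0, 2: 55.0, 3: 40.0, 4: 33.0, 5: 29.0}
scaleout_times = {1: 100.0, 2: 102.0, 3: 104.0, 4: 107.0, 5: 110.0}

for n in (2, 3, 4, 5):
    su = speedup(speedup_times[1], speedup_times[n])
    ef = scaleout_efficiency(scaleout_times[1], scaleout_times[n])
    print(f"{n} machines: speed-up {su:.2f}x (ideal {n}x), "
          f"scale-out efficiency {ef:.2f} (ideal 1.00)")
```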
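And here is the suggested test list expressed as an iterable test matrix. This is a sketch only: the field names (machines, drives, partitions) and the driver loop are assumptions to be adapted to whatever our actual test driver expects.

```python
# Sketch of the suggested test matrix as plain data. Field names are
# assumptions; adapt them to the real test driver.
from itertools import product

TEST_TYPES = ("speed_up", "batch_scale_out")

# Single machine, multiple partitions (4 cores; one or two drives).
single_machine = (
    [{"machines": 1, "drives": 1, "partitions": p} for p in (1, 2, 4, 8)]
    + [{"machines": 1, "drives": 2, "partitions": p} for p in (2, 4, 8)]
)

# Cluster, single partition per node (4 cores, one drive per machine).
cluster = [{"machines": m, "drives": 1, "partitions": 1} for m in range(1, 6)]

# Each configuration is run once per test type.
for config, test_type in product(single_machine + cluster, TEST_TYPES):
    print(test_type, config)
```

If we later add the combined scenario (many partitions, many machines), it would just be one more list comprehension over both dimensions.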
