
Let's say I know the following parameters of my system and cluster:
- number of nodes and their CPUs;
- per node size and total size;
- number of caches;
- number of entries in the caches;
- network bandwidth.

I want to tune the number of partitions per cache to get the best possible performance out of my cluster.

The first obvious thing we know is that the number of partitions mustn't be less than the number of nodes.

The next possible suggestion: if the average partition size is measured in tens or hundreds(?) of gigabytes or more, then we should configure more partitions to reduce that size. Here is the case I have in mind. Say partition "10" is around 20 GB. If we increase the number of partitions so that these 20 GB are split among two or three partitions located on different nodes, then rebalancing should be faster, because the same amount of data will be preloaded from several nodes rather than from a single one. Is my understanding correct? Am I missing something?
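To make the 20 GB case concrete, here is a rough back-of-the-envelope sketch of the reasoning. It is only a model, not a measurement: the function name `transfer_seconds` is made up for illustration, and the bandwidth figures (1 Gbit/s that each supplying node can spend on rebalancing, 10 Gbit/s on the receiving node) are assumptions I picked, not values from a real cluster. It also ignores disk speed, serialization and throttling.

```python
def transfer_seconds(data_gb, n_sources, sender_gbps, receiver_gbps):
    """Estimate the seconds needed to pull data_gb gigabytes from
    n_sources nodes in parallel (hypothetical, simplified model)."""
    # The aggregate rate is capped both by the combined bandwidth of
    # the supplying nodes and by the link of the receiving node.
    rate_gbps = min(n_sources * sender_gbps, receiver_gbps)
    return data_gb * 8 / rate_gbps  # gigabytes -> gigabits


# One 20 GB partition preloaded from a single node.
single = transfer_seconds(20, 1, sender_gbps=1, receiver_gbps=10)

# The same 20 GB split into three partitions on three different nodes.
split = transfer_seconds(20, 3, sender_gbps=1, receiver_gbps=10)

# Same split, but the receiving node is limited to 1 Gbit/s as well.
capped = transfer_seconds(20, 3, sender_gbps=1, receiver_gbps=1)

print(f"single source: {single:.0f} s")   # 160 s
print(f"three sources: {split:.0f} s")    # about 53 s
print(f"receiver-bound: {capped:.0f} s")  # 160 s
```

If this model is anywhere near right, splitting the data helps only while the supplying nodes are the bottleneck. Once the receiving node's link is saturated, more partitions should not make rebalancing any faster.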

Does anyone have other suggestions, taking into account the parameters from the list above?

