I don't have a great answer for you. What we did was find a common divisor
(not necessarily a whole number of gigabytes) of the available memory across
the different hardware and use that as the amount of memory per worker. We
then scaled the number of cores per worker accordingly, so that every core in
the system gets the same amount of memory. The quotient of a machine's
available memory and that common divisor (hopefully a whole number, to reduce
waste) is the number of workers we spun up on it. So if you have 64G, 30G,
and 15G of available memory on your machines, the divisor could be 15G and
you'd have 4, 2, and 1 workers per machine respectively. Every worker on
every machine then gets the same number of cores, set to whatever you think
is a good value.
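In concrete terms, for a standalone cluster that would look roughly like the
following in each node's conf/spark-env.sh (just a sketch based on the 15G
example above; the 4 cores per worker is an arbitrary illustrative value, and
you'd want to leave a little headroom for OS and daemon overhead):

  # 64G node: 4 workers of 15G each
  SPARK_WORKER_INSTANCES=4
  SPARK_WORKER_MEMORY=15g
  SPARK_WORKER_CORES=4

  # 30G node: 2 workers of 15G each
  SPARK_WORKER_INSTANCES=2
  SPARK_WORKER_MEMORY=15g
  SPARK_WORKER_CORES=4

  # 15G node: 1 worker of 15G
  SPARK_WORKER_INSTANCES=1
  SPARK_WORKER_MEMORY=15g
  SPARK_WORKER_CORES=4

With that in place, a single executor size that fits within 15g and 4 cores
will fit on every worker on every node.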

Hope that helps.

On Wed, Dec 3, 2014 at 7:44 AM, <kartheek.m...@gmail.com> wrote:

> Hi Victor,
>
> I want to set up a heterogeneous stand-alone Spark cluster. My hardware has
> different memory sizes and varying numbers of cores per node. I could only
> get all the nodes active in the cluster when the memory per executor was
> set to the smallest memory available on any node, and likewise for the
> number of cores per executor. As of now, I configure one executor per node.
>
> Can you please suggest an approach to setting up a stand-alone heterogeneous
> cluster so that I can use the available hardware efficiently?
>
> Thank you
>
> _____________________________________
> Sent from http://apache-spark-user-list.1001560.n3.nabble.com
>
>
