Hi,

Well, you could use Mesos or YARN 2 to define resources per job - request
only as many resources (cores, memory etc.) per machine as your "worst"
machine has. The rest is handled by Mesos or YARN. This way you avoid
per-machine resource assignment without any real disadvantage: you can run
other jobs in parallel without problems, and the older machines won't get
overloaded.
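
For example, a rough sketch of what such a per-job cap could look like (the
exact property names depend on your Spark version and cluster manager --
spark.executor.instances is YARN-specific, Mesos/standalone cap total cores
via spark.cores.max instead; the numbers below are just placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// Cap every job at what the weakest machine can offer, so the cluster
// manager is free to place executors on any node.
val conf = new SparkConf()
  .setAppName("capped-job")                // hypothetical job name
  .set("spark.executor.memory", "6g")      // <= RAM of the smallest node, minus OS/daemon overhead
  .set("spark.executor.cores", "2")        // <= cores of the smallest node
  .set("spark.executor.instances", "10")   // YARN: number of executors for this job
  // .set("spark.cores.max", "20")         // Mesos/standalone: cap total cores instead
val sc = new SparkContext(conf)

Anything above that cap simply stays free for other jobs on the bigger machines.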

However, you should take care that your cluster does not become too
heterogeneous.
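
Regarding your idea 2), a minimal sketch of such a pre-run benchmark phase
could look like the following (NodeBenchmark, the dummy numeric loop and the
task count are placeholders of mine; it only measures relative task speed per
host -- the repartitioning policy you build on top of it is up to you):

import java.net.InetAddress
import org.apache.spark.{SparkConf, SparkContext}

// Run many small CPU-bound tasks so every executor gets some, then compare
// the average task time per host.
object NodeBenchmark {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("node-benchmark"))

    val timings = sc.parallelize(1 to 1000, 1000).map { _ =>
      val host = InetAddress.getLocalHost.getHostName
      val start = System.nanoTime()
      var acc = 0.0
      var i = 0
      while (i < 5000000) { acc += math.sqrt(i); i += 1 }   // dummy numeric work
      (host, System.nanoTime() - start)
    }.collect()

    // Average task duration per host; a faster host could later be given
    // proportionally more partitions of the real dataset.
    val perHost = timings.groupBy(_._1).mapValues(ts => ts.map(_._2).sum.toDouble / ts.length)
    val fastest = perHost.values.min
    perHost.foreach { case (host, avgNanos) =>
      println(f"$host%-30s relative speed = ${fastest / avgNanos}%.2f")
    }
    sc.stop()
  }
}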

Best regards,
Jörn
On 21 Aug 2014 16:55, "anthonyjschu...@gmail.com" <
anthonyjschu...@gmail.com> wrote:

> I've got a stack of Dell commodity servers -- RAM roughly 8 to 32 GB, with a
> single or dual quad-core processor per machine. I think I will have them
> loaded with CentOS. Eventually, I may want to add GPUs on the nodes to
> handle linear alg. operations...
>
> My idea has been:
>
> 1) to find a way to configure Spark to allocate different resources
> per-machine, per-job -- or at least have a "standard executor"... and allow
> different machines to have different numbers of executors.
>
> 2) make (using vanilla Spark) a pre-run optimization phase which benchmarks
> the throughput of each node (per its hardware), and repartitions the dataset
> to use the hardware more efficiently, rather than relying on Spark
> speculation -- which has always seemed a suboptimal way to balance the load
> across several differing machines.
>
>
>
>
