I've got a stack of Dell commodity servers with 8 to 32 GB of RAM and one or
two quad-core processors per machine. I plan to load them with CentOS.
Eventually, I may want to add GPUs to the nodes to handle linear algebra
operations...

My idea has been:

1) Find a way to configure Spark to allocate different resources per machine
and per job: at a minimum, define a "standard executor" size, and allow
different machines to run different numbers of executors (see the sketch
after this list).

2) Build (using vanilla Spark) a pre-run optimization phase that benchmarks
the throughput of each node according to its hardware, then repartitions the
dataset to use the hardware more efficiently, rather than relying on Spark
speculation, which has always seemed a suboptimal way to balance load across
machines with differing capabilities (a rough sketch of such a benchmark
follows).
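
For (1), assuming a standalone deployment, something like the following is
the per-job half of what I have in mind; the per-machine half would live in
each node's own conf/spark-env.sh. This is only a sketch and the numbers are
made up:

    import org.apache.spark.{SparkConf, SparkContext}

    // Per-job request: a "standard executor" size plus a cap on total cores.
    // Per-machine capacity is still declared by each node in its own
    // conf/spark-env.sh (SPARK_WORKER_CORES, SPARK_WORKER_MEMORY,
    // SPARK_WORKER_INSTANCES), which is where the differences between
    // boxes would be expressed.
    val conf = new SparkConf()
      .setAppName("heterogeneous-job")      // hypothetical app name
      .set("spark.executor.memory", "6g")   // the "standard executor" size
      .set("spark.cores.max", "24")         // total cores this job may claim

    val sc = new SparkContext(conf)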
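
For (2), the rough shape of the benchmark phase I'm imagining is below; names
like NodeBenchmark and hostWeights are purely illustrative, not an existing
API. It times a fixed CPU-bound task on every executor, attributes the
timings to hosts, and turns them into relative weights:

    import java.net.InetAddress
    import org.apache.spark.SparkContext

    object NodeBenchmark {
      // Run a fixed synthetic workload across many tasks and derive a
      // relative throughput weight per host (higher weight = faster node).
      def hostWeights(sc: SparkContext, tasksPerSlot: Int = 4): Map[String, Double] = {
        val numTasks = sc.defaultParallelism * tasksPerSlot
        val timings = sc.parallelize(1 to numTasks, numTasks).map { _ =>
          val host = InetAddress.getLocalHost.getHostName
          val start = System.nanoTime()
          var acc = 0.0                  // tight floating-point loop as the probe
          var i = 0
          while (i < 5000000) { acc += math.sqrt(i + 1); i += 1 }
          (host, System.nanoTime() - start)
        }.collect()

        // throughput ~ 1 / mean probe time per host; normalise to sum to 1
        val perHost = timings.groupBy(_._1).map { case (host, ts) =>
          host -> (1.0 / (ts.map(_._2).sum.toDouble / ts.length))
        }
        val total = perHost.values.sum
        perHost.map { case (h, w) => h -> w / total }
      }
    }

The weights could then drive the total partition count or a custom
Partitioner; since vanilla Spark doesn't expose direct partition-to-host
pinning, this would only be an approximation.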



