Hi Marko,

Thanks for your enthusiastic and useful report! We had similar experiences over here. SparkGraphComputer seems to like small chunks of data of 128 MB or so, even if you have 8 or 16 GB in your executors.
In addition, when running Spark on YARN, you need a high spark.yarn.executor.memoryOverhead value of about 20%, while 6-10% is mentioned in the Spark-on-YARN reference https://spark.apache.org/docs/1.5.2/running-on-yarn.html . Otherwise, the executor starves when YARN is set to police its queues. I am sorry I cannot provide any quantitative data, but I thought I'd mention it anyway, to give people a hint which knobs to tune.

Cheers, Marc
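For concreteness, a sketch of what those knobs might look like on a spark-submit command line (the memory values are illustrative examples, not benchmarked recommendations):

```shell
# Illustrative spark-submit flags for Spark on YARN.
# memoryOverhead is specified in MB; ~1600 MB is roughly 20% of an
# 8 GB executor, versus the ~10% default suggested by the YARN docs.
spark-submit \
  --master yarn \
  --executor-memory 8g \
  --conf spark.yarn.executor.memoryOverhead=1600 \
  your-job.jar
```

The input chunk size (the ~128 MB mentioned above) is typically governed by the HDFS block size of the input files rather than a Spark flag, so splitting large files into smaller blocks is one way to get there.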
