Hi list,

We (Scrapinghub) are planning to deploy Spark on a 10+ node cluster, mainly
for processing data in HDFS and for Kafka streaming. We are thinking of using
Mesos instead of YARN as the cluster resource manager so that we can run the
executors in Docker containers, which would make deployment easier. But there
is one important thing to settle before making the decision: data locality.

If we run Spark on Mesos, can it achieve good data locality when processing
HDFS data? I think Spark on YARN can achieve that out of the box, but I am
not sure whether Spark on Mesos can do the same.

I've searched through the list archive, but haven't found a helpful
answer yet. Any reply is appreciated.

Regards,
Shuai
