Re: [General Question] [Hadoop + Spark at scale] Spark Rack Awareness?

2015-07-19 Thread Sandy Ryza
Hi Mike,

Spark is rack-aware in its task scheduling. Currently Spark doesn't honor any locality preferences when scheduling executors, but this is being addressed in SPARK-4352, after which executor scheduling will be rack-aware as well.

-Sandy
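
[Editor's note, not part of Sandy's reply] As a rough illustration of the rack-level task locality mentioned above, Spark's standard locality-wait settings control how long the scheduler holds out for a rack-local slot before relaxing to ANY. A minimal sketch, assuming a normal SparkContext-based application:

import org.apache.spark.{SparkConf, SparkContext}

// Standard Spark locality settings; values shown are the usual defaults.
// spark.locality.wait.rack controls how long the task scheduler waits for a
// RACK_LOCAL slot before falling back to an arbitrary node.
val conf = new SparkConf()
  .setAppName("rack-locality-example")
  .set("spark.locality.wait", "3s")
  .set("spark.locality.wait.rack", "3s")

val sc = new SparkContext(conf)
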

[General Question] [Hadoop + Spark at scale] Spark Rack Awareness?

2015-07-18 Thread Mike Frampton
I wanted to ask a general question about Hadoop/YARN and Apache Spark integration. I know that Hadoop on a physical cluster has rack awareness, i.e. it attempts to minimise network traffic by saving replicated blocks within a rack. I wondered whether, when Spark is configured to use…
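
[Editor's note, not part of Mike's message] As background on the Hadoop rack awareness described above: HDFS learns rack placement from a topology script named in core-site.xml under the standard property net.topology.script.file.name. A minimal sketch of reading that setting through the Hadoop API, assuming the cluster's configuration files are on the classpath:

import org.apache.hadoop.conf.Configuration

// Load core-site.xml/hdfs-site.xml from the classpath and report which
// topology script (if any) the cluster uses to map hosts to racks.
val hadoopConf = new Configuration()
val topologyScript = hadoopConf.get("net.topology.script.file.name", "<not set>")
println(s"Rack topology script: $topologyScript")
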