data locality in spark

Grandl Robert Mon, 27 Apr 2015 08:31:37 -0700

Hi guys,
I am running some SQL queries, and all of my tasks are reported as either
NODE_LOCAL or PROCESS_LOCAL.
In the Hadoop world, reduce tasks are rack-local or non-rack-local because
they have to aggregate data from multiple hosts. In Spark, however, even the
aggregation stages are reported as NODE_LOCAL or PROCESS_LOCAL.
Am I missing something? Why are the reduce-like tasks still reported as
NODE_LOCAL or PROCESS_LOCAL?
Thanks,
Robert
