Re: All executors run on just a few nodes

2014-10-20 Thread Tao Xiao
Raymond, thank you. But I read in another thread, http://apache-spark-user-list.1001560.n3.nabble.com/When-does-Spark-switch-from-PROCESS-LOCAL-to-NODE-LOCAL-or-RACK-LOCAL-td7091.html, that PROCESS_LOCAL means the data is in the same JVM as the code that is running. When data is in the same JVM

Re: All executors run on just a few nodes

2014-10-20 Thread raymond
When the data’s source host is not one of the registered executors, it will also be marked as PROCESS_LOCAL, though it should have a different name for this. I don’t know whether someone has changed this name very recently, but for 0.9 that is the case. When I say satisfy, yes, if the executors

All executors run on just a few nodes

2014-10-19 Thread Tao Xiao
Hi all, I have a Spark-0.9 cluster with 16 nodes. I wrote a Spark application to read data from an HBase table that has 86 regions spread across 20 RegionServers. I submitted the Spark app in Spark standalone mode and found that there were 86 executors running on just 3 nodes and it