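Spark should be much faster with PROCESS_LOCAL than with NODE_LOCAL. NODE_LOCAL means the data is on the same node as the task (for example on that node's local disk via HDFS, or in another executor on the same machine), while PROCESS_LOCAL means the data is already in memory in the same JVM that runs the task.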
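A minimal sketch of how you would normally get PROCESS_LOCAL tasks, assuming you run it in spark-shell (where `sc` is predefined) and that the dataset fits in executor memory. It reuses the path from your message. The `cache()` call is my suggestion, not something in your original snippet:

```scala
// Run inside spark-shell, where `sc` (the SparkContext) is already defined.
val a = sc.textFile("/user/exobrain/batselem/LUBM1000")

// Ask Spark to keep the partitions in executor memory once they are computed.
a.cache()

// The first action reads from storage (NODE_LOCAL at best) and fills the cache.
a.count()

// Later actions can be scheduled PROCESS_LOCAL, as long as the cached
// partitions are still in memory on the executors.
a.count()
```

You can check the locality level of each task on the stage page of the web UI. If tasks fall back to a less local level sooner than you want, the `spark.locality.wait` setting controls how long the scheduler waits before giving up on the more local level.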
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>

On Tue, Apr 22, 2014 at 4:45 PM, Joe L <selme...@yahoo.com> wrote:

> I got the following performance is it normal in spark to be like this. some
> times spark switchs into node_local mode from process_local and it becomes
> 10x faster. I am very confused.
>
> scala> val a = sc.textFile("/user/exobrain/batselem/LUBM1000")
> scala> f.count()
>
> Long = 137805557
> took 130.809661618 s
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/help-me-tp4598.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.