Hi all,
I am new to Spark, and I have a problem: no computations run on the
workers/slave servers in standalone cluster mode.
The Spark version is 1.6.0, and the environment is CentOS. I run the example
code, e.g.
https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/s
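A common cause of this symptom is submitting the application with a local master (or no `--master` at all), so everything runs in the driver JVM. A minimal sketch of submitting one of the bundled examples against a standalone master; the host name, jar path, and argument are placeholders for your cluster:

```shell
# Point --master at the standalone master's spark:// URL (shown on the
# master's web UI, port 8080 by default), not at "local[*]".
spark-submit \
  --master spark://master-host:7077 \
  --class org.apache.spark.examples.SparkPi \
  --executor-memory 1G \
  /path/to/spark/lib/spark-examples-1.6.0-hadoop2.6.0.jar 100
```

If tasks still do not reach the workers, the master's web UI (http://master-host:8080) shows whether the workers are registered and whether the application was granted executors.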
Hi all,
I am working with Spark 1.6 and Scala, and I have a big dataset divided into
several small files.
My question is: right now the read operation takes a really long time and often
produces RDD warnings. Is there a way I can read the files in parallel, so that
all nodes or workers read the files at the same time?
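For what it's worth, Spark already reads one partition per executor core in parallel when you pass a directory or glob to `textFile`; with many small files the bottleneck is usually per-file overhead rather than serial reading. A minimal sketch, assuming the files live on a path (HDFS or a shared filesystem) visible to all workers; the paths and app name are placeholders:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ParallelRead {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ParallelRead")
    val sc = new SparkContext(conf)

    // textFile accepts a directory or glob; each file (or HDFS block)
    // becomes a partition, and partitions are read concurrently by the
    // executors. minPartitions can raise the parallelism if needed.
    val lines = sc.textFile("hdfs:///data/small-files/*", minPartitions = 64)

    // For very many tiny files, wholeTextFiles returns (path, content)
    // pairs and can cut per-file scheduling overhead; coalesce afterwards
    // if it produces too many partitions.
    val files = sc.wholeTextFiles("hdfs:///data/small-files/")

    println(lines.count())
    sc.stop()
  }
}
```

If the warnings persist, posting their exact text would help pin down whether the issue is partitioning, locality, or something else.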