The SparkContext will be reused, so SparkContext initialization won't
affect the throughput test.
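A common way to get this reuse is to hold the context in a lazily initialized singleton so every benchmarked request shares one instance. A minimal sketch of that pattern follows; the class and names are illustrative, and a plain String stands in for the expensive resource, which in the real setup would be the JavaSparkContext:

```java
// Sketch of a shared, initialize-once holder. A plain String stands in
// for the expensive resource (the JavaSparkContext in the real setup)
// so the pattern itself stays clear and runnable.
public class ContextHolder {
    static int initCount = 0;          // counts how many times init ran
    private static String context;     // the shared "context"

    // Lazily create the resource on first use, then reuse it.
    public static synchronized String get() {
        if (context == null) {
            initCount++;               // expensive setup happens only once
            context = "shared-context";
        }
        return context;
    }

    public static void main(String[] args) {
        // Simulate many JMeter requests hitting the same JVM.
        for (int i = 0; i < 1000; i++) {
            ContextHolder.get();
        }
        System.out.println("initializations: " + initCount);
    }
}
```

With this shape, only the first request pays the setup cost; every later request measured by JMeter sees an already-initialized context.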
My detailed test process:
1. During initialization, create 100 string RDDs and distribute them
across the Spark workers.
for (int i = 1; i <= numOfRDDs; i++) {
    JavaRDD<String> rddData =
        sc.parallelize(Arrays.asList(Integer.toString(i))).coalesce(1);
}
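Spark aside, the data layout this loop produces, one single-element string dataset per index, can be mimicked with plain collections to make the shape concrete. In this sketch numOfRDDs = 100 and the HashMap are assumptions for illustration; in the real code each value would be a JavaRDD<String> kept in a single partition by coalesce(1):

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InitSketch {
    public static void main(String[] args) {
        int numOfRDDs = 100;  // assumed from the post
        // Index -> single-element string dataset, mirroring the
        // parallelize(Arrays.asList(Integer.toString(i))) calls.
        Map<Integer, List<String>> rdds = new HashMap<>();
        for (int i = 1; i <= numOfRDDs; i++) {
            rdds.put(i, Arrays.asList(Integer.toString(i)));
        }
        System.out.println(rdds.size());   // 100
        System.out.println(rdds.get(42));  // [42]
    }
}
```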
I installed Spark in standalone mode and run the cluster (one master and one
worker) on a Windows 2008 server with 16 cores and 24 GB of memory.
I have done a simple test: just create a string RDD and return it. I
use JMeter to measure throughput, but the highest rate I see is around
35 requests/sec. I think Spark is