I would like to measure the latency (tasks/s) perceived by a simple application
running on Apache Spark.

The idea: the workers generate random data and place it in a list. The
final action (count) counts the total number of data points generated.

Below, numberOfPartitions is equal to the number of data points that
need to be generated (data points are integers).
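For reference, here is a minimal sketch of the kind of job described above. This is my own reconstruction under the stated assumptions (one integer generated per partition, count() as the final action), not the original code; the object name and timing logic are illustrative:

    import scala.util.Random
    import org.apache.spark.{SparkConf, SparkContext}

    object LatencyExperiment {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("LatencyExperiment")
        val sc = new SparkContext(conf)

        // One partition per data point, as described above.
        val numberOfPartitions = args(0).toInt

        val start = System.nanoTime()
        // Each partition generates one random integer; count() forces evaluation.
        val total = sc.parallelize(1 to numberOfPartitions, numberOfPartitions)
          .map(_ => Random.nextInt())
          .count()
        val elapsedSec = (System.nanoTime() - start) / 1e9

        println(f"Generated $total data points in $elapsedSec%.3f s " +
          f"(${total / elapsedSec}%.1f tasks/s)")
        sc.stop()
      }
    }

Note that with one element per partition, the per-task scheduling overhead dominates, which is presumably the point of the latency measurement.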

Although the code works as expected, a total of 119 Spark executors were
killed while running with 64 slaves. I suspect this happens because Spark
assigns executors to each node, and the total number of partitions a node
is asked to compute may require more memory than is available on that node.
This causes those executors to be killed, and therefore the latency
measurement I would like to analyze is inaccurate.

Any assistance with cleaning up the code below, or with fixing the above
issue to decrease the number of killed executors, would be much appreciated.
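One common cause of killed executors on YARN is the container exceeding its memory limit, in which case raising the executor memory and the off-heap memory overhead usually helps. A sketch of the relevant spark-submit settings (the specific values here are illustrative assumptions, not recommendations for this workload):

    spark-submit \
      --executor-memory 4g \
      --conf spark.yarn.executor.memoryOverhead=1024 \
      --conf spark.executor.cores=2 \
      ...

If the executors are being killed with "Container killed by YARN for exceeding memory limits" messages, increasing spark.yarn.executor.memoryOverhead is typically the first thing to try; the YARN NodeManager logs will show which limit was exceeded.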

    



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Latency-experiment-without-losing-executors-tp26981.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
