Hi, see the research paper below by Matei Zaharia, Spark's creator: http://people.csail.mit.edu/matei/papers/2013/sosp_spark_streaming.pdf
He says on page 10 that Grep is network-bound due to the cost to replicate the input data to multiple nodes. So I guess it can be a good initial recommendation, but I would like to know about other workloads too. Best regards.