I had this question come up and I'm not sure how to answer it. A user said that, for a big job, he thought it would be better to use MapReduce, since MapReduce writes to disk between iterations instead of keeping the data in memory the entire time, as Spark generally does.
I mentioned that Spark can cache to disk as well, but I'm not sure about the overarching question (which I realize is vague): for a typical job, would Spark use more memory than a MapReduce job? Are there any memory usage inefficiencies from either?
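For what it's worth, here is a minimal sketch of what I meant by "Spark can cache to disk as well": persisting an RDD with an explicit StorageLevel so the cached partitions live on local disk rather than in executor memory. The object name and the HDFS input path are hypothetical placeholders, not anything from a real job.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Sketch only; assumes some text input exists at the given (made-up) path.
object DiskCacheSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("disk-cache-sketch")
    val sc = new SparkContext(conf)

    // Hypothetical input; replace with a real dataset.
    val lines = sc.textFile("hdfs:///tmp/input.txt")

    // DISK_ONLY keeps the cached partitions on local disk instead of in
    // executor memory; MEMORY_AND_DISK would keep what fits in memory and
    // spill the rest. This is the knob I was referring to.
    val upper = lines.map(_.toUpperCase).persist(StorageLevel.DISK_ONLY)

    // Both actions reuse the on-disk cache instead of re-reading the input.
    println(upper.count())
    println(upper.take(5).mkString("\n"))

    sc.stop()
  }
}

So Spark can be made to behave more like MapReduce's write-to-disk-between-stages model when memory is the concern, but that still doesn't answer the broader question about typical memory usage of the two frameworks.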