Thanks, Akhil. I searched for DISK_AND_MEMORY_SER to figure out how it works, but I cannot find any documentation on it. Do you have a link?
If DISK_AND_MEMORY_SER reads from and writes to disk with some in-memory caching, does that mean the output of each join is written to disk and then read back into memory for the next join? If so, how is it more performant than the Hive query model? Again, I am new to this, so I may be asking something stupid.

Thanks,
JT

View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Memory-requirement-of-using-Spark-tp17177p17204.html