Hey guys, According to the docs for 1.5.1, when an executor is removed for dynamic allocation, the cached data is gone. If I use off-heap storage like tachyon, conceptually there isn't this issue anymore, but is the cached data still available in practice? This would be great because then we would be able to set spark.dynamicAllocation.cachedExecutorIdleTimeout to be quite small.
================== In addition to writing shuffle files, executors also cache data either on disk or in memory. When an executor is removed, however, all cached data will no longer be accessible. There is currently not yet a solution for this in Spark 1.2. In future releases, the cached data may be preserved through an off-heap storage similar in spirit to how shuffle files are preserved through the external shuffle service. ==================