It is quite a bit of work. Again, I think going through the file system API is the better approach in the long run. In fact, I don't think the current offheap API makes much sense even then, and we should consider just removing it to simplify things.
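[Editor's note: a hedged sketch of the two approaches discussed in this thread, using Spark 1.5-era APIs. The Tachyon master host/port and paths are placeholders, not taken from the thread.]

```scala
import org.apache.spark.storage.StorageLevel

// Option A: the offheap cache criticized below. Blocks live in Tachyon,
// but the block metadata is private to this SparkContext, so nothing is
// shared across contexts or recoverable after a driver crash, and data
// must be serialized on the way out of the JVM heap.
val cached = rdd.persist(StorageLevel.OFF_HEAP)

// Option B: treat Tachyon as an ordinary file system. Any context that
// can reach the Tachyon master can read the data back, and it survives
// driver/executor loss.
df.write.parquet("tachyon://tachyon-master:19998/shared/my_table")
val reloaded = sqlContext.read.parquet("tachyon://tachyon-master:19998/shared/my_table")
```

Option B is what "going through the file system API" refers to: it sidesteps the namespace-sharing and recovery problems at the cost of an explicit write/read rather than a transparent cache.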
On Tue, Nov 3, 2015 at 1:20 PM, Justin Uang <justin.u...@gmail.com> wrote:

> Alright, we'll just stick with normal caching then.
>
> Just for future reference, how much work would it be to get it to retain
> the partitions in Tachyon? This is especially helpful in a multitenant
> situation, where many users each have their own persistent Spark contexts,
> but where the notebooks can be idle for long periods of time while holding
> onto cached RDDs.
>
> On Tue, Nov 3, 2015 at 10:15 PM Reynold Xin <r...@databricks.com> wrote:
>
>> It is lost unfortunately (although it can be recomputed automatically).
>>
>> On Tue, Nov 3, 2015 at 1:13 PM, Justin Uang <justin.u...@gmail.com>
>> wrote:
>>
>>> Thanks for your response. I was worried about #3 vs. being able to use
>>> the objects directly. #2 seems to be the dealbreaker for my use case,
>>> right? Even if I am using Tachyon for caching, if an executor is lost,
>>> then that partition is lost for the purposes of Spark?
>>>
>>> On Tue, Nov 3, 2015 at 5:53 PM Reynold Xin <r...@databricks.com> wrote:
>>>
>>>> I don't think there is any special handling w.r.t. Tachyon vs. in-heap
>>>> caching. As a matter of fact, I think the current offheap caching
>>>> implementation is pretty bad, because:
>>>>
>>>> 1. There is no namespace sharing in offheap mode
>>>> 2. Similar to 1, you cannot recover the offheap memory once the Spark
>>>> driver or executor crashes
>>>> 3. It requires expensive serialization to go offheap
>>>>
>>>> It would've been simpler to just treat Tachyon as a normal file system
>>>> and use it that way, which would at least address 1 and 2, and also
>>>> substantially simplify the internals.
>>>>
>>>> On Tue, Nov 3, 2015 at 7:59 AM, Justin Uang <justin.u...@gmail.com>
>>>> wrote:
>>>>
>>>>> Yup, but I'm wondering what happens when an executor does get removed
>>>>> while we're using Tachyon. Will the cached data still be available,
>>>>> since we're using off-heap storage, so the data isn't stored in the
>>>>> executor?
>>>>>
>>>>> On Tue, Nov 3, 2015 at 4:57 PM Ryan Williams <
>>>>> ryan.blake.willi...@gmail.com> wrote:
>>>>>
>>>>>> fwiw, I think that having cached RDD partitions prevents executors
>>>>>> from being removed under dynamic allocation by default; see
>>>>>> SPARK-8958 <https://issues.apache.org/jira/browse/SPARK-8958>. The
>>>>>> "spark.dynamicAllocation.cachedExecutorIdleTimeout" config
>>>>>> <http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation>
>>>>>> controls this.
>>>>>>
>>>>>> On Fri, Oct 30, 2015 at 12:14 PM Justin Uang <justin.u...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hey guys,
>>>>>>>
>>>>>>> According to the docs for 1.5.1, when an executor is removed under
>>>>>>> dynamic allocation, the cached data is gone. If I use off-heap
>>>>>>> storage like Tachyon, conceptually this issue goes away, but is the
>>>>>>> cached data still available in practice? This would be great because
>>>>>>> then we would be able to set
>>>>>>> spark.dynamicAllocation.cachedExecutorIdleTimeout to be quite small.
>>>>>>>
>>>>>>> ==================
>>>>>>> In addition to writing shuffle files, executors also cache data
>>>>>>> either on disk or in memory. When an executor is removed, however,
>>>>>>> all cached data will no longer be accessible. There is currently not
>>>>>>> yet a solution for this in Spark 1.2. In future releases, the cached
>>>>>>> data may be preserved through an off-heap storage similar in spirit
>>>>>>> to how shuffle files are preserved through the external shuffle
>>>>>>> service.
>>>>>>> ==================
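
[Editor's note: the dynamic-allocation settings discussed above can be set in spark-defaults.conf. The timeout values below are illustrative placeholders, not recommendations from this thread.]

```
# spark-defaults.conf -- hedged example values, tune for your workload
spark.dynamicAllocation.enabled                    true
# Idle executors with no cached blocks are reclaimed after this long
spark.dynamicAllocation.executorIdleTimeout        60s
# Idle executors HOLDING cached blocks are kept this long (SPARK-8958);
# the default is effectively infinite, which is why cached RDDs pin
# executors under dynamic allocation
spark.dynamicAllocation.cachedExecutorIdleTimeout  600s
```

As the thread notes, lowering cachedExecutorIdleTimeout only makes sense if losing the cached partitions (and recomputing them) is acceptable, since they are not preserved when the executor goes away.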