Cool, thanks for the dev insight into what parts of the codebase are worthwhile, and which are not =)
On Tue, Nov 3, 2015 at 10:25 PM Reynold Xin <r...@databricks.com> wrote:

> It is quite a bit of work. Again, I think going through the file system
> API is more ideal in the long run. In the long run, I don't even think the
> current offheap API makes much sense, and we should consider just removing
> it to simplify things.
>
> On Tue, Nov 3, 2015 at 1:20 PM, Justin Uang <justin.u...@gmail.com> wrote:
>
>> Alright, we'll just stick with normal caching then.
>>
>> Just for future reference, how much work would it be to get it to retain
>> the partitions in Tachyon? This is especially helpful in a multitenant
>> situation, where many users each have their own persistent Spark
>> contexts, but where the notebooks can be idle for long periods of time
>> while holding onto cached RDDs.
>>
>> On Tue, Nov 3, 2015 at 10:15 PM Reynold Xin <r...@databricks.com> wrote:
>>
>>> It is lost, unfortunately (although it can be recomputed automatically).
>>>
>>> On Tue, Nov 3, 2015 at 1:13 PM, Justin Uang <justin.u...@gmail.com>
>>> wrote:
>>>
>>>> Thanks for your response. I was worried about #3, vs. being able to
>>>> use the objects directly. #2 seems to be the dealbreaker for my use
>>>> case, right? Even if I am using Tachyon for caching, if an executor is
>>>> lost, then that partition is lost for the purposes of Spark?
>>>>
>>>> On Tue, Nov 3, 2015 at 5:53 PM Reynold Xin <r...@databricks.com> wrote:
>>>>
>>>>> I don't think there is any special handling w.r.t. Tachyon vs in-heap
>>>>> caching. As a matter of fact, I think the current offheap caching
>>>>> implementation is pretty bad, because:
>>>>>
>>>>> 1. There is no namespace sharing in offheap mode
>>>>> 2. Similar to 1, you cannot recover the offheap memory once the Spark
>>>>> driver or executor crashes
>>>>> 3. It requires expensive serialization to go offheap
>>>>>
>>>>> It would've been simpler to just treat Tachyon as a normal file
>>>>> system, and use it that way to at least satisfy 1 and 2, and also
>>>>> substantially simplify the internals.
>>>>>
>>>>> On Tue, Nov 3, 2015 at 7:59 AM, Justin Uang <justin.u...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Yup, but I'm wondering what happens when an executor does get
>>>>>> removed, but when we're using Tachyon. Will the cached data still be
>>>>>> available, since we're using off-heap storage, so the data isn't
>>>>>> stored in the executor?
>>>>>>
>>>>>> On Tue, Nov 3, 2015 at 4:57 PM Ryan Williams
>>>>>> <ryan.blake.willi...@gmail.com> wrote:
>>>>>>
>>>>>>> fwiw, I think that having cached RDD partitions prevents executors
>>>>>>> from being removed under dynamic allocation by default; see
>>>>>>> SPARK-8958 <https://issues.apache.org/jira/browse/SPARK-8958>. The
>>>>>>> "spark.dynamicAllocation.cachedExecutorIdleTimeout" config
>>>>>>> <http://spark.apache.org/docs/latest/configuration.html#dynamic-allocation>
>>>>>>> controls this.
>>>>>>>
>>>>>>> On Fri, Oct 30, 2015 at 12:14 PM Justin Uang <justin.u...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hey guys,
>>>>>>>>
>>>>>>>> According to the docs for 1.5.1, when an executor is removed for
>>>>>>>> dynamic allocation, the cached data is gone. If I use off-heap
>>>>>>>> storage like Tachyon, conceptually there isn't this issue anymore,
>>>>>>>> but is the cached data still available in practice? This would be
>>>>>>>> great because then we would be able to set
>>>>>>>> spark.dynamicAllocation.cachedExecutorIdleTimeout to be quite
>>>>>>>> small.
>>>>>>>>
>>>>>>>> ==================
>>>>>>>> In addition to writing shuffle files, executors also cache data
>>>>>>>> either on disk or in memory. When an executor is removed, however,
>>>>>>>> all cached data will no longer be accessible. There is currently
>>>>>>>> not yet a solution for this in Spark 1.2. In future releases, the
>>>>>>>> cached data may be preserved through an off-heap storage similar
>>>>>>>> in spirit to how shuffle files are preserved through the external
>>>>>>>> shuffle service.
>>>>>>>> ==================
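
For reference, here is a minimal Scala sketch of the two approaches
discussed in this thread, assuming Spark 1.5.x-era APIs and a hypothetical
Tachyon master at localhost:19998; the app name, input path, and timeout
value are illustrative only, not taken from the thread.

==================
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Dynamic allocation requires the external shuffle service. Per
// SPARK-8958, executors holding cached partitions are never reclaimed
// unless spark.dynamicAllocation.cachedExecutorIdleTimeout is set
// (it defaults to infinity).
val conf = new SparkConf()
  .setAppName("cache-example") // hypothetical app name
  .set("spark.shuffle.service.enabled", "true")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "120s")
  // In 1.5.x the off-heap store is pointed at Tachyon via this URL.
  .set("spark.externalBlockStore.url", "tachyon://localhost:19998")

val sc = new SparkContext(conf)
val rdd = sc.textFile("hdfs:///data/input") // hypothetical input path

// Approach 1: the 1.5.x off-heap cache. Data is serialized into the
// external block store, but as discussed above it is still tied to the
// executor/driver lifetime and cannot be recovered after a crash.
rdd.persist(StorageLevel.OFF_HEAP)

// Approach 2: treat Tachyon as a plain file system, per the suggestion
// above. The written files outlive any executor; reading them back
// simply creates a new RDD.
rdd.saveAsObjectFile("tachyon://localhost:19998/cache/myRdd")
val restored = sc.objectFile[String]("tachyon://localhost:19998/cache/myRdd")
==================

With the second approach the cached data survives executor removal, which
is what would make the small cachedExecutorIdleTimeout in the original
question safe to use.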