I sent it prematurely.

They are already pluggable, or at least in the process of becoming more
pluggable. In 1.4, instead of calling the external system's API directly,
we added an API for that. There is a patch to add support for the HDFS
in-memory cache.

Somewhat orthogonal to this: longer term, I am not sure whether it makes
sense to keep the current off-heap API, because there is no namespacing and
the benefit to end users is actually not very substantial (I can think of
much simpler ways to achieve exactly the same gains), and yet it introduces
quite a bit of complexity to the codebase.
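The pluggable seam described above can be sketched roughly as follows. This is a simplified illustration, not Spark's actual interface: the trait, class, and method names (`ExternalBlockStore`, `InMemoryBlockStore`, `putBytes`, `getBytes`) are hypothetical, but the shape is the same, with the block manager talking only to an abstraction chosen by configuration rather than to Tachyon directly.

```scala
import scala.collection.mutable

// Hypothetical, simplified sketch of a pluggable external block store.
// A backend (Tachyon, an HDFS in-memory cache, Apache Ignite, ...) would
// implement this trait; Spark's block manager would only ever see the trait.
trait ExternalBlockStore {
  def putBytes(blockId: String, bytes: Array[Byte]): Unit
  def getBytes(blockId: String): Option[Array[Byte]]
  def remove(blockId: String): Boolean
}

// A trivial in-memory implementation standing in for a real backend.
class InMemoryBlockStore extends ExternalBlockStore {
  private val blocks = mutable.Map.empty[String, Array[Byte]]
  def putBytes(blockId: String, bytes: Array[Byte]): Unit = blocks(blockId) = bytes
  def getBytes(blockId: String): Option[Array[Byte]] = blocks.get(blockId)
  def remove(blockId: String): Boolean = blocks.remove(blockId).isDefined
}

object PluggableStoreDemo extends App {
  // In the real system the concrete class would be picked from configuration;
  // here we just instantiate the in-memory stand-in directly.
  val store: ExternalBlockStore = new InMemoryBlockStore
  store.putBytes("rdd_0_0", "partition data".getBytes("UTF-8"))
  val readBack = store.getBytes("rdd_0_0").map(new String(_, "UTF-8"))
  println(readBack.getOrElse("miss"))
}
```

The point of the seam is exactly the one raised in the thread below: once block storage goes through a trait like this, adding another in-memory filesystem is a matter of writing one implementation, not of touching BlockManager internals.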




On Mon, Jul 20, 2015 at 9:34 PM, Reynold Xin <r...@databricks.com> wrote:

> They are already pluggable.
>
>
> On Mon, Jul 20, 2015 at 9:32 PM, Prashant Sharma <scrapco...@gmail.com>
> wrote:
>
>> +1 Looks like a nice idea (I do not see any harm). Would you like to work
>> on the patch to support it?
>>
>> Prashant Sharma
>>
>>
>>
>> On Tue, Jul 21, 2015 at 2:46 AM, Alexey Goncharuk <
>> alexey.goncha...@gmail.com> wrote:
>>
>>> Hello Spark community,
>>>
>>> I was looking through the code in order to understand better how an RDD is
>>> persisted to the Tachyon off-heap filesystem. It looks like the Tachyon
>>> filesystem is hard-coded and there is no way to switch to another in-memory
>>> filesystem. I think it would be great if the BlockManager and BlockStore
>>> implementations were able to plug in another filesystem.
>>>
>>> For example, Apache Ignite also has an in-memory filesystem implementation
>>> which can store data in on-heap and off-heap formats. It would be great if
>>> it could integrate with Spark.
>>>
>>> I have filed a ticket in Jira:
>>> https://issues.apache.org/jira/browse/SPARK-9203
>>>
>>> If it makes sense, I will be happy to contribute to it.
>>>
>>> Thoughts?
>>>
>>> -Alexey (Apache Ignite PMC)
>>>
>>
>>
>