Here is a blog post illustrating how to use Spark on Alluxio for this
purpose; hope it helps:

http://www.alluxio.com/2016/04/getting-started-with-alluxio-and-spark/
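
To make Gene's suggestion below concrete, here is a minimal sketch of
saving an RDD to Alluxio and reading it back from another job. The host
name, port, and path are placeholders (19998 is Alluxio's default master
port); adjust them for your cluster:

import org.apache.spark.{SparkConf, SparkContext}

// Placeholder host/port: replace "alluxio-master" with your Alluxio
// master's hostname; 19998 is the default master port.
val sc = new SparkContext(new SparkConf().setAppName("alluxio-example"))

val rdd = sc.parallelize(1 to 100)

// Save the RDD as a file in Alluxio; any other Spark job can read it back.
rdd.saveAsObjectFile("alluxio://alluxio-master:19998/path/to/file")

// In the same or a different Spark application:
val restored = sc.objectFile[Int]("alluxio://alluxio-master:19998/path/to/file")
println(restored.count())

Note that Spark needs the Alluxio client jar on its classpath for the
alluxio:// scheme to resolve; the doc Gene links below covers that setup.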

On Mon, Jul 18, 2016 at 6:36 AM, Gene Pang <gene.p...@gmail.com> wrote:

> Hi,
>
> If you want to use Alluxio with Spark 2.x, it is recommended to write to
> and read from Alluxio with files. You can save an RDD with saveAsObjectFile
> with an Alluxio path (alluxio://host:port/path/to/file), and you can read
> that file from any other Spark job. Here is additional information on how
> to run Spark with Alluxio:
> http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html
>
> Hope that helps,
> Gene
>
> On Mon, Jul 18, 2016 at 12:11 AM, condor join <spark_ker...@outlook.com>
> wrote:
>
>> Hi All,
>>
>> I have some questions about OFF_HEAP caching. In Spark 1.x, when we use
>> *rdd.persist(StorageLevel.OFF_HEAP)*, the RDD is cached in
>> Tachyon (Alluxio). However, in Spark 2.x, we can use OFF_HEAP for
>> caching directly (
>> https://issues.apache.org/jira/browse/SPARK-13992?jql=project%20%3D%20SPARK%20AND%20text%20~%20%22off-heap%20caching%22).
>> I am confused about this and have the following questions:
>>
>> 1. In Spark 2.x, how should we use Tachyon (Alluxio) for caching?
>>
>> 2. Is there any reason for this change (I mean, using OFF_HEAP directly
>> instead of Tachyon)?
>>
>> Thanks a lot!
>>
>>
>>
>>
>
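
As for the original questions: in Spark 2.x, StorageLevel.OFF_HEAP no
longer refers to Tachyon. Per SPARK-13992, blocks are cached in
Spark-managed off-heap memory instead, which is why Alluxio is now used
through the file API as Gene describes. Here is a rough sketch of the
built-in off-heap caching (the app name and the 1g size are illustrative):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Spark-managed off-heap memory must be enabled before OFF_HEAP
// persistence works; tune the size for your workload.
val conf = new SparkConf()
  .setAppName("offheap-example")
  .set("spark.memory.offHeap.enabled", "true")
  .set("spark.memory.offHeap.size", "1g")
val sc = new SparkContext(conf)

val rdd = sc.parallelize(1 to 100)
rdd.persist(StorageLevel.OFF_HEAP)
rdd.count() // materializes the cache in Spark's own off-heap memory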
