Here is a blog post illustrating how to use Spark on Alluxio for this purpose; I hope it helps:
http://www.alluxio.com/2016/04/getting-started-with-alluxio-and-spark/

On Mon, Jul 18, 2016 at 6:36 AM, Gene Pang <gene.p...@gmail.com> wrote:

> Hi,
>
> If you want to use Alluxio with Spark 2.x, it is recommended to write to
> and read from Alluxio with files. You can save an RDD with saveAsObjectFile
> with an Alluxio path (alluxio://host:port/path/to/file), and you can read
> that file from any other Spark job. Here is additional information on how
> to run Spark with Alluxio:
> http://www.alluxio.org/docs/master/en/Running-Spark-on-Alluxio.html
>
> Hope that helps,
> Gene
>
> On Mon, Jul 18, 2016 at 12:11 AM, condor join <spark_ker...@outlook.com>
> wrote:
>
>> Hi All,
>>
>> I have some questions about OFF_HEAP caching. In Spark 1.x, when we use
>> *rdd.persist(StorageLevel.OFF_HEAP)*, the RDD is cached in
>> Tachyon (Alluxio). However, in Spark 2.x, we can use OFF_HEAP for
>> caching directly
>> (https://issues.apache.org/jira/browse/SPARK-13992?jql=project%20%3D%20SPARK%20AND%20text%20~%20%22off-heap%20caching%22).
>> I am confused about this and have the following questions:
>>
>> 1. In Spark 2.x, how should we use Tachyon for caching?
>>
>> 2. Is there any reason it had to change in this way (I mean, using
>> OFF_HEAP directly instead of using Tachyon)?
>>
>> Thanks a lot!
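For anyone landing on this thread later, here is a minimal sketch of the
file-based handoff Gene describes. The Alluxio master host ("alluxio-master")
and the /tmp/my-rdd path are placeholders for your own deployment; 19998 is
Alluxio's default master port:

  import org.apache.spark.{SparkConf, SparkContext}

  // Job 1: compute an RDD and write it to Alluxio as an object file.
  val sc = new SparkContext(new SparkConf().setAppName("alluxio-writer"))
  val rdd = sc.parallelize(1 to 1000000)
  rdd.saveAsObjectFile("alluxio://alluxio-master:19998/tmp/my-rdd")

  // Job 2 (can be a completely separate Spark application): read the
  // same data back from Alluxio.
  val restored = sc.objectFile[Int]("alluxio://alluxio-master:19998/tmp/my-rdd")
  println(restored.count())

And on question 2: in Spark 2.x, StorageLevel.OFF_HEAP no longer points at
Tachyon. It caches blocks in off-heap memory that Spark itself manages, which
must be enabled and sized explicitly. A sketch, with an illustrative size:

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.storage.StorageLevel

  val conf = new SparkConf()
    .setAppName("offheap-cache")
    // Spark-managed off-heap memory must be switched on and bounded:
    .set("spark.memory.offHeap.enabled", "true")
    .set("spark.memory.offHeap.size", "2g")

  val sc = new SparkContext(conf)
  sc.parallelize(1 to 1000000).persist(StorageLevel.OFF_HEAP)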