Hi Michael,

Yes, you can use Alluxio to share Spark RDDs. Here is a blog post about getting started with Spark and Alluxio (http://www.alluxio.com/2016/04/getting-started-with-alluxio-and-spark/), and some documentation (http://alluxio.org/documentation/master/en/Running-Spark-on-Alluxio.html).
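As a rough illustration of the idea (a minimal sketch, not taken from the blog post or the documentation): one Spark application persists an RDD to an Alluxio path, and any other application can load it back from the same path. The sketch assumes an Alluxio master reachable at localhost:19998, the Alluxio client jar on the Spark classpath, and a hypothetical path /shared/numbers. The object name is made up for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical example object; names and paths are illustrative only.
object ShareRddViaAlluxio {
  def main(args: Array[String]): Unit = {
    // Assumption: Alluxio master at localhost:19998; adjust for your cluster.
    val sharedPath = "alluxio://localhost:19998/shared/numbers"

    // Master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("ShareRddViaAlluxio"))

    // Writer side: persist the RDD as serialized objects in Alluxio.
    // Note: this fails if the target path already exists.
    val numbers = sc.parallelize(1 to 100)
    numbers.saveAsObjectFile(sharedPath)

    // Reader side: this could run in a completely separate Spark application,
    // since the data now lives in Alluxio rather than in one SparkContext.
    val shared = sc.objectFile[Int](sharedPath)
    println(s"Loaded ${shared.count()} elements from $sharedPath")

    sc.stop()
  }
}
```

The point of the design is that the data outlives the SparkContext that produced it, so several clients can read it without going through a single long-running driver.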
I hope that helps,
Gene

On Tue, May 17, 2016 at 8:36 AM, Michael Segel <msegel_had...@hotmail.com> wrote:

> Thanks for the response.
>
> That’s what I thought, but I didn’t want to assume anything.
> (You know what happens when you ass u me … :-)
>
> Not sure about Tachyon though. It’s a thought, but I’m very conservative
> when it comes to design choices.
>
> On May 16, 2016, at 5:21 PM, John Trengrove <john.trengr...@servian.com.au>
> wrote:
>
> If you are wanting to share RDDs it might be a good idea to check out
> Tachyon / Alluxio.
>
> For the Thrift server, I believe the datasets are located in your Spark
> cluster as RDDs and you just communicate with it via the Thrift
> JDBC Distributed Query Engine connector.
>
> 2016-05-17 5:12 GMT+10:00 Michael Segel <msegel_had...@hotmail.com>:
>
>> For one use case, we were considering using the Thrift server as a way
>> to allow multiple clients to access shared RDDs.
>>
>> Within the Thrift context, we create an RDD and expose it as a Hive table.
>>
>> The question is: where does the RDD exist? On the Thrift service node
>> itself, or is that just a reference to the RDD, which is contained within
>> contexts on the cluster?
>>
>> Thx
>>
>> -Mike