On 5/16/2016 12:12 PM, Michael Segel wrote:
For one use case, we were considering using the Thrift server as a way to 
allow multiple clients to access shared RDDs.

Within the Thrift Context, we create an RDD and expose it as a hive table.

The question is: where does the RDD exist? On the Thrift service node itself, 
or is that just a reference to the RDD, which is contained within contexts on 
the cluster?


You can share RDDs using Apache Ignite, a distributed in-memory data grid/cache with a lot of additional functionality. The advantages are extra resilience (caches can be mirrored, i.e. replicated, or just partitioned) and the ability to query cache contents with standard SQL. Since the caches outlive the Spark application that created them, other applications can share them. You also get read-through/write-through to SQL or NoSQL databases on the back end, for persistence and for loading or dumping caches to secondary storage. Ignite is written in Java, so it is easy to use from Scala/Spark apps.
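
As a rough illustration of the sharing described above, here is a minimal sketch using the ignite-spark module's IgniteContext and IgniteRDD. It assumes the Ignite 2.x API (in the 1.x releases IgniteContext took key/value type parameters), Ignite server nodes already running and reachable with the default configuration, and a cache name ("sharedRDD") and app name chosen purely for illustration.

```scala
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spark.{IgniteContext, IgniteRDD}
import org.apache.spark.{SparkConf, SparkContext}

object SharedRddSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical app name; master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("shared-rdd-sketch"))

    // Assumption: default IgniteConfiguration is enough to find the
    // already-running Ignite server nodes.
    val ic = new IgniteContext(sc, () => new IgniteConfiguration())

    // An IgniteRDD is a view over an Ignite cache. "sharedRDD" is a
    // hypothetical cache name; it is created if it does not exist yet.
    val shared: IgniteRDD[Int, Int] = ic.fromCache[Int, Int]("sharedRDD")

    // Writer side: store key/value pairs from an ordinary RDD in the cache.
    shared.savePairs(sc.parallelize(1 to 1000, 10).map(i => (i, i * 2)))

    // Reader side (this could equally be a different Spark application
    // opening the same cache name): IgniteRDD is an RDD of (key, value).
    val bigValues = shared.filter { case (_, v) => v > 1000 }.count()
    println(s"Entries with value > 1000: $bigValues")

    // Leave the Ignite server nodes running so the cache outlives this app.
    ic.close(false)
    sc.stop()
  }
}
```

A second Spark application that builds its own IgniteContext and calls fromCache with the same cache name should see the same data, which is the sharing the reply refers to.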
