On 5/16/2016 12:12 PM, Michael Segel wrote:
For one use case, we were considering using the Thrift server as a way to
allow multiple clients to access shared RDDs.
Within the Thrift context, we create an RDD and expose it as a Hive table.
The question is: where does the RDD exist? On the Thrift service node itself,
or is that just a reference to the RDD, which is held within contexts on the
cluster?
You can share RDDs using Apache Ignite, a distributed in-memory data
grid/cache with a good deal of additional functionality. The advantage is extra
resilience (you can replicate caches or just partition them), and you can query
the contents of the caches in standard SQL. Since the caches persist
past the lifetime of the Spark app, you can share them between apps. You
also get read-through/write-through to SQL or NoSQL databases on the back end
for persistence and for loading/dumping caches to secondary storage. It is
written in Java, so it is very easy to use from Scala/Spark apps.
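
As a rough illustration of the idea, here is a minimal sketch of writing
pairs into an Ignite cache from one Spark app so that another app can read
them back under the same cache name. It assumes the ignite-spark module
(Ignite 2.x style API) is on the classpath, that Ignite server nodes are
already running and discoverable, and that the job is launched through
spark-submit. The cache name "sharedRDD" and the default IgniteConfiguration
are placeholders of mine, not something from the original setup.

```scala
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spark.{IgniteContext, IgniteRDD}
import org.apache.spark.{SparkConf, SparkContext}

object SharedRddSketch {
  def main(args: Array[String]): Unit = {
    // Master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("shared-rdd-sketch"))

    // The closure is evaluated on the driver and on each executor to build
    // the Ignite configuration. A default configuration is an assumption;
    // a real deployment would normally configure discovery explicitly.
    val ic = new IgniteContext(sc, () => new IgniteConfiguration())

    // View over an Ignite cache. "sharedRDD" is a hypothetical cache name.
    val shared: IgniteRDD[Int, Int] = ic.fromCache[Int, Int]("sharedRDD")

    // Store (key, value) pairs in the cache. They outlive this Spark app
    // for as long as the Ignite server nodes keep running.
    shared.savePairs(sc.parallelize(1 to 1000, 10).map(i => (i, i * i)))

    // IgniteRDD is an RDD of pairs, so ordinary transformations apply.
    val bigSquares = shared.filter { case (_, sq) => sq > 500 }.count()
    println(s"entries with square > 500: $bigSquares")

    // Leave Ignite running on the workers; only this app shuts down.
    ic.close(false)
    sc.stop()
  }
}
```

A second Spark app could call ic.fromCache[Int, Int]("sharedRDD") with the
same cache name to see the same data, which is the sharing described above.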