Hi,

Tachyon <http://tachyon-project.org> manages memory off-heap, which can help prevent long GC pauses. Also, keeping the data in Tachyon allows it to be shared between Spark jobs that use the same dataset.
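
To make the sharing point concrete, here is a minimal PySpark-style sketch, not a tested setup. It assumes Spark is configured with the Tachyon client so that `tachyon://` URIs resolve, and that a Tachyon master is reachable at the address you pass in. The helper names (`tachyon_uri`, `publish`, `load`), the host, and the path are hypothetical; `saveAsTextFile` and `textFile` are the ordinary RDD and SparkContext calls.

```python
# Sketch only: host, port, and path values are placeholders, and the
# helpers below are illustrative names, not part of Spark or Tachyon.

def tachyon_uri(host, port, path):
    """Build a tachyon:// URI for a path stored in Tachyon."""
    return "tachyon://%s:%d/%s" % (host, port, path.lstrip("/"))


def publish(rdd, uri):
    """Job A: write an RDD of strings to Tachyon so other jobs can read it."""
    rdd.saveAsTextFile(uri)


def load(sc, uri):
    """Job B: read the same dataset back through its own SparkContext."""
    return sc.textFile(uri)


# Intended usage, in two separate Spark applications (assumed values):
#   uri = tachyon_uri("tachyon-master", 19998, "/shared/events")
#   publish(some_rdd, uri)       # in job A
#   events = load(sc, uri)       # in job B, served from Tachyon's memory
```

The second job reads the data from Tachyon's memory instead of recomputing it or holding its own copy on the JVM heap, which is where both the sharing and the GC benefit come from.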
Here's <http://www.meetup.com/Tachyon/events/222485713/> a production use case where Baidu runs Tachyon to get a 30x performance improvement in their SparkSQL workload.

Hope this helps,
Calvin

On Fri, Aug 7, 2015 at 9:42 AM, Muler <mulugeta.abe...@gmail.com> wrote:

> Spark is an in-memory engine and attempts to do computation in-memory.
> Tachyon is memory-centric distributed storage, OK, but how would that help
> run Spark faster?