Exactly! The sharing part is used in the Spark Notebook (this one <https://github.com/andypetrella/spark-notebook/blob/master/notebooks/Tachyon%20Test.snb>), so we can share data between notebooks that run in different SparkContexts (in different JVMs).
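
A minimal sketch of that sharing pattern, in case it helps to see it spelled out. It is not taken from the notebook linked above: the host, port, path and the helper names (`tachyon_uri`, `publish`, `consume`) are made up for illustration, and it assumes a Tachyon master is reachable and the Tachyon client is on Spark's classpath. One context writes an RDD to a `tachyon://` path, and a second context in another JVM reads it back.

```python
def tachyon_uri(host, port, path):
    """Build a tachyon:// URI. Host, port and path are placeholders."""
    return "tachyon://{}:{}/{}".format(host, port, path.lstrip("/"))


# Hypothetical location; adjust to wherever your Tachyon master runs.
SHARED = tachyon_uri("localhost", 19998, "/shared/numbers")


def publish(sc, uri=SHARED):
    """Run in the first notebook: write an RDD out to Tachyon."""
    sc.parallelize(range(100)).saveAsTextFile(uri)


def consume(sc, uri=SHARED):
    """Run in the second notebook (another SparkContext, another JVM):
    read the same path back and aggregate it."""
    return sc.textFile(uri).map(int).sum()
```

Each function takes the notebook's own `sc`, so the two sides share nothing except the path in Tachyon.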
OTOH, we have a project that creates microservices on genomics data; for several reasons we used Tachyon to serve genome cubes (ranges across genomes), see here <https://github.com/med-at-scale/high-health>.

HTH
andy

On Fri, Aug 7, 2015 at 8:36 PM Calvin Jia <jia.cal...@gmail.com> wrote:

> Hi,
>
> Tachyon <http://tachyon-project.org> manages memory off heap which can
> help prevent long GC pauses. Also, using Tachyon will allow the data to be
> shared between Spark jobs if they use the same dataset.
>
> Here's <http://www.meetup.com/Tachyon/events/222485713/> a production use
> case where Baidu runs Tachyon to get 30x performance improvement in their
> SparkSQL workload.
>
> Hope this helps,
> Calvin
>
> On Fri, Aug 7, 2015 at 9:42 AM, Muler <mulugeta.abe...@gmail.com> wrote:
>
>> Spark is an in-memory engine and attempts to do computation in-memory.
>> Tachyon is memory-centric distributed storage, OK, but how would that help
>> run Spark faster?
>>
>

--
andy