Hi Takeshi,

Thank you for getting back to me. If this is not possible, then perhaps you can help me with the root problem that caused me to ask this question.
Basically, I have a job where I'm loading/persisting an RDD and running queries against it. The problem I'm having is that even though there is plenty of space in memory, the RDD is not fully persisting. Once I run multiple queries against it, the RDD fully persists, but this means that the first four or five queries I run are extremely slow. Is there any way I can make sure that the entire RDD ends up in memory the first time I load it?

Thank you

On Thu, Mar 24, 2016 at 1:21 AM Takeshi Yamamuro <linguin....@gmail.com> wrote:

> just re-sent,
>
> ---------- Forwarded message ----------
> From: Takeshi Yamamuro <linguin....@gmail.com>
> Date: Thu, Mar 24, 2016 at 5:19 PM
> Subject: Re: Forcing data from disk to memory
> To: Daniel Imberman <daniel.imber...@gmail.com>
>
> Hi,
>
> We have no direct approach; we need to unpersist cached data, then
> re-cache data as MEMORY_ONLY.
>
> // maropu
>
> On Thu, Mar 24, 2016 at 8:22 AM, Daniel Imberman <daniel.imber...@gmail.com> wrote:
>
>> Hi all,
>>
>> So I have a question about persistence. Let's say I have an RDD that's
>> persisted MEMORY_AND_DISK, and I know that I now have enough memory space
>> cleared up that I can force the data on disk into memory. Is it possible
>> to tell Spark to re-evaluate the open RDD memory and move that information?
>>
>> Thank you
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Forcing-data-from-disk-to-memory-tp26585.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> --
> ---
> Takeshi Yamamuro
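
P.S. For anyone finding this thread later, here is a minimal sketch of both ideas: warming the cache eagerly right after persist() so the whole RDD is in memory before the first query, and Takeshi's suggestion of unpersisting and re-caching as MEMORY_ONLY. It's meant for spark-shell (where `sc` is the predefined SparkContext), and the input path is just a placeholder, not something from this thread.

import org.apache.spark.storage.StorageLevel

// "hdfs:///path/to/input" is a placeholder for the real data source.
val rdd = sc.textFile("hdfs:///path/to/input")

// persist() only marks the RDD for caching; partitions are cached lazily
// as actions compute them. Touching every partition once up front caches
// the whole RDD before the first real query runs.
rdd.persist(StorageLevel.MEMORY_AND_DISK)
rdd.foreachPartition(_ => ())    // cheap full pass; rdd.count() also works

// ... run queries against the warmed cache ...

// Takeshi's suggestion: blocks already spilled to disk can't be promoted
// in place, so drop the cached data and re-cache it as MEMORY_ONLY.
rdd.unpersist(blocking = true)
rdd.persist(StorageLevel.MEMORY_ONLY)
rdd.count()                      // recomputes the RDD and caches it fully in memory

The warm-up pass costs one extra full computation of the RDD, but it shifts that cost to load time instead of spreading it over the first few queries.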