Hi Takeshi,

Thank you for getting back to me. If there is no direct way to do this, then
perhaps you can help me with the root problem that led me to ask the question.
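
(Just to make sure I understood the workaround correctly, I read it as roughly
the following sketch, assuming an already-cached RDD that I'll just call "rdd"
here:

    import org.apache.spark.storage.StorageLevel

    rdd.unpersist(blocking = true)          // drop the existing cached/spilled copy
    rdd.persist(StorageLevel.MEMORY_ONLY)   // mark it memory-only going forward
    rdd.count()                             // caching is lazy, so run an action to re-cache it

Please correct me if that isn't what you meant.)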

Basically, I have a job where I load and persist an RDD and then run queries
against it. The problem is that even though there is plenty of free memory,
the RDD does not fully persist the first time. Only after I have run several
queries against it is the RDD fully cached, which means the first four or five
queries I run are extremely slow.
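
For concreteness, here is a stripped-down sketch of what the job does (the
input path and the queries are placeholders, but the persist/query pattern is
the same):

    import org.apache.spark.storage.StorageLevel

    val rdd = sc.textFile("hdfs:///path/to/input")   // load the data
      .persist(StorageLevel.MEMORY_AND_DISK)

    // Each action below stands in for one of the "queries" I mentioned.
    // After the first one, only part of the RDD ends up cached in memory
    // even though there is plenty of free space; only after several more
    // queries does it become fully cached.
    val q1 = rdd.filter(_.contains("foo")).count()
    val q2 = rdd.map(_.length.toLong).reduce(_ + _)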

Is there any way I can make sure that the entire RDD ends up in memory the
first time I load it?

Thank you
On Thu, Mar 24, 2016 at 1:21 AM Takeshi Yamamuro <linguin....@gmail.com>
wrote:

> just re-sent,
>
>
> ---------- Forwarded message ----------
> From: Takeshi Yamamuro <linguin....@gmail.com>
> Date: Thu, Mar 24, 2016 at 5:19 PM
> Subject: Re: Forcing data from disk to memory
> To: Daniel Imberman <daniel.imber...@gmail.com>
>
>
> Hi,
>
> There is no direct approach; you need to unpersist the cached data, then
> re-cache it as MEMORY_ONLY.
>
> // maropu
>
> On Thu, Mar 24, 2016 at 8:22 AM, Daniel Imberman <
> daniel.imber...@gmail.com> wrote:
>
>> Hi all,
>>
>> So I have a question about persistence. Let's say I have an RDD that is
>> persisted MEMORY_AND_DISK, and I know that enough memory has now been freed
>> that the on-disk data could fit in memory. Is it possible to tell Spark to
>> re-evaluate the cached RDD's storage and move that data into memory?
>>
>> Thank you
>>
>
>
> --
> ---
> Takeshi Yamamuro
>
