@Song, I have called an action, but it did not cache, as you can see in the screenshot attached to my original email. It has cached to disk but not to memory.
@Chin Wei Low, I have 15 GB of memory allocated, which is more than the dataset size. Any other suggestions, please?

Kind regards,
Guru

On 11 October 2016 at 03:34, Chin Wei Low <lowchin...@gmail.com> wrote:
> Hi,
>
> Your RDD is 5 GB; perhaps it is too large to fit into the executor's storage
> memory. You can refer to the Executors tab in the Spark UI to check the
> available storage memory for each executor.
>
> Regards,
> Chin Wei
>
> On Tue, Oct 11, 2016 at 6:14 AM, diplomatic Guru <diplomaticg...@gmail.com> wrote:
>
>> Hello team,
>>
>> Spark version: 1.6.0
>>
>> I'm trying to persist some data in memory so I can reuse it. However,
>> when I call rdd.cache() or rdd.persist(StorageLevel.MEMORY_ONLY()), it
>> does not store the data: I cannot see any RDD information under the Web UI
>> (Storage tab).
>>
>> Therefore I tried rdd.persist(StorageLevel.MEMORY_AND_DISK()), which
>> stored the data to disk only, as shown in the screenshot below:
>>
>> [image: Inline images 2]
>>
>> Do you know why the memory is not being used?
>>
>> Is there a cluster-level configuration that stops jobs from storing data
>> in memory altogether?
>>
>> Please let me know.
>>
>> Thanks,
>> Guru
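On the question of a cluster-level setting: one thing worth checking in Spark 1.6 is whether legacy memory management is enabled, since it changes how the storage cap is computed. A sketch of the relevant `spark-defaults.conf` entries (the values shown are the 1.6 defaults; verify against the cluster's actual configuration):

```
# Spark 1.6 unified memory model (the default): storage and execution
# share a pool of (heap - 300 MB) * spark.memory.fraction
spark.memory.fraction          0.75
spark.memory.storageFraction   0.5

# If legacy mode is enabled cluster-wide, storage is instead capped at
# heap * spark.storage.memoryFraction, as in pre-1.6 releases
spark.memory.useLegacyMode     false
spark.storage.memoryFraction   0.6
```

If any of these have been lowered drastically in the cluster config, cached blocks that do not fit in the storage pool will spill to disk (or be dropped entirely under MEMORY_ONLY), which would match the behaviour in the screenshot.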
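To sanity-check Chin Wei's suggestion, here is a rough back-of-the-envelope estimate of the storage pool under Spark 1.6's unified memory model, assuming the default settings (reserved memory of 300 MB and spark.memory.fraction = 0.75 are the documented 1.6 defaults; the 15 GB heap figure comes from the thread above, and the real numbers should be read off the Executors tab):

```python
# Rough estimate of an executor's unified memory pool under Spark 1.6
# defaults. This is an illustrative sketch, not what Spark computes to
# the byte; check the Executors tab in the Spark UI for actual values.

RESERVED_MB = 300        # memory Spark 1.6 reserves off the top
MEMORY_FRACTION = 0.75   # default spark.memory.fraction in 1.6

def unified_pool_mb(executor_heap_mb):
    """Memory shared by storage and execution (unified model)."""
    return (executor_heap_mb - RESERVED_MB) * MEMORY_FRACTION

heap_mb = 15 * 1024      # the 15 GB executor heap mentioned above
print(round(unified_pool_mb(heap_mb)))  # -> 11295 (about 11 GB)
```

By this estimate a 5 GB RDD should fit comfortably, which suggests the problem is more likely a non-default configuration than a genuine shortage of storage memory.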