Re: [Spark] RDDs are not persisting in memory

2016-10-11 Thread diplomatic Guru
Hello team, I found and resolved the issue. In case someone runs into the same problem, this was the cause: each node was allocated only ~1397 MB of memory for storage. >> 16/10/11 13:16:58 INFO storage.MemoryStore: MemoryStore started with capacity 1397.3 MB >> However, my RDD exceeded the storage
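For context, Spark 1.6's unified memory manager sizes the region backing the MemoryStore as roughly (heap − 300 MB reserved) × spark.memory.fraction (0.75 by default in 1.6), which is why the logged capacity is well below the executor heap. A minimal sketch of that arithmetic (the 2 GB heap below is an assumed figure, not taken from the cluster in this thread):

```python
# Sketch of Spark 1.6's unified-memory sizing. The heap value is an
# assumption for illustration, not the poster's actual configuration.
RESERVED_MB = 300        # system-reserved memory in Spark 1.6
MEMORY_FRACTION = 0.75   # default spark.memory.fraction in Spark 1.6

def unified_memory_mb(heap_mb):
    """Approximate storage+execution capacity for a given executor heap."""
    return (heap_mb - RESERVED_MB) * MEMORY_FRACTION

# With a hypothetical 2 GB executor heap:
print(unified_memory_mb(2048))  # -> 1311.0 MB
```

So a MemoryStore capacity in the ~1.3-1.4 GB range is consistent with an executor heap of only about 2 GB, regardless of how much memory the cluster holds in total.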

Re: [Spark] RDDs are not persisting in memory

2016-10-11 Thread diplomatic Guru
@Song, I have called an action, but it did not cache, as you can see in the screenshot attached to my original email. It cached to disk but not to memory. @Chin Wei Low, I have 15 GB of memory allocated, which is more than the dataset size. Any other suggestions, please? Kind regards, Guru On 11
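One possible explanation, sketched here under assumed numbers: storage capacity is per executor, so 15 GB allocated across the cluster does not mean a 15 GB MemoryStore. If, say, that 15 GB were split over 8 executors (the split is hypothetical), Spark 1.6's defaults would leave each MemoryStore with only about 1.2 GB:

```python
# Hypothetical split of the 15 GB total across executors; the executor
# count is an assumption for illustration only.
TOTAL_MB = 15 * 1024
EXECUTORS = 8
heap_per_executor = TOTAL_MB / EXECUTORS        # ~1.9 GB each
# Spark 1.6 defaults: 300 MB reserved, spark.memory.fraction = 0.75
storage_per_executor = (heap_per_executor - 300) * 0.75
print(round(storage_per_executor, 1))  # -> 1215.0 MB per executor
```

That per-executor figure is in the same ballpark as the 1397.3 MB capacity quoted elsewhere in this thread, so checking the actual per-executor heap (not the cluster total) is worthwhile.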

Re: [Spark] RDDs are not persisting in memory

2016-10-10 Thread Chin Wei Low
Hi, Your RDD is 5 GB; perhaps it is too large to fit into the executors' storage memory. You can refer to the Executors tab in the Spark UI to check the available storage memory for each executor. Regards, Chin Wei On Tue, Oct 11, 2016 at 6:14 AM, diplomatic Guru
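The check above can be turned into quick arithmetic: comparing the RDD size against the per-executor MemoryStore capacity gives a lower bound on how many executors' storage the cache needs (the 1397.3 MB figure is the capacity quoted from the MemoryStore log elsewhere in this digest):

```python
import math

# Lower bound on executors needed to hold the cached RDD in memory.
# Per-executor capacity is the figure from the log quoted in this thread;
# the RDD size is the 5 GB estimate above.
PER_EXECUTOR_STORAGE_MB = 1397.3
RDD_SIZE_MB = 5 * 1024

executors_needed = math.ceil(RDD_SIZE_MB / PER_EXECUTOR_STORAGE_MB)
print(executors_needed)  # -> 4
```

Note this is only a lower bound: with MEMORY_ONLY, partitions that do not fit are dropped and recomputed on demand rather than spilled to disk.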

[Spark] RDDs are not persisting in memory

2016-10-10 Thread diplomatic Guru
Hello team, Spark version: 1.6.0 I'm trying to persist some data in memory for reuse. However, when I call rdd.cache() or rdd.persist(StorageLevel.MEMORY_ONLY()), it does not store the data, as I cannot see any RDD information under the Web UI (Storage tab). Therefore I tried