Re: Caching small Rdd's take really long time and Spark seems frozen

2018-08-23 Thread Guillermo Ortiz
It's a complex DAG before the point where I cache the RDD: there are some joins, filters and maps before caching the data, but most of the time it takes almost no time to do it. I could understand it if it took the same time every time to process or cache the data. Besides it seems random and they

Re: Caching small Rdd's take really long time and Spark seems frozen

2018-08-23 Thread Sonal Goyal
How are these small RDDs created? Could the blockage be in computing them rather than in caching them? Thanks, Sonal Nube Technologies On Thu, Aug 23, 2018 at 6:38 PM, Guillermo Ortiz wrote: > I use spark with caching with

Caching small Rdd's take really long time and Spark seems frozen

2018-08-23 Thread Guillermo Ortiz
I use Spark with caching via the persist method. I have several RDDs that I cache, but some of them are pretty small (about 300 KB). Most of the time it works well and the whole job usually takes about 1s, but sometimes it takes about 40s to store 300 KB in the cache. If I go to the SparkUI->Cache, I can see

Re: How to deal with context dependent computing?

2018-08-23 Thread Sonal Goyal
Hi Junfeng, Can you please show by means of an example what you are trying to achieve? Thanks, Sonal Nube Technologies On Thu, Aug 23, 2018 at 8:22 AM, JF Chen wrote: > For example, I have some data with timestamp marked as