Re: Not fully cached when there is enough memory

2014-06-12 Thread Daniel Siegmann
I too have seen cached RDDs not hit 100%, even when they are DISK_ONLY.
Just saw that yesterday in fact. In some cases RDDs I expected didn't show
up in the list at all. I have no idea if this is an issue with Spark or
something I'm not understanding about how persist works (probably the
latter).

If I figure out the reason for this I'll let you know.


On Wed, Jun 11, 2014 at 8:54 PM, Shuo Xiang shuoxiang...@gmail.com wrote:

 Xiangrui, clicking into the RDD link gives the same message, saying only
 96 of 100 partitions are cached. The disk/memory usage is the same, which
 is far below the limit.
 Is this what you wanted to check, or is it a different issue?


 On Wed, Jun 11, 2014 at 4:38 PM, Xiangrui Meng men...@gmail.com wrote:

 Could you try to click on that RDD and see the storage info per
 partition? I tried continuously caching RDDs, so new ones kick old
 ones out when there is not enough memory. I saw similar glitches but
 the storage info per partition is correct. If you find a way to
 reproduce this error, please create a JIRA. Thanks! -Xiangrui






Re: Not fully cached when there is enough memory

2014-06-11 Thread Xiangrui Meng
Could you try to click on that RDD and see the storage info per
partition? I tried continuously caching RDDs, so new ones kick old
ones out when there is not enough memory. I saw similar glitches but
the storage info per partition is correct. If you find a way to
reproduce this error, please create a JIRA. Thanks! -Xiangrui
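
For readers following the thread, the eviction behaviour described above (new cached RDDs kicking old ones out when memory runs short) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyBlockStore, its capacity, and the block sizes are hypothetical names and numbers made up for illustration, and the simple least-recently-used policy here is not Spark's actual storage code.

```python
from collections import OrderedDict


class ToyBlockStore:
    """Toy LRU store for cached partitions (hypothetical, not Spark's code)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        # Maps (rdd_id, partition) -> size, ordered oldest first.
        self.blocks = OrderedDict()

    def put(self, rdd_id, partition, size):
        if size > self.capacity:
            return False  # this block can never fit
        key = (rdd_id, partition)
        if key in self.blocks:
            self.used -= self.blocks.pop(key)
        # Evict the oldest blocks until the new one fits.
        while self.used + size > self.capacity:
            _, evicted_size = self.blocks.popitem(last=False)
            self.used -= evicted_size
        self.blocks[key] = size
        self.used += size
        return True

    def cached_partitions(self, rdd_id):
        return sum(1 for (rid, _) in self.blocks if rid == rdd_id)


# Cache two 100-partition RDDs into a store that holds only 150 blocks.
store = ToyBlockStore(capacity_bytes=150)
for rdd_id in ("rdd_a", "rdd_b"):
    for partition in range(100):
        store.put(rdd_id, partition, size=1)

for rdd_id in ("rdd_a", "rdd_b"):
    print(rdd_id, store.cached_partitions(rdd_id), "of 100 partitions cached")
```

In this model the older RDD ends up only partly cached once the store fills up. It covers only the not-enough-memory case; it does not explain partial caching when usage is far below the limit, which is the situation reported in this thread.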


Re: Not fully cached when there is enough memory

2014-06-11 Thread Shuo Xiang
Xiangrui, clicking into the RDD link gives the same message, saying only
96 of 100 partitions are cached. The disk/memory usage is the same, which
is far below the limit.
Is this what you wanted to check, or is it a different issue?


On Wed, Jun 11, 2014 at 4:38 PM, Xiangrui Meng men...@gmail.com wrote:

 Could you try to click on that RDD and see the storage info per
 partition? I tried continuously caching RDDs, so new ones kick old
 ones out when there is not enough memory. I saw similar glitches but
 the storage info per partition is correct. If you find a way to
 reproduce this error, please create a JIRA. Thanks! -Xiangrui