RE: Question about RDD cache, unpersist, materialization

2014-06-12 Thread innowireless TaeYun Kim
) compute_that_rdd; do_actual_unpersist(); } From: Daniel Siegmann [mailto:daniel.siegm...@velos.io] Sent: Friday, June 13, 2014 5:38 AM To: user@spark.apache.org Subject: Re: Question about RDD cache, unpersist, materialization I've run into this issue. The goal of caching

Re: Question about RDD cache, unpersist, materialization

2014-06-12 Thread Nicholas Chammas
(); } *From:* Daniel Siegmann [mailto:daniel.siegm...@velos.io] *Sent:* Friday, June 13, 2014 5:38 AM *To:* user@spark.apache.org *Subject:* Re: Question about RDD cache, unpersist, materialization I've run into this issue. The goal of caching / persist seems to be to avoid recomputing an RDD when

RE: Question about RDD cache, unpersist, materialization

2014-06-12 Thread innowireless TaeYun Kim
; do_actual_unpersist(); } From: Daniel Siegmann [mailto:daniel.siegm...@velos.io] Sent: Friday, June 13, 2014 5:38 AM To: user@spark.apache.org Subject: Re: Question about RDD cache, unpersist, materialization I've run into this issue. The goal of caching / persist seems to be to avoid recomputing

RE: Question about RDD cache, unpersist, materialization

2014-06-12 Thread innowireless TaeYun Kim
(I've clarified the statement (1) of my previous mail. See below.) From: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr] Sent: Friday, June 13, 2014 10:05 AM To: user@spark.apache.org Subject: RE: Question about RDD cache, unpersist, materialization Currently I use

RE: Question about RDD cache, unpersist, materialization

2014-06-11 Thread Nick Pentreath
If you want to force materialization, use .count(). Also, if you can, simply don't unpersist anything unless you really need to free the memory. -- Sent from Mailbox On Wed, Jun 11, 2014 at 5:13 AM, innowireless TaeYun Kim taeyun@innowireless.co.kr wrote: BTW, it is possible that rdd.first()

RE: Question about RDD cache, unpersist, materialization

2014-06-10 Thread innowireless TaeYun Kim
BTW, it is possible that rdd.first() does not compute all of the partitions. So, first() cannot be used for the situation below. -Original Message- From: innowireless TaeYun Kim [mailto:taeyun@innowireless.co.kr] Sent: Wednesday, June 11, 2014 11:40 AM To: user@spark.apache.org
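
The point raised in these messages (first() may stop before every partition is computed, while count() has to visit all of them) can be illustrated with a small toy model. This is only a sketch in plain Python and not Spark's API: ToyRDD, make_rdd, computed, and is_fully_cached are hypothetical names invented for the illustration, and the "cache" here is just a dictionary.

```python
from typing import Callable, Dict, List


class ToyRDD:
    """Toy stand-in for a lazily computed, partitioned dataset (not Spark's API)."""

    def __init__(self, partitions: List[Callable[[], List[int]]]):
        self._partitions = partitions           # one zero-argument function per partition
        self._cache: Dict[int, List[int]] = {}  # partition index -> computed rows
        self.computed: List[int] = []           # indices actually computed, in order

    def _compute(self, index: int) -> List[int]:
        # Compute a partition at most once and remember the result (the "cache").
        if index not in self._cache:
            self._cache[index] = self._partitions[index]()
            self.computed.append(index)
        return self._cache[index]

    def first(self) -> int:
        # Scan partitions in order and stop as soon as one element is found.
        for index in range(len(self._partitions)):
            rows = self._compute(index)
            if rows:
                return rows[0]
        raise ValueError("empty dataset")

    def count(self) -> int:
        # Counting must visit every partition, so all of them get computed.
        return sum(len(self._compute(i)) for i in range(len(self._partitions)))

    def is_fully_cached(self) -> bool:
        return len(self._cache) == len(self._partitions)


def make_rdd() -> ToyRDD:
    # Three hypothetical partitions with fixed contents.
    return ToyRDD([lambda: [1, 2], lambda: [3, 4], lambda: [5]])


if __name__ == "__main__":
    rdd = make_rdd()
    rdd.first()
    print("after first():", rdd.computed, "fully cached:", rdd.is_fully_cached())
    rdd.count()
    print("after count():", rdd.computed, "fully cached:", rdd.is_fully_cached())
```

In this model, calling first() leaves the later partitions uncomputed, which is why the thread suggests count() when the goal is to make sure the whole dataset is materialized before unpersisting its parent.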