Sean,
Thanks.
setFinalRDDStorageLevel is a developer API and doesn't appear to be exposed in Python.
Ewan
On 07/12/15 15:06, Sean Owen wrote:
I'm not sure if this is available in Python but from 1.3 on you should
be able to call ALS.setFinalRDDStorageLevel with level "none" to ask
it to unpersist when it is done.
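
For reference, a minimal Scala sketch of the call described above. It assumes Spark 1.3 or later with spark-mllib on the classpath; the object name, the helper name, and the rank/iterations/lambda values are hypothetical placeholders, not taken from this thread. setFinalRDDStorageLevel is an instance method on the ALS builder, and as far as I can tell StorageLevel.NONE means the final factor RDDs are simply not kept cached after training.

```scala
import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object AlsWithoutCachedFactors {
  // Hypothetical helper: the hyperparameter values below are placeholders.
  def train(ratings: RDD[Rating]): MatrixFactorizationModel =
    new ALS()
      .setRank(10)
      .setIterations(10)
      .setLambda(0.01)
      // Developer API (Spark 1.3+): ask ALS not to persist the final
      // user/product factor RDDs once training is done.
      .setFinalRDDStorageLevel(StorageLevel.NONE)
      .run(ratings)
}
```

Usage would be something like AlsWithoutCachedFactors.train(ratings_train) from a Scala job or shell. This only covers the final factor RDDs; whether it helps with the disk spill reported in this thread is not something the sketch demonstrates.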
On Mon, Dec 7, 2015 at 1:42 PM, Ewan Higgs wrote:
Jonathan,
Did you ever get to the bottom of this? I have some users working with
Spark in a classroom setting and our example notebooks run into problems
where there is so much spilled to disk that they run out of quota. A
1.5G input set becomes >30G of spilled data on disk. I looked into how
you,
Ilya Ganelin
-Original Message-
From: Stahlman, Jonathan [jonathan.stahl...@capitalone.com]
Sent: Wednesday, July 22, 2015 01:42 PM Eastern Standard Time
To: user@spark.apache.org
Subject: Re: How to unpersist RDDs generated by ALS/MatrixFactorizationModel
Hello again
Hi Jonathan,
I believe calling persist with StorageLevel.NONE doesn't do anything.
That's why the unpersist has an if statement before it.
Could you give more information about your setup please? Number of cores,
memory, number of partitions of ratings_train?
Thanks,
Burak
On Wed, Jul 22, 2015
Hello again,
In trying to understand the caching of intermediate RDDs by ALS, I looked into
the source code and found what may be a bug. Looking here:
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/recommendation/ALS.scala#L230
you see that ALS.train()