Hello; I am trying to find the optimal number of factors (rank) for ALS. To that end, I am scanning various rank values and evaluating the MSE. Do I need to un-persist the RDDs between loop iterations, or will the resources (memory) be automatically freed and reassigned between iterations?
for i in range(5):
    rank = 5 + i
    # imodel = ALS.trainImplicit(smallratings, rank, numIterations)
    imodel = ALS.train(smallratings, rank, numIterations)
    predictions = imodel.predictAll(testdata).map(lambda r: ((r[0], r[1]), r[2]))
    ratesAndPreds = smallratings.map(lambda r: ((r[0], r[1]), r[2])).join(predictions)
    MSE = ratesAndPreds.map(lambda r: (r[1][0] - r[1][1])**2).reduce(lambda x, y: x + y) / ratesAndPreds.count()
    print "rank", rank, "MSE", MSE
    predictions.unpersist()
    ratesAndPreds.unpersist()

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Un-persist-RDD-in-a-loop-tp23414.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
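For clarity, the MSE computed inside the loop is just the mean of the squared differences between actual and predicted ratings over the joined pairs. A minimal sketch of that same calculation without Spark, using hypothetical sample data shaped like ratesAndPreds (each element is ((user, item), (actual, predicted))):

```python
# Hypothetical joined pairs mimicking the structure of ratesAndPreds above:
# ((user, item), (actual_rating, predicted_rating))
rates_and_preds = [
    ((1, 10), (4.0, 3.5)),
    ((1, 20), (2.0, 2.5)),
    ((2, 10), (5.0, 4.0)),
]

# Same computation as the RDD version: sum of squared errors divided by count.
mse = sum((actual - pred) ** 2
          for _, (actual, pred) in rates_and_preds) / len(rates_and_preds)
print(mse)
```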