On Mar 14, 2014, at 5:52 PM, Michael Allman <m...@allman.ms> wrote:

> I also found that the product and user RDDs were being rebuilt many times
> over in my tests, even for tiny data sets. By persisting the RDD returned
> from updateFeatures() I was able to avoid a raft of duplicate computations.
> Is there a reason not to do this?

This sounds like a good thing to add, though I’d like to understand why these
are being recomputed (it seemed that the code would only use each one once). Do
you have any sense of why that is?

Matei
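
For readers following the thread, here is a minimal sketch of the general effect being discussed. It is a toy Python model, not Spark code and not the ALS code path: a lazily evaluated dataset is rebuilt from its lineage every time a downstream action needs it, unless it has been persisted. The names LazyDataset, run, and counter are hypothetical; the mapped dataset only stands in for something like the RDD returned from updateFeatures(). Whether this is what actually triggers the recomputation in ALS is the open question above.

```python
# Toy model of lazy, lineage-based evaluation. This is NOT Spark's API;
# all names here are hypothetical and chosen for illustration only.
class LazyDataset:
    """Recomputed from its lineage on every action unless persisted."""

    def __init__(self, compute):
        self._compute = compute   # zero-argument function that produces a list
        self._persist = False
        self._cached = None

    def map(self, fn, counter=None):
        parent = self

        def compute():
            if counter is not None:
                counter["calls"] += 1   # count how often this stage really runs
            return [fn(x) for x in parent.collect()]

        return LazyDataset(compute)

    def persist(self):
        self._persist = True
        return self

    def collect(self):
        if self._persist:
            if self._cached is None:
                self._cached = self._compute()
            return self._cached
        return self._compute()


def run(persist):
    counter = {"calls": 0}
    base = LazyDataset(lambda: [1, 2, 3])
    # Stands in for an expensive feature-update step.
    features = base.map(lambda x: x * 2, counter)
    if persist:
        features.persist()
    # Two downstream uses of the same dataset, as in an iterative algorithm.
    total = sum(features.collect())
    largest = max(features.map(lambda x: x + 1).collect())
    return counter["calls"], total, largest


unpersisted = run(persist=False)
persisted = run(persist=True)
print(unpersisted, persisted)
```

In this model the update step runs twice without persist() and once with it, while the results are identical. A dataset that looks like it is used only once in the source can still be evaluated more than once if more than one action depends on it.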
