On 19.06.2013 01:29, Ted Dunning wrote:
> On Tue, Jun 18, 2013 at 11:01 PM, Sebastian Schelter <s...@apache.org> wrote:
>
>> We could also move the sampling directly to RowSimilarityJob if people
>> consider this more useful.
>
> It will have a large effect on the time for the RowSimilarityJob for some
> data.

I put the sampling into PreparePreferenceMatrixJob because I considered
it to be use-case specific for recommendations.

> Does anybody have an idea about how much of the total time is in
> RowSimilarityJob?

What do you mean by total time? Compared to the rest of the jobs in
ItemSimilarityJob and RecommenderJob?

-sebastian