It depends on how large a subset of the data you are using to evaluate,
and how much of it you are using for testing versus training. Yes, that
kind of range is undesirable. How are you running the evaluation?

On Tue, Mar 9, 2010 at 4:50 PM,  <[email protected]> wrote:
> Hello,
>
> When testing the Mahout example BookCrossingRecommender with default settings
> (GenericUserBasedRecommender, PearsonCorrelationSimilarity,
> NearestNUserNeighborhood), I noticed that the results of the evaluation
> (AverageAbsoluteDifferenceRecommenderEvaluator) change randomly from one
> test to another. I get scores between 2.1 and 4.8.
>
> Given the size of the input (about 100000 users and 100000 books), I can't
> imagine that the randomness in the algorithms alone could lead to evaluation
> differences that large.
>
> What do you think?
>
