I see the same variance, but I believe it's due to the small input size.
At the moment the example uses only 5% of the total input, or about 50,000
ratings over 5,000 users, which is fairly small. On top of that, it
looks at only 5% of those users to form neighborhoods. Both figures are
just too low. I have increased the amount of data the evaluation
uses in a few ways, and now get much more stable results.
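To illustrate why the small hold-out set alone explains the variance, here is a toy simulation (not Mahout code; the error values and fractions are made up) showing that a score averaged over a small random slice of the ratings swings far more from run to run than one averaged over a larger slice:

```python
import random

def evaluate_once(errors, fraction, rng):
    """Score one hold-out run: the mean of a random 'fraction'
    of the per-rating prediction errors."""
    k = max(1, int(len(errors) * fraction))
    sample = rng.sample(errors, k)
    return sum(sample) / k

def score_spread(errors, fraction, runs=20, seed=1):
    """Max minus min score across repeated evaluation runs."""
    rng = random.Random(seed)
    scores = [evaluate_once(errors, fraction, rng) for _ in range(runs)]
    return max(scores) - min(scores)

# Toy stand-in for per-rating prediction errors on a 0-10 scale.
data_rng = random.Random(42)
errors = [data_rng.uniform(0, 8) for _ in range(2000)]

# Evaluating on 2% of the ratings vs. 80%: the larger hold-out
# set gives a much tighter run-to-run range of scores.
print(score_spread(errors, 0.02), score_spread(errors, 0.80))
```

The same intuition carries over to the evaluator here: raising the fraction of data it holds out shrinks the run-to-run spread roughly with the square root of the sample size.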

I also switched the algorithm it uses, since the average difference
was 4 out of 10, which is pretty poor. With more research one could
probably pick the optimal algorithm, but for now I just picked
something that worked a little better (average difference under 3).
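For anyone unsure what that score means: the "average difference" here is the mean absolute gap between predicted and actual preference values, so on a 1-to-10 rating scale a score of 4 means predictions are off by 4 points on average. A minimal sketch of the metric (the rating values below are invented for illustration):

```python
def average_absolute_difference(predicted, actual):
    """Mean |predicted - actual| preference value; lower is better."""
    assert len(predicted) == len(actual) and predicted
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

# Predictions off by 3, 5, and 4 points -> score 4.0
print(average_absolute_difference([7.0, 2.0, 9.0], [4.0, 7.0, 5.0]))
```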

On Tue, Mar 9, 2010 at 6:30 PM, Sean Owen <[email protected]> wrote:
> I see, that definitely doesn't sound right. Let me run it myself
> tonight when I am home and see what I observe.
>
> On Tue, Mar 9, 2010 at 5:40 PM,  <[email protected]> wrote:
>> I did not change anything from the example provided in mahout-example,
>> development version. It uses 5% for evaluation, which is 5,000 instances. With
>> a test set of that size, the range should not be that big. I suspect that there
>> is something wrong somewhere.
>