Hi Serega:

    We have also tried the mahout 0.9 RecommenderJob and found that the
results are not good either. We are now debugging the source code to
find the possible issues. How is the output of mahout 0.7? We will
switch to that version if the results are acceptable. Thanks.
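
As a rough sanity check on the counters quoted below, here is a small sketch
that compares how many observations each version actually keeps. The variable
and function names are ours, and it assumes that "used" plus "neglected" adds
up to the total number of input preferences in each run, which we have not
verified in the source.

```python
# Counter values copied from the job output quoted in this thread.
counters = {
    "0.7": {"used": 32_355_513, "neglected": 14_954_083, "cooccurrences": 72_503_210},
    "0.9": {"used": 10_840_138, "neglected": 39_175_989, "cooccurrences": 17_645_029},
}


def used_fraction(c):
    """Share of observations kept, ASSUMING used + neglected is the total."""
    return c["used"] / (c["used"] + c["neglected"])


fractions = {version: used_fraction(c) for version, c in counters.items()}

for version, frac in fractions.items():
    print(f"mahout {version}: {frac:.1%} of observations used, "
          f"{counters[version]['cooccurrences']:,} cooccurrences")
```

Under that assumption, 0.7 keeps roughly 68% of the observations and 0.9 only
about 22%, so the down-sampling setting (maxPrefsPerUserInItemSimilarity in
0.7, maxPrefsInItemSimilarity in 0.9) may be the first thing worth checking.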

Best
Wei

On Tue, Nov 4, 2014 at 8:00 PM, Serega Sheypak <serega.shey...@gmail.com>
wrote:

> Hi, I used org.apache.mahout.cf.taste.hadoop.item.RecommenderJob in mahout
> 0.7 (CDH4).
> Here are the parameters:
> numRecommendations=1000
> threshold=0.91
> maxSimilaritiesPerItem=1000
> maxPrefsPerUserInItemSimilarity=10
> similarityClassname=SIMILARITY_LOGLIKELIHOOD
>
> Then I migrated to 0.9 (CDH5).
> I've found one difference:
> maxPrefsPerUserInItemSimilarity was renamed to maxPrefsInItemSimilarity.
>
> The other difference is in how it behaves.
> I see this output in 0.7:
>
> USER_RATINGS_NEGLECTED=14954083
>
> USER_RATINGS_USED=32355513
>
> =====
>
> COOCCURRENCES=72 503 210
>
> PRUNED_COOCCURRENCES=0
>
>
> Output in 0.9:
>
> NEGLECTED_OBSERVATIONS=39 175 989
>
> ROWS=4 937 362
>
> USED_OBSERVATIONS=10 840 138
>
> =====
>
>
> org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
> COOCCURRENCES=17 645 029
>
> PRUNED_COOCCURRENCES=0
>
>
> And 0.9 gives me awful results, just trash.
>
> I ran both over the same dataset:
>
> mahout 0.7 is on the old production CDH4 cluster,
>
> mahout 0.9 is on the new CDH5 cluster.
>
> Why is there such a huge difference? Is there any way to fix it?
>
