Hi,

I used org.apache.mahout.cf.taste.hadoop.item.RecommenderJob in Mahout 0.7 (CDH4) with these parameters:

    numRecommendations=1000
    threshold=0.91
    maxSimilaritiesPerItem=1000
    maxPrefsPerUserInItemSimilarity=10
    similarityClassname=SIMILARITY_LOGLIKELIHOOD

Then I migrated to 0.9 (CDH5). I have found one difference in the options: maxPrefsPerUserInItemSimilarity was renamed to maxPrefsInItemSimilarity. The other difference is in how the job behaves.

Output in 0.7:

    USER_RATINGS_NEGLECTED=14954083
    USER_RATINGS_USED=32355513
    =====
    COOCCURRENCES=72503210
    PRUNED_COOCCURRENCES=0

Output in 0.9:

    NEGLECTED_OBSERVATIONS=39175989
    ROWS=4937362
    USED_OBSERVATIONS=10840138
    =====
    org.apache.mahout.math.hadoop.similarity.cooccurrence.RowSimilarityJob$Counters
    COOCCURRENCES=17645029
    PRUNED_COOCCURRENCES=0

The recommendations from 0.9 are awful, essentially unusable. Both runs use the same dataset: Mahout 0.7 runs on the old production CDH4 cluster, and Mahout 0.9 runs on the new CDH5 cluster.

Why is there such a huge difference? Is there any way to fix it?
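
To put numbers on the gap, here is a rough sketch that uses only the counter values quoted above. The dictionary keys and the helper name used_fraction are my own labels, not Mahout names, and I am assuming the "used" and "neglected" counters together account for all input observations in each version.

```python
# Counter values copied from the job output quoted above.
# 0.7: USER_RATINGS_NEGLECTED / USER_RATINGS_USED
# 0.9: NEGLECTED_OBSERVATIONS / USED_OBSERVATIONS
counters_07 = {"neglected": 14954083, "used": 32355513, "cooccurrences": 72503210}
counters_09 = {"neglected": 39175989, "used": 10840138, "cooccurrences": 17645029}


def used_fraction(counters):
    """Share of observations that were used rather than neglected.

    Hypothetical helper; assumes used + neglected is the full total.
    """
    total = counters["neglected"] + counters["used"]
    return counters["used"] / total


for version, c in (("0.7", counters_07), ("0.9", counters_09)):
    total = c["neglected"] + c["used"]
    print(
        f"{version}: total={total} "
        f"used={used_fraction(c):.1%} "
        f"cooccurrences={c['cooccurrences']}"
    )
```

So 0.7 uses roughly two thirds of the observations while 0.9 uses only about a fifth, and the cooccurrence count in 0.9 is about a quarter of the 0.7 value. The totals also differ slightly between the two runs, which I cannot explain.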