Hi Brian,

The parameter "maxPrefsPerUserInItemSimilarity" is in RecommenderJob. Judging from the text of its comment, it is the same as the parameter "maxPrefsPerUser" in ItemSimilarityJob.
The second question is not easy to answer. It depends on your recommendation scenario and on the characteristics of your input data. The most important factor is the quality of your data (for example, the accuracy of the preference values), not these parameters. These parameters mostly affect the performance of the similarity calculation.

Thanks.

2013/9/12 Brian Arnold <barnold4...@gmail.com>

> Hi,
>
> Thank you for the response! What you said makes sense. Here is a link to
> the other property:
>
> http://grepcode.com/file/repo1.maven.org/maven2/org.apache.mahout/mahout-core/0.6/org/apache/mahout/cf/taste/hadoop/item/RecommenderJob.java#RecommenderJob.0DEFAULT_MAX_SIMILARITIES_PER_ITEM
>
> Supposing I have a sufficiently large cluster to process the data, would
> increasing the values necessarily give me a better recommendation? Which
> do you feel would have the largest impact on the quality of the
> recommendation?
>
> Brian
>
>
> On Thu, Sep 12, 2013 at 7:05 AM, 林伟 <linwe...@gmail.com> wrote:
>
> > Hi Brian & Miliauskas,
> >
> > I am a data mining engineer from the Taobao recommendation team. Over
> > the past month, I have read all the code of Mahout itemCF,
> > so maybe I can answer this question.
> >
> > We consider the input of itemCF for one user to be an item vector, like
> > this (the notation is from the JSON object model):
> > <userid, [ {item1, pref(u, i1)}, {item2, pref(u, i2)}, ..... {itemN,
> > pref(u, iN)} ]>
> > So maxPrefsPerUser means the maximum length of the item vector. If a
> > user has preferences for more items than this number, sampling is
> > applied to enforce the limit.
> >
> > We also consider the output of itemCF for one item to be a similarity
> > vector, like this:
> > <item1, [ {item2, sim(2,1)}, {item3, sim(3,1)}, .... {itemK, sim(K,1)} ]>
> > So maxSimilaritiesPerItem means the maximum length of the similarity
> > vector. If item1 has more similar items than this number, Mahout outputs
> > only the top 'maxSimilaritiesPerItem' items.
> >
> > For the parameter 'maxPrefsPerUserItemSimilarity', I haven't found it.
> > Can you give me a link to it?
> >
> > Thanks
> >
> >
> > 2013/9/12 Darius Miliauskas <dariui.miliaus...@gmail.com>
> >
> > > Hi, Brian,
> > >
> > > this question is also relevant for me. Perhaps somebody will give more
> > > details, because I am just learning myself. But I guess you can try
> > > changing the parameters, check the performance, and write about it
> > > here so that everybody gains more knowledge!
> > >
> > > In general, if these values are lower, the job should run faster,
> > > because Mahout is based on some Hadoop algorithms. I think it could
> > > help if you try the algorithms with several pieces of data and look at
> > > whether you are missing some important recommendations. Let's say you
> > > choose "maxSimilaritiesPerItem" as 4 and you miss some
> > > recommendations; then you should increase the value. It is a balance
> > > between performance and better results, and you should find that
> > > balance. I hope you will share more details about what you find out,
> > > because I noticed that here (on the Mahout mailing list) everybody is
> > > asking, but only a few are replying and sharing.
> > >
> > >
> > > Thanks,
> > >
> > > Darius
> > >
> > >
> > > 2013/9/12 Brian Arnold <barnold4...@gmail.com>
> > >
> > > > Hi,
> > > >
> > > > I am currently trying to run the distributed Item Based Collaborative
> > > > filtering algorithm on our Hadoop cluster, and I have a few questions
> > > > regarding tweaking the various properties of the algorithm. For the
> > > > maxPrefsPerUser, maxSimilaritiesPerItem, and
> > > > maxPrefsPerUserItemSimilarity properties I was wondering if I could
> > > > get a more detailed explanation of what these properties control.
> > > > I saw the description in the code, but I am
> > > > just wondering how changing these values will affect the results of the
> > > > algorithm, and will increasing them result in a better recommendation.
> > > >
> > > > Thanks
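
For anyone reading this thread later, here is a rough Python sketch of the two caps as they are described above: maxPrefsPerUser limits the length of a user's item vector by sampling, and maxSimilaritiesPerItem keeps only the top entries of an item's similarity vector. This is not Mahout's code. The function names, the seeded uniform sampling, and the toy data are all assumptions made for illustration, and Mahout's actual sampling strategy may differ.

```python
import heapq
import random


def cap_user_prefs(prefs, max_prefs_per_user, seed=0):
    """Keep at most max_prefs_per_user (item, value) pairs.

    Hypothetical helper: uniform random sampling stands in for whatever
    sampling Mahout really applies when a user has too many preferences.
    """
    if len(prefs) <= max_prefs_per_user:
        return list(prefs)
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return rng.sample(list(prefs), max_prefs_per_user)


def top_similarities(sims, max_similarities_per_item):
    """Keep the max_similarities_per_item highest-scoring (item, score) pairs."""
    return heapq.nlargest(max_similarities_per_item, sims, key=lambda p: p[1])


# Toy item vector for one user: <userid, [{item, pref}, ...]>
user_vector = [("item1", 5.0), ("item2", 3.0), ("item3", 4.0), ("item4", 1.0)]
capped = cap_user_prefs(user_vector, 2)

# Toy similarity vector for item1: <item1, [{item, sim}, ...]>
sim_vector = [("item2", 0.9), ("item3", 0.4), ("item4", 0.7)]
top = top_similarities(sim_vector, 2)
```

With these toy inputs, `capped` holds two of the four preferences and `top` keeps item2 and item4, the two highest scores. Raising either cap keeps more data in the computation at the cost of more work per user or per item, which is the performance trade-off discussed in the thread.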