Hi Sean, thanks for the tips. Will give it a go tomorrow.
Paul.

On Mon, Apr 27, 2009 at 10:17 PM, Sean Owen <[email protected]> wrote:

> Yeah the problem here is that all the ratings are '1', and a
> correlation-based similarity metric like Pearson will return a "NaN"
> for the similarity between all users as a result.
>
> You want to take advantage of the situation by using the bits of code
> that assume you are in this situation, where all the ratings are the
> same or 1 or don't matter. Support for this mode is still a bit
> evolving, but basically you want to:
>
> - Use BooleanTanimotoCoefficientSimilarity instead of Pearson.
> - Omit the ",1" in the data file -- in fact you need to to get this to
>   work.
> - Also separately I might generally discourage people from trying
>   PreferenceInferrer unless you know you need or want it; I don't really
>   like this technique. In fact for the similarity implementation above
>   it won't be supported. So just remove that line.
>
> If any problems come up write back, might have missed a detail there.
>
> 2009/4/27 Paul Loy <[email protected]>:
> > Hi,
> >
> > I want to create recommendations for my customers based on boolean data.
> > Essentially whether they bought a product.
> >
> > So this will create a csv containing:
> >
> > acctId, itemId, 1
> >
> > There is an entry in the CSV for each sale. So all entries will have a
> > 'rating' of 1. Using the following example:
> >
> >     DataModel model = new FileDataModel(new File("data.txt"));
> >
> >     PearsonCorrelationSimilarity userSimilarity = new
> >         PearsonCorrelationSimilarity(model);
> >     userSimilarity.setPreferenceInferrer(new
> >         AveragingPreferenceInferrer(model));
> >
> >     UserNeighborhood neighborhood =
> >         new NearestNUserNeighborhood(1, userSimilarity, model);
> >
> >     Recommender recommender =
> >         new GenericUserBasedRecommender(model, neighborhood,
> >             userSimilarity);
> >     Recommender cachingRecommender = new
> >         CachingRecommender(recommender);
> >
> >     List<RecommendedItem> recommendations =
> >         cachingRecommender.recommend("1967128", 10);
> >
> >     for (RecommendedItem item : recommendations) {
> >         System.out.println(item);
> >     }
> >
> > I get 0 recommendations even when I have seeded the file with obvious
> > correlations. I'm guessing this is because all 'ratings' are 1. Is there
> > any way to infer that all other items have a rating of 0, thus giving the
> > algorithms something to correlate?
> >
> > Thanks,
> >
> > Paul
> >
> > --
> > ---------------------------------------------
> > Paul Loy
> > [email protected]
> > http://www.keteracel.com/paul
>

--
---------------------------------------------
Paul Loy
[email protected]
http://www.keteracel.com/paul
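
For anyone following the thread, here is a minimal, library-free sketch of the point Sean makes about all-'1' ratings. It does not use the recommender library's own classes: `BooleanSimilarityDemo`, `pearson` and `tanimoto` are hypothetical names made up for illustration, and the library's real implementations may differ in detail. The sketch only assumes the textbook definitions: Pearson correlation divides by the standard deviations (which are zero when every rating is 1), while the Tanimoto coefficient compares the sets of purchased items as |A ∩ B| / |A ∪ B|.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; not part of any recommender library.
final class BooleanSimilarityDemo {

    // Textbook Pearson correlation over two equal-length rating arrays.
    // With constant ratings both variances are 0, so the result is 0.0 / 0.0 = NaN.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    // Tanimoto (Jaccard) coefficient on the sets of items each user bought.
    // Returning 0.0 for two empty sets is a choice made for this sketch.
    static double tanimoto(Set<String> a, Set<String> b) {
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        int unionSize = a.size() + b.size() - intersection.size();
        return unionSize == 0 ? 0.0 : (double) intersection.size() / unionSize;
    }

    public static void main(String[] args) {
        // Two users whose co-purchased items all carry the rating 1.
        double[] allOnes = {1.0, 1.0, 1.0};
        System.out.println(pearson(allOnes, allOnes)); // prints NaN

        // The same kind of data treated as boolean "bought / did not buy" sets.
        Set<String> user1 = new HashSet<>(Arrays.asList("A", "B", "C"));
        Set<String> user2 = new HashSet<>(Arrays.asList("B", "C", "D"));
        System.out.println(tanimoto(user1, user2)); // prints 0.5
    }
}
```

The set-based measure never looks at the rating value, which is consistent with the advice to drop the ",1" column from the data file.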
