On Fri, Apr 17, 2009 at 5:55 PM, Anthony Lymer <[email protected]> wrote:
> I'm planning to use the concept of collaborative filtering to recommend user
> interface elements to users.
> The users implicitly rate the user interfaces by choosing them. The rating
> scale probably will be binary (1 = chosen, 0 = rejected/unchosen).

Sounds OK. I would differentiate between 'rejected' and 'unchosen'. I would
use, say, 1 for chosen, 0 for rejected, and of course simply no value, no
preference at all, if it is unchosen.

> I'm considering a user-based approach, because I can use some additional
> data describing the user, which should be relevant to the choice of user
> interface elements. So the user similarity can be calculated as a
> combination of their ratings and the additional data.

Yes, you can easily implement UserSimilarity yourself to create whatever
notion of user-user similarity you like. This computation could combine the
conventional values computed by something like
PearsonCorrelationUserSimilarity with your own values, in some kind of
weighted average. It's entirely up to you. In this case a user-based
recommender is indeed appropriate.

> There are 30 user interface elements, which can be chosen independently of
> each other.
> In the literature I read, there are always a lot more items that are rated.
> (e.g. two million books)

Meh, there's no real right or wrong number of items to have. I can imagine
cases where the number of items, or the number of users, is much greater.
More data generally means better results, I guess.

> Are there any problems to use collaborative filtering with only 30 items?
> Are there too little items to calculate an accurate correlation between
> users?

I would try it and see what happens. It does not strike me as obviously too
low. To get a correlation between two users you need at least two items that
both of them have expressed a preference for... and to generalize extremely,
I'd say 10 or so items would give a fine estimate of the correlation.

I would also consider item-based recommenders and slope one. They would not
allow you to add your notion of user similarity, but in these algorithms
having few items is an advantage rather than a liability.

> Could the additional data about the users be weighted differently than the
> ratings of the user interface elements?

Sure, you can do whatever you like in the UserSimilarity implementation you write.
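
To make the weighted-average idea concrete, here is a rough sketch in plain Java. It deliberately does not implement Mahout's UserSimilarity interface or call any Mahout class; the class name BlendedUserSimilarity, its method names, the attribute-matching rule and the 0.7 weight are all made up for illustration. The assumption is that ratings are 1 for chosen, 0 for rejected, and simply absent for unchosen, as suggested above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Mahout.
final class BlendedUserSimilarity {

    // Pearson correlation over the items BOTH users have a value for.
    // Returns NaN when it is undefined: fewer than two shared items,
    // or one user gave the same value to every shared item.
    static double pearson(Map<String, Double> a, Map<String, Double> b) {
        int n = 0;
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue; // unchosen by the other user: no preference at all
            }
            double x = e.getValue();
            double y = other;
            n++;
            sumA += x;
            sumB += y;
            sumAA += x * x;
            sumBB += y * y;
            sumAB += x * y;
        }
        if (n < 2) {
            return Double.NaN;
        }
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN;
        }
        return cov / Math.sqrt(varA * varB);
    }

    // Assumed attribute similarity: share of matching attributes,
    // rescaled from [0, 1] to [-1, 1] to match the Pearson range.
    static double attributeSimilarity(String[] a, String[] b) {
        if (a.length == 0 || a.length != b.length) {
            throw new IllegalArgumentException("attribute arrays must be non-empty and equal in length");
        }
        int matches = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].equals(b[i])) {
                matches++;
            }
        }
        return 2.0 * matches / a.length - 1.0;
    }

    // Weighted average of the two similarities; ratingWeight is in [0, 1].
    // Falls back to the attribute similarity alone if Pearson is undefined.
    static double blended(double ratingSim, double attrSim, double ratingWeight) {
        if (Double.isNaN(ratingSim)) {
            return attrSim;
        }
        return ratingWeight * ratingSim + (1.0 - ratingWeight) * attrSim;
    }

    public static void main(String[] args) {
        Map<String, Double> u1 = new HashMap<>();
        u1.put("toolbar", 1.0);
        u1.put("sidebar", 0.0);
        u1.put("wizard", 1.0);
        Map<String, Double> u2 = new HashMap<>();
        u2.put("toolbar", 1.0);
        u2.put("sidebar", 0.0);
        u2.put("wizard", 1.0);
        u2.put("ribbon", 0.0);
        double ratingSim = pearson(u1, u2);
        double attrSim = attributeSimilarity(
                new String[] {"expert", "mobile"},
                new String[] {"expert", "desktop"});
        System.out.println(blended(ratingSim, attrSim, 0.7));
    }
}
```

Changing the weight is all it takes to make the additional user data count for more or less than the ratings. Note the fallback: with binary values, two users who share only "chosen" items have no variance, so the correlation is undefined and the sketch relies on the attribute data alone.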
