On Fri, Apr 17, 2009 at 5:55 PM, Anthony Lymer <[email protected]> wrote:
> I'm planning to use the concept of collaborative filtering to recommend user
> interface elements to users.
> The users implicitly rate the user interfaces by choosing them. The rating
> scale probably will be binary (1 = chosen, 0 = rejected/unchosen).

Sounds OK. I would differentiate between 'rejected' and 'unchosen':
use, say, 1 for chosen and 0 for rejected, and store no value at all
(no preference) when an element is simply unchosen.
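
A minimal sketch of that three-way distinction, in plain Python purely
for illustration (this is not the Taste API; the names `prefs` and
`preference` and the sample data are made up):

```python
# Hypothetical preference store: user -> {element_id: value}.
# 1.0 = chosen, 0.0 = explicitly rejected.
# An unchosen element simply has no entry at all.
prefs = {
    "alice": {"toolbar": 1.0, "sidebar": 0.0},
    "bob": {"toolbar": 1.0},
}


def preference(user, element):
    """Return 1.0, 0.0, or None when the user expressed no preference."""
    return prefs.get(user, {}).get(element)
```

Here `preference("bob", "sidebar")` is None rather than 0.0, so an
element bob never saw is not counted as a rejection.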


> I'm considering a user-based approach, because I can use some additional
> data describing the user, which should be relevant to the choice of user
> interface elements. So the user similarity can be calculated as a
> combination of their ratings and the additional data.

Yes, you can easily implement UserSimilarity yourself to create
whatever notion of user-user similarity you like. The computation
could combine the conventional value computed by something like
PearsonCorrelationUserSimilarity with your own values, for example in
a weighted average. It's entirely up to you.
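
To make the weighted-average idea concrete, here is a rough sketch in
plain Python (not the Taste API). The `pearson` function is a
hand-written stand-in for what PearsonCorrelationUserSimilarity would
give you, and `profile_similarity`, `combined_similarity` and the 0.7
weight are assumptions for illustration only:

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over items both users rated; None if undefined."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return None
    xs = [a[i] for i in common]
    ys = [b[i] for i in common]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        # One user gave identical ratings to every shared item.
        return None
    return cov / sqrt(vx * vy)


def profile_similarity(p, q):
    """Hypothetical: share of matching attribute values, scaled to [-1, 1]."""
    keys = set(p) & set(q)
    if not keys:
        return None
    matches = sum(1 for k in keys if p[k] == q[k])
    return 2.0 * matches / len(keys) - 1.0


def combined_similarity(ratings_a, ratings_b, profile_a, profile_b,
                        rating_weight=0.7):
    """Weighted average of rating and profile similarity."""
    r = pearson(ratings_a, ratings_b)
    p = profile_similarity(profile_a, profile_b)
    if r is None:
        return p  # fall back to whichever signal is available
    if p is None:
        return r
    return rating_weight * r + (1.0 - rating_weight) * p
```

Changing `rating_weight` is how you would weight the extra user data
differently from the ratings. Falling back to the other signal when one
is undefined is just one possible choice.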

In this case a user-based recommender is indeed appropriate.


> There are 30 user interface elements, which can be chosen independently of
> each other.
> In the literature I read, there are always a lot more items that are rated.
> (e.g. two million books)

Meh, there's no real right or wrong number of items to have. There
are certainly cases where the number of items, or of users, is much
greater, and more data generally means better results, I guess.


> Are there any problems to use collaborative filtering with only 30 items?
> Are there too little items to calculate an accurate correlation between
> users?

I would try it and see what happens. It does not strike me as
obviously too low. To get a correlation between two users you need at
least two items that both have expressed a preference for... and, as
a very rough rule of thumb, I'd say 10 or so such items would give a
fine estimate of the correlation.
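
One way to "try it and see" before building anything is to count how
many items each pair of users has in common. A small sketch in plain
Python, with made-up function names and assuming preferences are
stored as user -> {item: value}:

```python
from itertools import combinations


def overlap_counts(prefs):
    """Map each user pair to the number of items both have rated."""
    return {
        (u, v): len(set(prefs[u]) & set(prefs[v]))
        for u, v in combinations(sorted(prefs), 2)
    }


def pairs_below(prefs, min_overlap=2):
    """User pairs with too few shared items to compute a correlation."""
    return [
        pair
        for pair, count in overlap_counts(prefs).items()
        if count < min_overlap
    ]
```

Running `pairs_below` with `min_overlap=10` on real data would show how
often the rough 10-item figure is actually reached with only 30
elements.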

I would also consider item-based recommenders and slope one. They
would not let you add your own notion of user similarity, but in
these algorithms having few items is an advantage rather than a
liability.
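
For reference, the idea behind slope one fits in a few lines. This is
a bare-bones sketch of the weighted variant in plain Python, not the
Taste implementation; `slope_one_predict` and the data layout
(user -> {item: value}) are assumptions for illustration:

```python
def slope_one_predict(prefs, user, target):
    """Weighted Slope One estimate of user's value for target, or None."""
    numerator = 0.0
    denominator = 0
    for item, rating in prefs[user].items():
        if item == target:
            continue
        # Differences (target - item) over users who rated both items.
        diffs = [
            other[target] - other[item]
            for other in prefs.values()
            if target in other and item in other
        ]
        if diffs:
            avg_diff = sum(diffs) / len(diffs)
            # Weight each item's estimate by how many users support it.
            numerator += (rating + avg_diff) * len(diffs)
            denominator += len(diffs)
    return numerator / denominator if denominator else None
```

With 30 items there are at most 30 * 29 / 2 = 435 item pairs to keep
average differences for, which is why a small item count works in
favour of this kind of algorithm.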


> Could the additional data about the users be weighted differently than the
> ratings of the user interface elements?

Sure, you can do whatever you like in the UserSimilarity
implementation you write.
