This results in no information for universally preferred items, which
is indeed what I was looking for. It looks like this should also work
for other kinds of values or explicit preferences, such as item prices,
ratings, etc.
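
Just to make the idea concrete, here is a rough Python sketch of
computing such IDF weights from a binary user-item matrix (illustrative
names and toy data only, not any particular library's API). An item
that every user has ends up with weight zero:

import math

def idf_weights(user_items):
    """Weight each item by log(num_users / num_users_who_have_it)."""
    num_users = len(user_items)
    item_counts = {}
    for items in user_items.values():
        for item in items:
            item_counts[item] = item_counts.get(item, 0) + 1
    # An item that every user has gets log(1) = 0: it carries no information.
    return {item: math.log(num_users / count)
            for item, count in item_counts.items()}

# Toy binary user-item data (user -> set of item ids).
user_items = {
    "u1": {"bread", "beer", "vinyl"},
    "u2": {"bread", "beer"},
    "u3": {"bread", "jazz"},
}
print(idf_weights(user_items))
# "bread" appears for all three users -> weight 0.0; rarer items weigh more.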

Intuition says this will result in a lower precision-related
cross-validation measure, since you are discounting the obvious
recommendations. I have no experience with measuring something like
this, so any experience you have would be appreciated.


(this is just guesswork, so I could be terribly wrong)

In a non-IDF-weighted recommender, if you take out the top N% of items
(the items with the most occurrences in the user-item matrix), precision
will suffer badly, since the recommender will miss opportunities to
recommend "easy targets" (items with a high probability of occurring in
the test set).

In an IDF-weighted recommender, it could improve precision instead,
since you are taking out items that are highly likely to be in the test
set but were not going to be recommended in top positions anyway, due to
their strong IDF down-weighting. That would be a hint that the IDF
weight is working to suppress the "obvious" recommendations.
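
As a toy illustration of that suppression effect (again only a sketch,
with made-up names and data), multiplying co-occurrence votes by the
candidate item's IDF pushes the universally preferred item out of the
top positions:

import math
from collections import Counter

# Toy binary data: user -> set of item ids. "bread" is owned by everyone.
user_items = {
    "u1": {"bread", "beer", "vinyl"},
    "u2": {"bread", "beer", "jazz"},
    "u3": {"bread", "jazz", "vinyl"},
    "u4": {"bread", "beer"},
}

n_users = len(user_items)
counts = Counter(i for items in user_items.values() for i in items)
idf = {i: math.log(n_users / c) for i, c in counts.items()}  # "bread" -> 0.0

def recommend(seed_items, k=3, use_idf=True):
    """Vote for items co-occurring with the seed, optionally IDF-weighted."""
    scores = Counter()
    for items in user_items.values():
        if items & seed_items:              # this user shares something with the seed
            for i in items - seed_items:
                scores[i] += idf[i] if use_idf else 1.0
    return [i for i, _ in scores.most_common(k)]

print(recommend({"vinyl"}, use_idf=False))  # plain co-occurrence puts "bread" first
print(recommend({"vinyl"}, use_idf=True))   # IDF weighting drops "bread" to the bottom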

In this last case, precision would tend to go up as you keep removing a
bigger share of top items, until you reach a point of diminishing
returns, where the growing loss of user-item interaction data caused by
removing top-item interactions outweighs any advantage of taking them
out of the picture. That might be the point at which you decide pruning
top items is best, so you could use that percentage of pruned top items
as the place for your "canonical" precision value.
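
A rough sketch of what that sweep could look like, with synthetic data
and a naive co-occurrence recommender standing in as placeholders (the
measurement harness is the only point here): prune the top X% most
popular items from training and from the candidate list, measure
precision@k on held-out interactions, and look for where the curve turns
over:

import random
from collections import Counter, defaultdict

random.seed(0)

# Synthetic binary interactions: user -> set of item ids.
ITEMS = [f"i{j}" for j in range(50)]
users = {}
for u in range(200):
    # Popular items (low index) are drawn more often than the long tail.
    n = random.randint(5, 15)
    users[f"u{u}"] = {ITEMS[min(int(random.expovariate(0.15)), 49)]
                      for _ in range(n)}

# Leave-some-out split per user.
train, test = {}, {}
for u, items in users.items():
    items = list(items)
    random.shuffle(items)
    held = max(1, len(items) // 4)
    test[u], train[u] = set(items[:held]), set(items[held:])

def precision_at_k(recs, truth, k=5):
    hits = [len(set(r[:k]) & t) / k for r, t in zip(recs, truth)]
    return sum(hits) / len(hits)

def evaluate(prune_fraction, k=5):
    # Drop the top prune_fraction of items by training popularity.
    popularity = Counter(i for items in train.values() for i in items)
    ranked = [i for i, _ in popularity.most_common()]
    pruned = set(ranked[: int(len(ranked) * prune_fraction)])

    # Item-item co-occurrence counts on the pruned training data.
    cooc = defaultdict(Counter)
    for items in train.values():
        kept = [i for i in items if i not in pruned]
        for a in kept:
            for b in kept:
                if a != b:
                    cooc[a][b] += 1

    recs, truth = [], []
    for u, items in train.items():
        scores = Counter()
        for i in items:
            if i not in pruned:
                scores.update(cooc[i])
        for i in items:              # never recommend what the user already has
            scores.pop(i, None)
        recs.append([i for i, _ in scores.most_common(k)])
        truth.append(test[u])
    return precision_at_k(recs, truth, k)

for frac in (0.0, 0.05, 0.1, 0.2, 0.3):
    print(f"pruned top {frac:>4.0%} of items -> precision@5 = {evaluate(frac):.3f}")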

Highly application- and domain-dependent, anyway.

Paulo



