On Wednesday, 31 March 2010 17:37:11, Sean Owen wrote:
Hello,
first of all, thanks for your feedback. I'm starting to understand how
Mahout's recommendations work :)
> You could skip the comparison, sure, but didn't think it worth
> complicating the code here to make that optimization, which wouldn't
> do much for overall speed.
OK. But at least I understood the code correctly.
> > Better would be to have two DataModels which are divided at a
> > certain point in time and then we make recommendations based on the
> > older one and check if these occur in the newer one, correct?
> > This way we would have a way to tell which ones are "good"
> > recommendations and which one are not.
>
> I agree that seems somewhat more coherent. At least the training
> model is a set of preferences that actually existed, together, at
> some point. The framework does not have timing information in
> general though, so that's why it's not appearing.
>
> You could modify the code to use this info for your purposes if you
> like.
I'll try to implement this.
What I would do is:
with two data sets (dataModel and testDataModel)
  select a few users from dataModel
  for each user u
    select a few of his itemIds
    for each of these itemIds
    {
      make recommendations
      check whether these recommendations appear in the testDataModel
      for this user (maybe check that they are not already in dataModel)
    }
wdyt?
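To make the loop above concrete, here is a rough Python sketch of what I have in mind. It is not Mahout code: both data models are plain dicts (user id -> set of item ids), and `recommend` is a hypothetical callable taking (user id, seed item id, how many) that stands in for whatever recommender is being evaluated. The function name and the sampling parameters are made up for illustration as well.

```python
import random
from typing import Callable, Dict, List, Set

# Assumption: a "data model" is just a dict of user id -> set of item ids.
DataModel = Dict[int, Set[int]]
# Assumption: hypothetical recommender, (user id, seed item id, how many) -> item ids.
Recommender = Callable[[int, int, int], List[int]]


def evaluate_time_split(train: DataModel, test: DataModel, recommend: Recommender,
                        num_users: int = 10, items_per_user: int = 3,
                        how_many: int = 5, seed: int = 0) -> float:
    """Fraction of recommendations (made from the older data) that show up
    in the newer data for the same user."""
    rng = random.Random(seed)
    users = sorted(train)
    hits = 0
    total = 0
    # Select a few users from the older data model.
    for user in rng.sample(users, min(num_users, len(users))):
        seen = train[user]
        later = test.get(user, set())
        items = sorted(seen)
        # Select a few of this user's items.
        for item in rng.sample(items, min(items_per_user, len(items))):
            # Drop anything the user already had in the older data.
            recs = [r for r in recommend(user, item, how_many) if r not in seen]
            # Count recommendations the user actually went on to encounter.
            hits += sum(1 for r in recs if r in later)
            total += len(recs)
    return hits / total if total else 0.0
```

The result is a precision-like number: the share of recommended items that the user encountered later. It says nothing about whether better items existed that were never shown.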
> But I think it still has the same basic flaw, which may still give
> you low and unuseful results: it's just judging how well the
> recommender recommends those items the user went on to encounter.
> While those are probably good recommendations, they're not
> necessarily the best.
>
> As an evaluation, it's still better than nothing, though I think it's
> hard to get a meaningful result from the eval this way.
Well, if you take the items as web pages it makes more sense, doesn't it?
> Hmm, I should write about this in the chapter more, eh.
I can think of a few cases where evaluating a boolean recommender with
two time-separated dataModels makes sense.
For example with shopping or other recommender data it makes sense to
me. You would have to track the recommendation "clicks" and fill the
testDataModel with this information.
Of course this would only work if you already had some data.
In the web page setting this is actually "easy", just split the log
file/session data at some point in time (for each user if needed).
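As a sketch of that split (again plain Python, not Mahout): I assume the log has already been parsed into (user id, item id, timestamp) tuples. The function name `split_by_time` and the optional per-user cutoff dict are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Set, Tuple

# Assumption: one parsed log entry is (user id, item id, timestamp).
Event = Tuple[int, int, float]


def split_by_time(events: Iterable[Event], cutoff: float,
                  per_user_cutoff: Optional[Dict[int, float]] = None
                  ) -> Tuple[Dict[int, Set[int]], Dict[int, Set[int]]]:
    """Split events into an older (training) and a newer (test) data set.

    Events before the cutoff go to the training set, all others to the
    test set. A user listed in per_user_cutoff uses that cutoff instead.
    """
    per_user_cutoff = per_user_cutoff or {}
    train: Dict[int, Set[int]] = defaultdict(set)
    test: Dict[int, Set[int]] = defaultdict(set)
    for user, item, timestamp in events:
        limit = per_user_cutoff.get(user, cutoff)
        target = train if timestamp < limit else test
        target[user].add(item)
    return dict(train), dict(test)
```

The two dicts returned here have the same shape as dataModel and testDataModel in the procedure described earlier.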
regards
Christoph Hermann
--
Christoph Hermann
Institut für Informatik
Tel: +49 761-203-8171 Fax: +49 761-203-8162
e-mail: [email protected]
