It depends on what the values really mean. If they are something like
ratings, using the most recent value makes the most sense. (This is what
the implementations do now.) If they are some kind of sampled reading, it
might make sense to take an average. If the input is based on observed
activity, it may be best to accumulate (sum) the data, perhaps with some
decay factor.
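
As a rough sketch of those three options (this is not Mahout code; the
function names, the tuple layout, and the half-life parameter are all made
up for illustration), collapsing duplicate (user, item) entries could look
something like this:

```python
import math
from collections import defaultdict

# Hypothetical input: (user, item, value, timestamp_in_days) tuples.
events = [
    ("u1", "i1", 3.0, 0.0),
    ("u1", "i1", 5.0, 10.0),
    ("u1", "i2", 1.0, 4.0),
]


def group(events):
    """Collect all (timestamp, value) pairs per (user, item) key."""
    grouped = defaultdict(list)
    for user, item, value, ts in events:
        grouped[(user, item)].append((ts, value))
    return grouped


def latest(events):
    """Ratings: keep only the most recent value per pair."""
    return {
        key: max(pairs, key=lambda p: p[0])[1]
        for key, pairs in group(events).items()
    }


def mean(events):
    """Sampled readings: average all values per pair."""
    return {
        key: sum(value for _, value in pairs) / len(pairs)
        for key, pairs in group(events).items()
    }


def decayed_sum(events, now, half_life):
    """Observed activity: sum values, halving the weight every half_life days."""
    rate = math.log(2) / half_life
    return {
        key: sum(value * math.exp(-rate * (now - ts)) for ts, value in pairs)
        for key, pairs in group(events).items()
    }
```

With the sample data, latest(events) keeps 5.0 for (u1, i1), mean(events)
gives 4.0, and decayed_sum(events, now=10.0, half_life=10.0) gives roughly
6.5, since the older value of 3.0 counts at half weight. Which of these is
right still depends on what the values mean.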

On Tue, Aug 7, 2012 at 1:14 PM, Dominik Lahmann <
dominik.lahm...@fu-berlin.de> wrote:

> Hi,
>
> I would like to know how I can deal with multiple preference values
> for the same (user, item)-pair from a machine learning perspective?
> That means, I have got more than one rating from a user u for an item i
> available.
> Of course using any kind of average (maybe also taking date information
> into account, e.g. by using a weighted/exponential moving average)
> would be possible.
>
> I am interested in whether any more sophisticated methods are used.
>
> Probably it would already be very helpful to know which term to
> look/search for or have some papers on that topic.
>
> As far as I noticed, Mahout would always just take the newest preference
> value. Is that correct?
>
> Thanks a lot,
> Dominik
>
