If your input is clicks, carts, etc., then yes, you ought to get
generally good results from something meant to consume implicit
feedback, like ALS (yes, there are at least two main variants of ALS,
and one of them is specifically for implicit feedback). I think you
are talking about the implicit version, since you mention 0/1 input.
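
To be concrete about what that input looks like, here is a rough scipy
sketch with a made-up click log (the indices, sizes and the choice to
binarize are only for illustration):

import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical click log as (user_index, item_index) pairs, already 0-based.
clicks = [(0, 3), (0, 7), (2, 3), (2, 3), (5, 1)]
n_users, n_items = 6, 10
rows, cols = zip(*clicks)
counts = coo_matrix((np.ones(len(clicks)), (rows, cols)),
                    shape=(n_users, n_items)).tocsr()  # repeat clicks sum up
binary = counts.copy()
binary.data[:] = 1.0   # or collapse to a plain 0/1 "did the user touch the item"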

lambda is the regularization parameter. It is defined a bit
differently across the various papers though, so test a few values if
you can.
But you said "no weights in the regularization" -- what do you mean?
You don't want to disable regularization entirely.
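
If it helps to see where lambda and the weighting fit together, here is a
rough, self-contained numpy sketch of one common formulation of
confidence-weighted ALS for implicit feedback (the Hu/Koren/Volinsky-style
confidence c = 1 + alpha*count; other papers differ in the details, which
is part of why lambda values are not directly comparable across them). The
factor count, alpha, lam and iteration count are arbitrary illustration
values, not recommendations, and this is not Mahout's implementation:

import numpy as np

def weighted_als(R, factors=20, alpha=40.0, lam=0.1, iterations=10, seed=0):
    # R: small dense user x item matrix of raw interaction counts (clicks).
    # P is the 0/1 matrix we try to reconstruct; C holds the confidence
    # weights: 1 on the unobserved zeros, 1 + alpha*count on observed cells,
    # so observed "1"s are reconstructed much more strongly than the zeros.
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)
    C = 1.0 + alpha * R
    X = rng.normal(scale=0.01, size=(n_users, factors))
    Y = rng.normal(scale=0.01, size=(n_items, factors))
    reg = lam * np.eye(factors)            # plain (unweighted) L2 regularization
    for _ in range(iterations):
        for u in range(n_users):           # closed-form solve per user's factors
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + reg, Y.T @ Cu @ P[u])
        for i in range(n_items):           # then per item's factors
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + reg, X.T @ Ci @ P[:, i])
    return X, Y

Scores for ranking are then just X @ Y.T. With the toy matrix above you
could call weighted_als(counts.toarray(), lam=0.1); real implementations
exploit sparsity instead of forming dense diagonals. In practice you would
sweep a few values of lam (and alpha) and keep whatever does best on
held-out interactions (e.g. by mean average precision), rather than trust
any single default.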

On Mon, Mar 25, 2013 at 2:14 PM, Koobas <koo...@gmail.com> wrote:
> On Mon, Mar 25, 2013 at 9:52 AM, Sean Owen <sro...@gmail.com> wrote:
>
>> On Mon, Mar 25, 2013 at 1:41 PM, Koobas <koo...@gmail.com> wrote:
>> >> But the assumption works nicely for click-like data. Better still when
>> >> you can "weakly" prefer to reconstruct the 0 for missing observations
>> >> and much more strongly prefer to reconstruct the "1" for observed
>> >> data.
>> >>
>> >
>> > This does seem intuitive.
>> > How does the benefit manifest itself?
>> > In lowering the RMSE of reconstructing the interaction matrix?
>> > Are there any indicators that it results in better recommendations?
>> > Koobas
>>
>> In this approach you are no longer reconstructing the interaction
>> matrix, so there is no RMSE vs the interaction matrix. You're
>> reconstructing a matrix of 0s and 1s. Because entries are weighted
>> differently, you're not even minimizing RMSE over that matrix -- the
>> point is to take some errors more seriously than others. You're
>> minimizing a *weighted* RMSE, yes.
>>
>> Yes of course the goal is better recommendations.  This broader idea
>> is harder to measure. You can use mean average precision to measure
>> the tendency to predict back interactions that were held out.
>>
>> Is it better? Depends on better than *what*. Applying algorithms that
>> treat the input like ratings doesn't work as well on click-like data. The
>> main problem is that these will tend to pay too much attention to
>> large values. For example if an item was clicked 1000 times, and you
>> are trying to actually reconstruct that "1000", then a 10% error
>> "costs" (0.1*1000)^2 = 10000. But a 10% error in reconstructing an
>> item that was clicked once "costs" (0.1*1)^2 = 0.01. The former is
>> considered a million times more important error-wise than the latter,
>> even though the intuition is that it's just 1000 times more important.
>>
>> Better than algorithms that ignore the weight entirely -- yes probably
>> if only because you are using more information. But as in all things
>> "it depends".
>>
>
> Let's say the following.
> Classic market basket.
> Implicit feedback.
> Ones and zeros in the input matrix, no weights in the regularization,
> lambda=1.
> What I will get is:
> A) a reasonable recommender,
> B) a joke of a recommender.
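
P.S. To put rough numbers on the squared-error scaling from the quoted part
above, and on what the confidence weighting does to it (alpha = 40 is just
an example value):

# Plain least squares on raw counts: a 10% error on a count of 1000 vs. 1.
print((0.1 * 1000) ** 2)   # 10000.0
print((0.1 * 1) ** 2)      # 0.01 -> the first error counts 1,000,000x more
# With confidence weighting you reconstruct a 1 in both cases; the count only
# scales the weight, e.g. c = 1 + alpha * count, so the relative importance
# grows roughly linearly in the count instead of quadratically.
alpha = 40.0
print((1 + alpha * 1000) / (1 + alpha * 1))   # ~976, i.e. roughly 1000x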
