Here, confidence = 1 + alpha * |rating| (so c1 is confidence - 1), and
alpha = 1 doesn't specially mean high confidence. The loss function is
computed over the whole input matrix, including all missing "0"
entries, which get the minimal confidence of 1 under this formula.
alpha controls how much more confident you are in the entries that do
exist in the input than in the ones that don't. So alpha = 1 is
low-ish: it says the presence of a rating doesn't mean much more than
its absence.
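
For intuition, here's a self-contained sketch of that confidence
formula (the alpha values and ratings below are illustrative, not
Spark internals):

def confidence(alpha: Double, rating: Double): Double =
  1.0 + alpha * math.abs(rating)

confidence(1.0, 0.0)   // 1.0: a missing entry keeps the baseline weight of 1
confidence(1.0, 5.0)   // 6.0: with alpha = 1, an observed rating weighs only 6x
confidence(40.0, 5.0)  // 201.0: a large alpha expresses much higher confidence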

I think the explicit case is similar, but not identical. The more
substantial difference between the two is that the cost function for
the explicit case is not the same. There, ratings aren't inputs to a
confidence value that becomes a weight in the loss function over a
factorization of a 0/1 preference matrix; instead, the rating matrix
itself is the thing being factorized directly.
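
To make the contrast concrete, here's a toy sketch of the two
objectives (my own illustrative code, not Spark's ALS solver;
regularization omitted):

def dot(a: Array[Double], b: Array[Double]): Double =
  a.zip(b).map { case (p, q) => p * q }.sum

// Implicit: every cell, observed or not, contributes, weighted by
// confidence, and the target is the 0/1 preference, not the rating.
def implicitLoss(r: Map[(Int, Int), Double],
                 x: Int => Array[Double], y: Int => Array[Double],
                 users: Range, items: Range, alpha: Double): Double =
  (for (u <- users; i <- items) yield {
    val rating = r.getOrElse((u, i), 0.0)
    val c = 1.0 + alpha * math.abs(rating)  // confidence weight
    val p = if (rating > 0) 1.0 else 0.0    // 0/1 preference being fit
    c * math.pow(p - dot(x(u), y(i)), 2)
  }).sum

// Explicit: only observed cells contribute, and the rating itself is fit.
def explicitLoss(r: Map[(Int, Int), Double],
                 x: Int => Array[Double], y: Int => Array[Double]): Double =
  r.map { case ((u, i), rating) =>
    math.pow(rating - dot(x(u), y(i)), 2)
  }.sum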

On Sun, Jul 26, 2015 at 6:45 AM, Debasish Das <debasish.da...@gmail.com> wrote:
> Hi,
>
> Implicit factorization is important for us, since it drives recommendations
> when modeling user click/no-click data, and also topic modeling, where it
> handles 0 counts in document x word matrices through NMF and Sparse Coding.
>
> I am a bit confused on this code:
>
> val c1 = alpha * math.abs(rating)
> if (rating > 0) ls.add(srcFactor, (c1 + 1.0)/c1, c1)
>
> When alpha = 1.0 (high confidence) and rating > 0 (true for word
> counts), why does this formula not become the same as the explicit formula:
>
> ls.add(srcFactor, rating, 1.0)
>
> For modeling documents, I believe the implicit Y'Y needs to stay, but we
> need the explicit ls.add(srcFactor, rating, 1.0)
>
> I am looking into the confidence code further. Please let me know if the
> idea of mapping implicit feedback to handle 0 counts in a document x word
> matrix makes sense.
>
> Thanks.
> Deb
>
