See now: https://issues.apache.org/jira/browse/SPARK-6710

On Mon, Apr 6, 2015 at 4:27 AM, Reynold Xin <r...@databricks.com> wrote:
> Adding Jianping Wang to the thread, since he contributed the SVDPlusPlus
> implementation.
>
> Jianping,
>
> Can you take a look at this message? Thanks.
>
>
> On Fri, Apr 3, 2015 at 8:41 AM, Michael Malak <
> michaelma...@yahoo.com.invalid> wrote:
>
>> I believe that in the initialization portion of GraphX SVDPlusPlus, the
>> initialization of biases is incorrect. Specifically, in line
>>
>> https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/lib/SVDPlusPlus.scala#L96
>> instead of
>> (vd._1, vd._2, msg.get._2 / msg.get._1, 1.0 / scala.math.sqrt(msg.get._1))
>> it should be
>> (vd._1, vd._2, msg.get._2 / msg.get._1 - u, 1.0 / scala.math.sqrt(msg.get._1))
>>
>> That is, the biases bu and bi (both represented as the third component of
>> the Tuple4[] above, depending on whether the vertex is a user or an item),
>> described in equation (1) of the Koren paper, are supposed to be small
>> offsets to the mean (represented by the variable u, signifying the Greek
>> letter mu) to account for peculiarities of individual users and items.
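For reference, the baseline estimate referred to above (equation (1) of the Koren paper, transcribed from memory here, so worth checking against the paper itself) is:

```latex
% Baseline estimate for rating r_ui: the global mean mu plus
% small per-user and per-item offsets b_u and b_i.
b_{ui} = \mu + b_u + b_i
```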
>>
>> Initializing these biases to the wrong values should theoretically not
>> matter given enough iterations of the algorithm, but some quick empirical
>> testing shows it then has trouble converging at all, even with orders of
>> magnitude more iterations.
>>
>> This could perhaps be the source of previously reported trouble with
>> SVDPlusPlus:
>>
>> http://apache-spark-user-list.1001560.n3.nabble.com/GraphX-SVDPlusPlus-problem-td12885.html
>>
>> If after a day, no one tells me I'm crazy here, I'll go ahead and create a
>> Jira ticket.
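To make the proposed change concrete, here is a minimal standalone sketch (the names and values are mine for illustration, not the actual GraphX code) of the two initializations, given the aggregated (count, sum) message at a vertex and the global mean u:

```scala
// Minimal sketch of the two bias initializations under discussion.
// `count` and `sum` stand in for msg.get._1 and msg.get._2 (the number
// of ratings aggregated at the vertex and their sum); `u` is the global
// mean rating (mu).
object BiasInit {
  // As currently written: the bias starts at the vertex's own mean rating.
  def biasAsIs(count: Long, sum: Double): Double =
    sum / count

  // As proposed: the bias starts at the offset from the global mean u,
  // matching its "small offset to the mean" role in Koren's equation (1).
  def biasProposed(count: Long, sum: Double, u: Double): Double =
    sum / count - u

  def main(args: Array[String]): Unit = {
    // Example: 4 ratings summing to 16.0, global mean 3.5.
    val (count, sum, u) = (4L, 16.0, 3.5)
    println(s"as-is: ${biasAsIs(count, sum)}")           // 4.0
    println(s"proposed: ${biasProposed(count, sum, u)}") // 0.5
  }
}
```

With the as-is version, the bias already contains the full mean rating, so the prediction mu + bu + bi starts far above the observed ratings; the proposed version starts each bias as a small deviation from mu.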
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
>> For additional commands, e-mail: dev-h...@spark.apache.org
>>
>>
