Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Ted Dunning Fri, 29 Jul 2011 01:51:57 -0700

Starting with the equation governing the fixed point distribution:

r_k = cMr_k + (1-c)e_k


Multiply on left by identity matrix:

I r_k = c M r_k + (1-c) e_k

0 = (c M - I) r_k + (1-c) e_k

(c-1) e_k = (c M - I) r_k

On Thu, Jul 28, 2011 at 11:02 PM, Lance Norskog <[email protected]> wrote:

> In this reformulation, is the capital I a misprint for one? Or is it
> the identity matrix? If the latter, which algebraic manipulations
> produced this? I can't find any equations in my books involving
> "matrix M - identity".
>
> On Mon, Jul 25, 2011 at 9:06 AM, Ted Dunning <[email protected]>
> wrote:
> > This is the same as solving this system:
> >
> >   (cM - I) r_k = (c-1) e_k
> >
> > To do that, you can use any decomposition technique to compute an
> > approximate functional inverse of (cM - I).
> >
> > SVD works so you can use the stochastic SVD algorithm that Dimitriy did.
> >  You probably can short-cut that substantially, however.
> >
> >      A = (cM - I)
> >
> >      Y = A \Omega
> >      Q_1 R_1 = Y
> >      B = Q_1^T A
> >
> > Normally, you compute the SVD decomposition of B here, but I think you
> might
> > be able to solve your problem directly at this point.  Of course, the SVD
> > decomposition of B is very cheap if it fits into memory.
> >
> > On Mon, Jul 25, 2011 at 8:34 AM, Sebastian Schelter <[email protected]>
> wrote:
> >
> >> I don't know as I'm not familiar with the technique of Random Projection
> to
> >> be honest. I know of a paper doing the walk in a lower-dimensional
> space:
> >> http://www.cs.cmu.edu/~htong/**pdf/ICDM06_tong.pdf<
> http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf>
> >>
> >> I plan to base the implementation on the formula from the Pegasus paper,
> >> where one searches for the vector r_k that satisfies the following
> equation:
> >>
> >> r_k = cMr_k + (1-c)e_k
> >>
> >> with c being the probability not to teleport away from a reached node, M
> >> being the transposed, row-normalized adjacency matrix (similar to the
> one
> >> used in PageRank) and e_k being a vector with the same dimension as the
> >> number of vertices with the kth element being 1 and all others 0.
> >>
> >> This formulation maps kinda nice to Map/Reduce as we simply need to
> create
> >> the adjacency matrix and then just need to iteratively multiply it by
> the
> >> steadystate probability vector (similar to PageRank). So we could
> include
> >> this algorithm without having to write a lot of custom code because we
> can
> >> express a lot of it through our already existing linear algebra code.
> >>
> >> --sebastian
> >>
> >>
> >> On 25.07.2011 17:17, Ted Dunning wrote:
> >>
> >>> Can this be done with a random projection?
> >>>
> >>> On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
> >>> <[email protected]>wrote:
> >>>
> >>>
> >>>>     [
> >>>> https://issues.apache.org/**jira/browse/MAHOUT-773?page=**
> >>>> com.atlassian.jira.plugin.**system.issuetabpanels:all-**tabpanel<
> https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> >
> >>>> ]
> >>>>
> >>>> Sebastian Schelter updated MAHOUT-773:
> >>>> ------------------------------**--------
> >>>>
> >>>>    Description:
> >>>> I'll create an implementation of Random Walk with Restarts as
> described
> >>>> in
> >>>> Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining
> System
> >>>> -
> >>>> Implementation and Observations"
> >>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<
> http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
> >>>>
> >>>> The algorithm is a random walk similar to PageRank with the difference
> >>>> that
> >>>> you start at and teleport to a certain node. The probabilities it
> >>>> computes
> >>>> can be seen as a measure of proximity between the start node and a
> >>>> reached
> >>>> node. To my knowledge RWR can be e.g used for link predicition in
> social
> >>>> networks.
> >>>>
> >>>> I will try to create an implementation that is able to do several
> walks
> >>>> in
> >>>> parallel and I will assume that a steadystate probability vector fits
> in
> >>>> memory.
> >>>>
> >>>> I don't plan to use the implementation details from the paper but I'll
> >>>> model the algorithm as an iterative multiplication between the
> adjacency
> >>>> matrix of the graph and the matrix created from the steadystate
> >>>> probability
> >>>> vectors for the vertices we compute the random walks for.
> >>>>
> >>>>  was:
> >>>> I'll create an implementation of Random Walk with Restarts as
> described
> >>>> in
> >>>> Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining
> System
> >>>> -
> >>>> Implementation and Observations"
> >>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<
> http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
> >>>>
> >>>> The algorithm is a random walk similar to PageRank with the difference
> >>>> that
> >>>> you start at and teleport to a certain node. The probabilities it
> >>>> computes
> >>>> can be seen as a measure of proximity between the start node and a
> >>>> reached
> >>>> node. To my knowledge RWR can be e.g used for link predicition in
> social
> >>>> networks.
> >>>>
> >>>> I will try to create an implementation that is able to do several
> walks
> >>>> in
> >>>> parallel and I will assume that a steadystate probability vector fits
> in
> >>>> memory.
> >>>>
> >>>>
> >>>>  Implement Random Walk with Restarts
> >>>>> ------------------------------**-----
> >>>>>
> >>>>>                 Key: MAHOUT-773
> >>>>>                 URL:
> https://issues.apache.org/**jira/browse/MAHOUT-773<
> https://issues.apache.org/jira/browse/MAHOUT-773>
> >>>>>             Project: Mahout
> >>>>>          Issue Type: New Feature
> >>>>>          Components: Graph
> >>>>>    Affects Versions: 0.6
> >>>>>            Reporter: Sebastian Schelter
> >>>>>            Assignee: Sebastian Schelter
> >>>>>
> >>>>> I'll create an implementation of Random Walk with Restarts as
> described
> >>>>>
> >>>> in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph Mining
> >>>> System
> >>>> - Implementation and Observations"
> >>>> http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**pegasus.pdf<
> http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
> >>>>
> >>>>> The algorithm is a random walk similar to PageRank with the
> difference
> >>>>>
> >>>> that you start at and teleport to a certain node. The probabilities it
> >>>> computes can be seen as a measure of proximity between the start node
> and
> >>>> a
> >>>> reached node. To my knowledge RWR can be e.g used for link predicition
> in
> >>>> social networks.
> >>>>
> >>>>> I will try to create an implementation that is able to do several
> walks
> >>>>>
> >>>> in parallel and I will assume that a steadystate probability vector
> fits
> >>>> in
> >>>> memory.
> >>>>
> >>>>> I don't plan to use the implementation details from the paper but
> I'll
> >>>>>
> >>>> model the algorithm as an iterative multiplication between the
> adjacency
> >>>> matrix of the graph and the matrix created from the steadystate
> >>>> probability
> >>>> vectors for the vertices we compute the random walks for.
> >>>>
> >>>> --
> >>>> This message is automatically generated by JIRA.
> >>>> For more information on JIRA, see: http://www.atlassian.com/**
> >>>> software/jira <http://www.atlassian.com/software/jira>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
>
>
> --
> Lance Norskog
> [email protected]
>

Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Reply via email to