This is the same as solving this system:
(cM - I) r_k = (c-1) e_k
To do that, you can use any decomposition technique to compute an
approximate functional inverse of (cM - I).
SVD works so you can use the stochastic SVD algorithm that Dimitriy did.
You probably can short-cut that substantially, however.
A = (cM - I)
Y = A \Omega
Q_1 R_1 = Y
B = Q_1^T A
Normally, you compute the SVD decomposition of B here, but I think you
might be able to solve your problem directly at this point. Of course,
the SVD decomposition of B is very cheap if it fits into memory.
On Mon, Jul 25, 2011 at 8:34 AM, Sebastian Schelter<[email protected]
<mailto:[email protected]>> wrote:
I don't know as I'm not familiar with the technique of Random
Projection to be honest. I know of a paper doing the walk in a
lower-dimensional space:
http://www.cs.cmu.edu/~htong/_**_pdf/ICDM06_tong.pdf<http://www.cs.cmu.edu/~htong/__pdf/ICDM06_tong.pdf>
<http://www.cs.cmu.edu/~htong/**pdf/ICDM06_tong.pdf<http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf>
I plan to base the implementation on the formula from the Pegasus
paper, where one searches for the vector r_k that satisfies the
following equation:
r_k = cMr_k + (1-c)e_k
with c being the probability not to teleport away from a reached
node, M being the transposed, row-normalized adjacency matrix
(similar to the one used in PageRank) and e_k being a vector with
the same dimension as the number of vertices with the kth element
being 1 and all others 0.
This formulation maps kinda nice to Map/Reduce as we simply need to
create the adjacency matrix and then just need to iteratively
multiply it by the steadystate probability vector (similar to
PageRank). So we could include this algorithm without having to
write a lot of custom code because we can express a lot of it
through our already existing linear algebra code.
--sebastian
On 25.07.2011 17<tel:25.07.2011%2017>:17, Ted Dunning wrote:
Can this be done with a random projection?
On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
<[email protected]<mailto:[email protected]>>**wrote:
[
https://issues.apache.org/__**jira/browse/MAHOUT-773?page=__**
com.atlassian.jira.plugin.__**system.issuetabpanels:all-__**tabpanel<https://issues.apache.org/__jira/browse/MAHOUT-773?page=__com.atlassian.jira.plugin.__system.issuetabpanels:all-__tabpanel>
<https://issues.apache.org/**jira/browse/MAHOUT-773?page=**
com.atlassian.jira.plugin.**system.issuetabpanels:all-**tabpanel<https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel>
]
Sebastian Schelter updated MAHOUT-773:
------------------------------**__--------
Description:
I'll create an implementation of Random Walk with Restarts
as described in
Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph
Mining System -
Implementation and Observations"
http://www.cs.cmu.edu/~__**christos/PUBLICATIONS/icdm09-_**
_pegasus.pdf<http://www.cs.cmu.edu/~__christos/PUBLICATIONS/icdm09-__pegasus.pdf>
<http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**
pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
The algorithm is a random walk similar to PageRank with the
difference that
you start at and teleport to a certain node. The
probabilities it computes
can be seen as a measure of proximity between the start node
and a reached
node. To my knowledge RWR can be e.g used for link
predicition in social
networks.
I will try to create an implementation that is able to do
several walks in
parallel and I will assume that a steadystate probability
vector fits in
memory.
I don't plan to use the implementation details from the
paper but I'll
model the algorithm as an iterative multiplication between
the adjacency
matrix of the graph and the matrix created from the
steadystate probability
vectors for the vertices we compute the random walks for.
was:
I'll create an implementation of Random Walk with Restarts
as described in
Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph
Mining System -
Implementation and Observations"
http://www.cs.cmu.edu/~__**christos/PUBLICATIONS/icdm09-_**
_pegasus.pdf<http://www.cs.cmu.edu/~__christos/PUBLICATIONS/icdm09-__pegasus.pdf>
<http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**
pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
The algorithm is a random walk similar to PageRank with the
difference that
you start at and teleport to a certain node. The
probabilities it computes
can be seen as a measure of proximity between the start node
and a reached
node. To my knowledge RWR can be e.g used for link
predicition in social
networks.
I will try to create an implementation that is able to do
several walks in
parallel and I will assume that a steadystate probability
vector fits in
memory.
Implement Random Walk with Restarts
------------------------------**__-----
Key: MAHOUT-773
URL:
https://issues.apache.org/__**jira/browse/MAHOUT-773<https://issues.apache.org/__jira/browse/MAHOUT-773>
<https://issues.apache.org/**jira/browse/MAHOUT-773<https://issues.apache.org/jira/browse/MAHOUT-773>
Project: Mahout
Issue Type: New Feature
Components: Graph
Affects Versions: 0.6
Reporter: Sebastian Schelter
Assignee: Sebastian Schelter
I'll create an implementation of Random Walk with
Restarts as described
in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale
Graph Mining System
- Implementation and Observations"
http://www.cs.cmu.edu/~__**christos/PUBLICATIONS/icdm09-_**
_pegasus.pdf<http://www.cs.cmu.edu/~__christos/PUBLICATIONS/icdm09-__pegasus.pdf>
<http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**
pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
The algorithm is a random walk similar to PageRank with
the difference
that you start at and teleport to a certain node. The
probabilities it
computes can be seen as a measure of proximity between the
start node and a
reached node. To my knowledge RWR can be e.g used for link
predicition in
social networks.
I will try to create an implementation that is able to
do several walks
in parallel and I will assume that a steadystate probability
vector fits in
memory.
I don't plan to use the implementation details from the
paper but I'll
model the algorithm as an iterative multiplication between
the adjacency
matrix of the graph and the matrix created from the
steadystate probability
vectors for the vertices we compute the random walks for.
--
This message is automatically generated by JIRA.
For more information on JIRA, see:
http://www.atlassian.com/__**software/jira<http://www.atlassian.com/__software/jira>
<http://www.atlassian.com/**software/jira<http://www.atlassian.com/software/jira>