Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Lance Norskog Thu, 28 Jul 2011 22:56:20 -0700

The word 'teleport' does not appear anywhere in the PEGASUS paper, or
the RWR paper mentioned (reference 36). What does it mean?


On Tue, Jul 26, 2011 at 12:56 AM, Ted Dunning <[email protected]> wrote:
> It is hard to say, but if your skinny forms of the matrix will fit in
> memory, then you will only need three or so map-reduce steps with no
> iteration.  They should take about as long as your current iterations or a
> little faster.
>
> For the matrices quoted in the paper you cited, I think I could even do the
> computation in R.
>
> On Tue, Jul 26, 2011 at 12:08 AM, Sebastian Schelter <[email protected]> wrote:
>
>> Hi Ted,
>>
>> can you estimate how big the performance/scalability gain of using random
>> projection would be compared to using the standard approach of iterative
>> matrix-vector multiplication?
>>
>> I already have working prototype code for the latter one, where I extend
>> the approach to do a multiplication between the adjacency matrix and a
>> matrix of steadystate probability vectors so several random walks can be
>> done in parallel.
>>
>> If the gain would be huge I'd try to dig into random projection though.
>>
>> --sebastian
>>
>>
>> On 25.07.2011 18:06, Ted Dunning wrote:
>>
>>> This is the same as solving this system:
>>>
>>>    (cM - I) r_k = (c-1) e_k
>>>
>>> To do that, you can use any decomposition technique to compute an
>>> approximate functional inverse of (cM - I).
>>>
>>> SVD works so you can use the stochastic SVD algorithm that Dimitriy did.
>>>  You probably can short-cut that substantially, however.
>>>
>>>       A = (cM - I)
>>>
>>>       Y = A \Omega
>>>       Q_1 R_1 = Y
>>>       B = Q_1^T A
>>>
>>> Normally, you compute the SVD decomposition of B here, but I think you
>>> might be able to solve your problem directly at this point.  Of course,
>>> the SVD decomposition of B is very cheap if it fits into memory.
>>>
>>> On Mon, Jul 25, 2011 at 8:34 AM, Sebastian Schelter <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>    I don't know as I'm not familiar with the technique of Random
>>>    Projection to be honest. I know of a paper doing the walk in a
>>>    lower-dimensional space:
>>>    
>>> http://www.cs.cmu.edu/~htong/_**_pdf/ICDM06_tong.pdf<http://www.cs.cmu.edu/~htong/__pdf/ICDM06_tong.pdf>
>>>    
>>> <http://www.cs.cmu.edu/~htong/**pdf/ICDM06_tong.pdf<http://www.cs.cmu.edu/~htong/pdf/ICDM06_tong.pdf>
>>> >
>>>
>>>    I plan to base the implementation on the formula from the Pegasus
>>>    paper, where one searches for the vector r_k that satisfies the
>>>    following equation:
>>>
>>>    r_k = cMr_k + (1-c)e_k
>>>
>>>    with c being the probability not to teleport away from a reached
>>>    node, M being the transposed, row-normalized adjacency matrix
>>>    (similar to the one used in PageRank) and e_k being a vector with
>>>    the same dimension as the number of vertices with the kth element
>>>    being 1 and all others 0.
>>>
>>>    This formulation maps kinda nice to Map/Reduce as we simply need to
>>>    create the adjacency matrix and then just need to iteratively
>>>    multiply it by the steadystate probability vector (similar to
>>>    PageRank). So we could include this algorithm without having to
>>>    write a lot of custom code because we can express a lot of it
>>>    through our already existing linear algebra code.
>>>
>>>    --sebastian
>>>
>>>
>>>    On 25.07.2011 17 <tel:25.07.2011%2017>:17, Ted Dunning wrote:
>>>
>>>        Can this be done with a random projection?
>>>
>>>        On Mon, Jul 25, 2011 at 4:59 AM, Sebastian Schelter (JIRA)
>>>        <[email protected] <mailto:[email protected]>>**wrote:
>>>
>>>
>>>
>>>                 [
>>>            https://issues.apache.org/__**jira/browse/MAHOUT-773?page=__**
>>> com.atlassian.jira.plugin.__**system.issuetabpanels:all-__**tabpanel<https://issues.apache.org/__jira/browse/MAHOUT-773?page=__com.atlassian.jira.plugin.__system.issuetabpanels:all-__tabpanel>
>>>            <https://issues.apache.org/**jira/browse/MAHOUT-773?page=**
>>> com.atlassian.jira.plugin.**system.issuetabpanels:all-**tabpanel<https://issues.apache.org/jira/browse/MAHOUT-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel>
>>> >]
>>>
>>>            Sebastian Schelter updated MAHOUT-773:
>>>            ------------------------------**__--------
>>>
>>>                Description:
>>>            I'll create an implementation of Random Walk with Restarts
>>>            as described in
>>>            Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph
>>>            Mining System -
>>>            Implementation and Observations"
>>>            http://www.cs.cmu.edu/~__**christos/PUBLICATIONS/icdm09-_**
>>> _pegasus.pdf<http://www.cs.cmu.edu/~__christos/PUBLICATIONS/icdm09-__pegasus.pdf>
>>>            <http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**
>>> pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>> >
>>>
>>>            The algorithm is a random walk similar to PageRank with the
>>>            difference that
>>>            you start at and teleport to a certain node. The
>>>            probabilities it computes
>>>            can be seen as a measure of proximity between the start node
>>>            and a reached
>>>            node. To my knowledge RWR can be e.g used for link
>>>            predicition in social
>>>            networks.
>>>
>>>            I will try to create an implementation that is able to do
>>>            several walks in
>>>            parallel and I will assume that a steadystate probability
>>>            vector fits in
>>>            memory.
>>>
>>>            I don't plan to use the implementation details from the
>>>            paper but I'll
>>>            model the algorithm as an iterative multiplication between
>>>            the adjacency
>>>            matrix of the graph and the matrix created from the
>>>            steadystate probability
>>>            vectors for the vertices we compute the random walks for.
>>>
>>>              was:
>>>            I'll create an implementation of Random Walk with Restarts
>>>            as described in
>>>            Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale Graph
>>>            Mining System -
>>>            Implementation and Observations"
>>>            http://www.cs.cmu.edu/~__**christos/PUBLICATIONS/icdm09-_**
>>> _pegasus.pdf<http://www.cs.cmu.edu/~__christos/PUBLICATIONS/icdm09-__pegasus.pdf>
>>>            <http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**
>>> pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>> >
>>>
>>>            The algorithm is a random walk similar to PageRank with the
>>>            difference that
>>>            you start at and teleport to a certain node. The
>>>            probabilities it computes
>>>            can be seen as a measure of proximity between the start node
>>>            and a reached
>>>            node. To my knowledge RWR can be e.g used for link
>>>            predicition in social
>>>            networks.
>>>
>>>            I will try to create an implementation that is able to do
>>>            several walks in
>>>            parallel and I will assume that a steadystate probability
>>>            vector fits in
>>>            memory.
>>>
>>>
>>>                Implement Random Walk with Restarts
>>>                ------------------------------**__-----
>>>
>>>                                 Key: MAHOUT-773
>>>                                 URL:
>>>                
>>> https://issues.apache.org/__**jira/browse/MAHOUT-773<https://issues.apache.org/__jira/browse/MAHOUT-773>
>>>                
>>> <https://issues.apache.org/**jira/browse/MAHOUT-773<https://issues.apache.org/jira/browse/MAHOUT-773>
>>> >
>>>                             Project: Mahout
>>>                          Issue Type: New Feature
>>>                          Components: Graph
>>>                    Affects Versions: 0.6
>>>                            Reporter: Sebastian Schelter
>>>                            Assignee: Sebastian Schelter
>>>
>>>                I'll create an implementation of Random Walk with
>>>                Restarts as described
>>>
>>>            in Kang, Tsourakakis, Faloutsos, "PEGASUS: A Peta-Scale
>>>            Graph Mining System
>>>            - Implementation and Observations"
>>>            http://www.cs.cmu.edu/~__**christos/PUBLICATIONS/icdm09-_**
>>> _pegasus.pdf<http://www.cs.cmu.edu/~__christos/PUBLICATIONS/icdm09-__pegasus.pdf>
>>>            <http://www.cs.cmu.edu/~**christos/PUBLICATIONS/icdm09-**
>>> pegasus.pdf<http://www.cs.cmu.edu/~christos/PUBLICATIONS/icdm09-pegasus.pdf>
>>> >
>>>
>>>                The algorithm is a random walk similar to PageRank with
>>>                the difference
>>>
>>>            that you start at and teleport to a certain node. The
>>>            probabilities it
>>>            computes can be seen as a measure of proximity between the
>>>            start node and a
>>>            reached node. To my knowledge RWR can be e.g used for link
>>>            predicition in
>>>            social networks.
>>>
>>>                I will try to create an implementation that is able to
>>>                do several walks
>>>
>>>            in parallel and I will assume that a steadystate probability
>>>            vector fits in
>>>            memory.
>>>
>>>                I don't plan to use the implementation details from the
>>>                paper but I'll
>>>
>>>            model the algorithm as an iterative multiplication between
>>>            the adjacency
>>>            matrix of the graph and the matrix created from the
>>>            steadystate probability
>>>            vectors for the vertices we compute the random walks for.
>>>
>>>            --
>>>            This message is automatically generated by JIRA.
>>>            For more information on JIRA, see:
>>>            
>>> http://www.atlassian.com/__**software/jira<http://www.atlassian.com/__software/jira>
>>>            
>>> <http://www.atlassian.com/**software/jira<http://www.atlassian.com/software/jira>
>>> >
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>



-- 
Lance Norskog
[email protected]

Re: [jira] [Updated] (MAHOUT-773) Implement Random Walk with Restarts

Reply via email to