[ https://issues.apache.org/jira/browse/FLINK-2375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634119#comment-14634119 ]
ASF GitHub Bot commented on FLINK-2375: --------------------------------------- Github user shghatge commented on the pull request: https://github.com/apache/flink/pull/923#issuecomment-123051521 Updated PR @vasia I removed the Vertex Centric Iteration :) Hope this method is okay. > Add Approximate Adamic Adar Similarity method using BloomFilters > ---------------------------------------------------------------- > > Key: FLINK-2375 > URL: https://issues.apache.org/jira/browse/FLINK-2375 > Project: Flink > Issue Type: Task > Components: Gelly > Reporter: Shivani Ghatge > Assignee: Shivani Ghatge > Priority: Minor > > Just as Jaccard, the Adamic-Adar algorithm measures the similarity between a > set of nodes. However, instead of counting the common neighbors and dividing > them by the total number of neighbors, the similarity is weighted according > to the vertex degrees. In particular, it's equal to log(1/numberOfEdges). > The Adamic-Adar algorithm can be broken into three steps: > 1). For each vertex, compute the log of its inverse degrees (with the formula > above) and set it as the vertex value. > 2). Each vertex will then send this new computed value along with a list of > neighbors to the targets of its out-edges > 3). Weigh the edges with the Adamic-Adar index: Sum over n from CN of > log(1/k_n)(CN is the set of all common neighbors of two vertices x, y. k_n is > the degree of node n). See [2] > Using BloomFilters we increase the scalability of the algorithm. The values > calculated for the edges will be approximate. > Prerequisites: > Full understanding of the Jaccard Similarity Measure algorithm > Reading the associated literature: > [1] http://social.cs.uiuc.edu/class/cs591kgk/friendsadamic.pdf > [2] > http://stackoverflow.com/questions/22565620/fast-algorithm-to-compute-adamic-adar -- This message was sent by Atlassian JIRA (v6.3.4#6332)