[ https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marko A. Rodriguez reassigned TINKERPOP-1783:
---------------------------------------------

    Assignee: Marko A. Rodriguez

> PageRank gives incorrect results for graphs with sinks
> ------------------------------------------------------
>
>                 Key: TINKERPOP-1783
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1783
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: process
>    Affects Versions: 3.3.0, 3.1.8, 3.2.6
>            Reporter: Artem Aliev
>            Assignee: Marko A. Rodriguez
>
> {quote} Sink vertices (those with no outgoing edges) should evenly distribute
> their rank to the entire graph but in the current implementation it is just
> lost.
> {quote}
> Wiki: https://en.wikipedia.org/wiki/PageRank#Simplified_algorithm
> {quote} In the original form of PageRank, the sum of PageRank over all pages
> was the total number of pages on the web at that time
> {quote}
> I found the issue while comparing results with Spark GraphX.
> This is a copy of https://issues.apache.org/jira/browse/SPARK-18847
> How to reproduce:
> {code}
> gremlin> graph = TinkerFactory.createModern()
> gremlin> g = graph.traversal().withComputer()
> gremlin> g.V().pageRank(0.85).times(40).by('pageRank').values('pageRank').sum()
> ==>1.318625
> gremlin> g.V().pageRank(0.85).times(1).by('pageRank').values('pageRank').sum()
> ==>3.4499999999999997
> # initial values:
> gremlin> g.V().pageRank(0.85).times(0).by('pageRank').values('pageRank').sum()
> ==>6.0
> {code}
> Spark fixed the issue by normalising the values after each step.
> Another way to fix it is for a vertex to send the message to itself (stay on
> the same page).
> To work around the problem, just add self-pointing edges:
> {code}
> gremlin> g.V().as('B').addE('knows').from('B')
> {code}
> Then you'll always get the correct sum, but I'm not sure that is a proper
> assumption.
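
The arithmetic behind the reported sums can be checked outside TinkerPop. The sketch below is a minimal, self-contained Python model, not TinkerPop's or GraphX's actual implementation. It assumes the un-normalised update rank(v) = (1 - alpha) + alpha * sum(rank(u) / outdegree(u)) over incoming edges, with every vertex starting at 1.0, and it uses a hand-transcribed edge list for the createModern() toy graph. The names `page_rank`, `total`, `EDGES`, `VERTICES`, and `redistribute_sinks` are hypothetical and exist only for this illustration.

```python
# Hand-transcribed edges of the "modern" toy graph (an assumption):
# 1=marko, 2=vadas, 3=lop, 4=josh, 5=ripple, 6=peter.
VERTICES = [1, 2, 3, 4, 5, 6]
EDGES = [(1, 2), (1, 3), (1, 4), (4, 3), (4, 5), (6, 3)]


def page_rank(vertices, edges, alpha=0.85, iterations=40,
              redistribute_sinks=False):
    """Un-normalised PageRank sketch; every vertex starts with rank 1.0."""
    out_degree = {v: 0 for v in vertices}
    for src, _ in edges:
        out_degree[src] += 1

    n = len(vertices)
    rank = {v: 1.0 for v in vertices}
    for _ in range(iterations):
        # Each vertex splits its rank evenly across its outgoing edges.
        incoming = {v: 0.0 for v in vertices}
        for src, dst in edges:
            incoming[dst] += rank[src] / out_degree[src]

        # Rank held by sinks has nowhere to go. It is either dropped
        # (the behaviour described in the report) or spread evenly
        # over all vertices.
        sink_mass = sum(rank[v] for v in vertices if out_degree[v] == 0)
        share = sink_mass / n if redistribute_sinks else 0.0

        rank = {v: (1.0 - alpha) + alpha * (incoming[v] + share)
                for v in vertices}
    return rank


def total(rank):
    """Sum of rank over all vertices."""
    return sum(rank.values())


if __name__ == "__main__":
    print(total(page_rank(VERTICES, EDGES, iterations=0)))
    print(total(page_rank(VERTICES, EDGES, iterations=1)))
    print(total(page_rank(VERTICES, EDGES, iterations=40)))
    print(total(page_rank(VERTICES, EDGES, redistribute_sinks=True)))
    # Workaround from the report: a self-pointing edge on every vertex.
    self_loops = EDGES + [(v, v) for v in VERTICES]
    print(total(page_rank(VERTICES, self_loops)))
```

Under these assumptions the model loses rank at the sinks in the same way as the console session: the sum falls from 6.0 to roughly 3.45 after one iteration and settles at roughly 1.318625. With sink rank redistributed, or with a self-pointing edge on every vertex, the sum stays at roughly 6.0, because each iteration then passes on all of the existing rank.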