[
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16173290#comment-16173290
]
ASF GitHub Bot commented on TINKERPOP-1783:
-------------------------------------------
Github user spmallette commented on a diff in the pull request:
https://github.com/apache/tinkerpop/pull/717#discussion_r139991682
--- Diff: docs/src/upgrade/release-3.3.x.asciidoc ---
@@ -32,6 +32,43 @@ Please see the
link:https://github.com/apache/tinkerpop/blob/3.3.1/CHANGELOG.asc
Upgrading for Users
~~~~~~~~~~~~~~~~~~~
+PageRankVertexProgram
+^^^^^^^^^^^^^^^^^^^^^
+
+There were two major bugs in the way in which PageRank was being calculated in `PageRankVertexProgram`. First, teleportation
+energy was not being distributed correctly amongst the vertices at each round. Second, terminal vertices (i.e. vertices
+with no outgoing edges) did not have their full gathered energy distributed via teleportation.
+
+For users upgrading, note that while the relative rank orders will remain "the same," the actual PageRank values will differ
+from prior TinkerPop versions.
+
+```
+VERTEX  iGRAPH     TINKERPOP
+marko   0.1119788  0.11375485828040575
+vadas   0.1370267  0.14598540145985406
+lop     0.2665600  0.30472082661863686
+josh    0.1620746  0.14598540145985406
+ripple  0.2103812  0.1757986539008437
+peter   0.1119788  0.11375485828040575
+```
+
+Normalization preserved through computation:
+
+```
+0.11375485828040575 +
+0.14598540145985406 +
+0.30472082661863686 +
+0.14598540145985406 +
+0.1757986539008437 +
+0.11375485828040575
+==>1.00000000000000018
+```
+
+Two other additions to `PageRankVertexProgram` were provided as well.
+
+1. It now calculates the vertex count and thus, no longer requires the user to specify the vertex count.
+2. It now allows the user to leverage an epsilon-based convergence instead of having to specify the number of iterations to execute.
+
--- End diff --
Please include the "See: ...." link to the JIRA issue.
> PageRank gives incorrect results for graphs with sinks
> ------------------------------------------------------
>
> Key: TINKERPOP-1783
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1783
> Project: TinkerPop
> Issue Type: Bug
> Components: process
> Affects Versions: 3.3.0, 3.1.8, 3.2.6
> Reporter: Artem Aliev
> Assignee: Marko A. Rodriguez
> Labels: breaking
>
> {quote} Sink vertices (those with no outgoing edges) should evenly distribute
> their rank to the entire graph but in the current implementation it is just
> lost.
> {quote}
> Wiki: https://en.wikipedia.org/wiki/PageRank#Simplified_algorithm
> {quote} In the original form of PageRank, the sum of PageRank over all pages
> was the total number of pages on the web at that time
> {quote}
> I found the issue while comparing results with Spark GraphX,
> so this is a copy of https://issues.apache.org/jira/browse/SPARK-18847
> How to reproduce:
> {code}
> gremlin> graph = TinkerFactory.createModern()
> gremlin> g = graph.traversal().withComputer()
> gremlin> g.V().pageRank(0.85).times(40).by('pageRank').values('pageRank').sum()
> ==>1.318625
> gremlin> g.V().pageRank(0.85).times(1).by('pageRank').values('pageRank').sum()
> ==>3.4499999999999997
> # initial values:
> gremlin> g.V().pageRank(0.85).times(0).by('pageRank').values('pageRank').sum()
> ==>6.0
> {code}
> Spark fixed the issue by normalising the values after each step.
> Another way to fix it is to have a sink send the message to itself (stay on
> the same page).
> To work around the problem, just add self-pointing edges:
> {code}
> gremlin> g.V().as('B').addE('knows').from('B')
> {code}
> Then you'll always get the correct sum, but I'm not sure it is a proper
> assumption.
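
The rank loss described in the issue can be sketched outside TinkerPop. The Python below is a minimal illustration, not the `PageRankVertexProgram` code: the edge list is the modern toy graph, every vertex starts at 1.0 as in the report, and the update rule `(1 - d) + d * incoming` is an assumption about the pre-fix behavior. The names `step` and `total_after` are hypothetical helpers. With `redistribute_sinks=True`, the rank held by vertices without outgoing edges is spread evenly over all vertices, which is one of the fixes the issue proposes.

```python
# Modern toy graph as (out-vertex, in-vertex) pairs; vadas, lop and ripple are sinks.
VERTICES = ["marko", "vadas", "lop", "josh", "ripple", "peter"]
EDGES = [
    ("marko", "vadas"), ("marko", "josh"), ("marko", "lop"),
    ("josh", "ripple"), ("josh", "lop"),
    ("peter", "lop"),
]


def step(rank, damping=0.85, redistribute_sinks=False):
    """One PageRank round (assumed update rule, not TinkerPop's source)."""
    out = {v: [] for v in VERTICES}
    for src, dst in EDGES:
        out[src].append(dst)

    incoming = {v: 0.0 for v in VERTICES}
    sink_mass = 0.0
    for v, targets in out.items():
        if targets:
            # A vertex with outgoing edges splits its rank evenly among them.
            share = rank[v] / len(targets)
            for t in targets:
                incoming[t] += share
        else:
            # A sink has nowhere to send its rank; collect it here.
            sink_mass += rank[v]

    # Optionally hand the sinks' rank back to every vertex in equal parts.
    extra = sink_mass / len(VERTICES) if redistribute_sinks else 0.0
    return {v: (1 - damping) + damping * (incoming[v] + extra) for v in VERTICES}


def total_after(iterations, redistribute_sinks):
    """Sum of all ranks after the given number of rounds, starting from 1.0 each."""
    rank = {v: 1.0 for v in VERTICES}
    for _ in range(iterations):
        rank = step(rank, redistribute_sinks=redistribute_sinks)
    return sum(rank.values())
```

Under these assumptions, `total_after(1, False)` comes out at about 3.45 and `total_after(40, False)` at about 1.318625, in line with the sums shown in the report, while `total_after(40, True)` stays at 6.0 because no rank leaves the graph.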