[jira] [Commented] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Artem Aliev (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16167499#comment-16167499
 ] 

Artem Aliev commented on TINKERPOP-1783:


The workaround I proposed is incorrect.
The correct behaviour is "the user jumps to a random vertex from the sink vertex".

> PageRank gives incorrect results for graphs with sinks
> --
>
> Key: TINKERPOP-1783
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1783
> Project: TinkerPop
> Issue Type: Bug
> Components: process
> Affects Versions: 3.3.0, 3.1.8, 3.2.6
> Reporter: Artem Aliev
>
> {quote} Sink vertices (those with no outgoing edges) should evenly distribute 
> their rank to the entire graph but in the current implementation it is just 
> lost.
> {quote} 
> Wiki: https://en.wikipedia.org/wiki/PageRank#Simplified_algorithm
> {quote}  In the original form of PageRank, the sum of PageRank over all pages 
> was the total number of pages on the web at that time
> {quote} 
> I found the issue while comparing results with Spark GraphX,
> so this is a copy of https://issues.apache.org/jira/browse/SPARK-18847
> How to reproduce:
> {code}
> gremlin> graph = TinkerFactory.createModern()
> gremlin> g = graph.traversal().withComputer()
> gremlin> g.V().pageRank(0.85).times(40).by('pageRank').values('pageRank').sum()
> ==>1.318625
> gremlin> g.V().pageRank(0.85).times(1).by('pageRank').values('pageRank').sum()
> ==>3.4497
> # initial values:
> gremlin> g.V().pageRank(0.85).times(0).by('pageRank').values('pageRank').sum()
> ==>6.0
> {code}
> They fixed the issue by normalising the values after each step.
> The other way to fix it is to send the message to itself (stay on the same 
> page).
> To work around the problem, just add self-pointing edges:
> {code}
> gremlin> g.V().as('B').addE('knows').from('B')
> {code}
> Then you'll always get the correct sum. But I'm not sure it is a proper 
> assumption. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] tinkerpop issue #705: make TinkerGraph cloneable

2017-09-15 Thread robertdale
Github user robertdale commented on the issue:

https://github.com/apache/tinkerpop/pull/705
  
That dependency cycle is bad.  It should probably be put in `gremlin-test`. 
Maybe even make it a [Graph 
Feature](http://tinkerpop.apache.org/docs/current/reference/#_features) - 
Cloning.


---


[GitHub] tinkerpop pull request #680: TINKERPOP-1692 Neo4j 3.2.2

2017-09-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/tinkerpop/pull/680


---


[jira] [Commented] (TINKERPOP-1692) Bump to Neo4j 3.2.3

2017-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16167759#comment-16167759
 ] 

ASF GitHub Bot commented on TINKERPOP-1692:
---

Github user asfgit closed the pull request at:

https://github.com/apache/tinkerpop/pull/680


> Bump to Neo4j 3.2.3
> ---
>
> Key: TINKERPOP-1692
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1692
> Project: TinkerPop
> Issue Type: Improvement
> Components: neo4j
> Affects Versions: 3.2.5
> Reporter: stephen mallette
> Fix For: 3.3.1
>
>
> There is a newer version of Neo4j available - 
> https://mvnrepository.com/artifact/org.neo4j/neo4j-tinkerpop-api-impl/0.4-3.0.3





[jira] [Assigned] (TINKERPOP-1692) Bump to Neo4j 3.2.3

2017-09-15 Thread Robert Dale (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Dale reassigned TINKERPOP-1692:
--

Assignee: Robert Dale



[jira] [Closed] (TINKERPOP-1692) Bump to Neo4j 3.2.3

2017-09-15 Thread Robert Dale (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Dale closed TINKERPOP-1692.
--
Resolution: Fixed



[GitHub] tinkerpop issue #705: make TinkerGraph cloneable

2017-09-15 Thread spmallette
Github user spmallette commented on the issue:

https://github.com/apache/tinkerpop/pull/705
  
At this point I'd be -1 if we turned this into a "feature". I only thought 
of this as a convenience to TinkerGraph. As I mentioned before I really don't 
see why a `clone()` would make sense in most other graph databases. I sort of 
think of `clone()` as a feature of TinkerGraph the way indexing is a feature of 
TinkerGraph. So I technically preferred the PR as it was as opposed to a 
generalized utility function that will work shoddily for large graphs. 

Anyway, here's the solution I have that should make everyone content. 
@okram liked this as a utility class but ultimately didn't have strong feelings 
about it either way. @mpollmeier seemed to make it clear that this was to help 
with testing. How about we just move `GraphHelper` to gremlin-test? 


https://github.com/apache/tinkerpop/tree/master/gremlin-test/src/main/java/org/apache/tinkerpop/gremlin

Then it is a utility that clearly exists for testing use cases only.  
TinkerGraph depends on gremlin-test and can thus directly test its capabilities 
- maybe just add your "clone" test to:


https://github.com/apache/tinkerpop/blob/master/tinkergraph-gremlin/src/test/java/org/apache/tinkerpop/gremlin/tinkergraph/structure/TinkerGraphTest.java

@mpollmeier if this is agreeable to you, perhaps wait a few days to see if 
there are other comments before moving forward. I'd hate for you to make 
more changes and then have someone yell -1 at you.


---


[GitHub] tinkerpop issue #715: change behaviour of repeat step to be depth first sear...

2017-09-15 Thread dkuppitz
Github user dkuppitz commented on the issue:

https://github.com/apache/tinkerpop/pull/715
  
Using the modern graph:

```
gremlin> g.V().emit().repeat(both()).times(3).limit(15).path()
==>[v[1]]
==>[v[1],v[3]]
==>[v[1],v[2]]
==>[v[1],v[4]]
==>[v[1],v[3],v[1]]
==>[v[1],v[3],v[1],v[3]]
==>[v[1],v[3],v[1],v[2]]
==>[v[1],v[3],v[1],v[4]]
==>[v[1],v[3],v[4]]
==>[v[1],v[3],v[4],v[5]]
==>[v[1],v[3],v[4],v[3]]
==>[v[1],v[3],v[4],v[1]]
==>[v[1],v[3],v[6]]
==>[v[1],v[3],v[6],v[3]]
==>[v[1],v[2],v[1]]
```

^ This doesn't look like DFS to me. Rows 3 and 4 should come much later.
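To make the expected ordering concrete, here is a minimal sketch in plain Python (not Gremlin, and not how `RepeatStep` is implemented) of a depth-first pre-order walk that emits each path before looping, up to 3 loops. The `BOTH` adjacency map and the `dfs_paths` helper are hypothetical names; the neighbour order is simply read off the output above, and only the traverser starting at v[1] is modelled.

```python
from itertools import islice

# both()-neighbours per vertex id, in the order they appear in the output
# quoted above (an assumption; v[2], v[5] and v[6] have a single neighbour).
BOTH = {1: [3, 2, 4], 2: [1], 3: [1, 4, 6], 4: [5, 3, 1], 5: [4], 6: [3]}


def dfs_paths(start, max_loops=3):
    """Yield paths in depth-first pre-order, emitting before each loop."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        yield path
        if len(path) - 1 < max_loops:
            # Push in reverse so the first neighbour is expanded first.
            for nxt in reversed(BOTH[path[-1]]):
                stack.append(path + [nxt])


if __name__ == "__main__":
    # Print the first 15 paths, mirroring limit(15) in the traversal above.
    for path in islice(dfs_paths(1), 15):
        print(path)
```

Under these assumptions `[1, 2]` is the 13th path and `[1, 4]` the 18th, whereas the output above has them as rows 3 and 4.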


---


[jira] [Commented] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Marko A. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168044#comment-16168044
 ] 

Marko A. Rodriguez commented on TINKERPOP-1783:
---

So TinkerPop's "PageRank" is not the standard PageRank as defined by the 
eigenvector of {{x = (0.85A + 0.25B)x}}, where {{B}} is the fully connected 
"teleportation graph." However, after a quick thought, I believe it is possible 
to make TinkerPop's PageRank fully legit by doing the following:

1. During the first pass, get a vertex count. This isn't a big deal as we 
already calculate the edge counts on the first pass. We would save that vertex 
count to global {{Memory}}. 
2. Whenever energy is being passed out of a vertex with no edges, that energy 
is put into a "teleport" variable in {{Memory}}.
3. If there is any "teleport" energy from the previous pass, then add it to the 
current vertex's energy as {{teleportEnergy / vertexCount}}.
4. Reset the teleport energy back to 0 in {{Memory}} after every pass.

That will give the classic PageRank result.
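The four steps above can be sketched roughly as follows. This is a plain-Python illustration, not the `PageRankVertexProgram` code: the function name, the 1/N starting rank, and redistributing the teleport energy within the same pass (rather than in the next one) are all assumptions made to keep the sketch short. The edge list is the directed MODERN graph.

```python
MODERN_VERTICES = ["marko", "vadas", "lop", "josh", "ripple", "peter"]
MODERN_EDGES = [
    ("marko", "vadas"), ("marko", "josh"), ("marko", "lop"),
    ("josh", "ripple"), ("josh", "lop"), ("peter", "lop"),
]


def pagerank_with_teleport(vertices, edges, alpha=0.85, iterations=40):
    """Power iteration where sink energy is pooled and spread evenly."""
    out = {v: [] for v in vertices}
    for src, dst in edges:
        out[src].append(dst)

    n = len(vertices)                      # step 1: vertex count
    rank = {v: 1.0 / n for v in vertices}  # assumed starting rank

    for _ in range(iterations):
        teleport = 0.0                     # step 4: reset on every pass
        incoming = {v: 0.0 for v in vertices}
        for v in vertices:
            if out[v]:
                share = rank[v] / len(out[v])
                for w in out[v]:
                    incoming[w] += share
            else:
                teleport += rank[v]        # step 2: sink energy is pooled
        # step 3: every vertex receives teleport / vertexCount
        rank = {
            v: (1.0 - alpha) / n + alpha * (incoming[v] + teleport / n)
            for v in vertices
        }
    return rank


if __name__ == "__main__":
    ranks = pagerank_with_teleport(MODERN_VERTICES, MODERN_EDGES)
    for name, value in ranks.items():
        print(name, value)
    print("sum:", sum(ranks.values()))
```

With the sink energy redistributed this way, no rank leaves the system, so the ranks keep summing to 1 on every pass instead of shrinking as in the reproduction from the issue.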



[jira] [Comment Edited] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Marko A. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168044#comment-16168044
 ] 

Marko A. Rodriguez edited comment on TINKERPOP-1783 at 9/15/17 3:47 PM:


So TinkerPop's "PageRank" is not the standard PageRank as defined by the 
eigenvector of {{x = (0.85A + 0.15B)x}}, where {{B}} is the fully connected 
"teleportation graph." However, after a quick thought, I believe it is possible 
to make TinkerPop's PageRank fully legit by doing the following:

1. During the first pass, get a vertex count. This isn't a big deal as we 
already calculate the edge counts on the first pass. We would save that vertex 
count to global {{Memory}}. 
2. Whenever energy is being passed out of a vertex with no edges, that energy 
is put into a "teleport" variable in {{Memory}}.
3. If there is any "teleport" energy from the previous pass, then add it to the 
current vertex's energy as {{teleportEnergy / vertexCount}}.
4. Reset the teleport energy back to 0 in {{Memory}} after every pass.

That will give the classic PageRank result.




[jira] [Closed] (TINKERPOP-1772) "Getting started" page not correct leading to difficulties to start with console

2017-09-15 Thread stephen mallette (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stephen mallette closed TINKERPOP-1772.
---
Resolution: Invalid

Seems like we can close this one as a non-issue at this point. Please re-open 
if I'm missing something.

> "Getting started" page not correct leading to difficulties to start with 
> console 
> -
>
> Key: TINKERPOP-1772
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1772
> Project: TinkerPop
> Issue Type: Bug
> Components: documentation
> Affects Versions: 3.3.0
> Environment: Windows
> Reporter: Cédric L. Charlier
> Attachments: 001-without-edit-of-gremlin-bat.png, 
> 002-with-edit-of-gremlin-bat.png, Screenshot_2017-09-10_14-37-12.png
>
>
> The [getting started 
> page](http://tinkerpop.apache.org/docs/current/tutorials/getting-started/) 
> assumes that the ```tinkerpop.tinkergraph``` plugin is activated when you 
> start the console. The first "screenshot" of the console confirms this.
> Unfortunately, that is not the case, and when you later write your first 
> command
> {code:java}
> graph = TinkerFactory.createModern()
> {code}
> you'll receive an exception. It's really difficult for newbies to understand 
> that they need to load the tinkerpop.tinkergraph plugin, and this could stop 
> them in their progress with TinkerPop.





[jira] [Closed] (TINKERPOP-1717) Update name and link of DynamoDB storage backend in landing page

2017-09-15 Thread stephen mallette (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stephen mallette closed TINKERPOP-1717.
---
Resolution: Done
Assignee: stephen mallette
Fix Version/s: 3.3.1

I just came back across this - I made the adjustment requested. It occurs to me 
now that perhaps we didn't need an official DISCUSS thread for this. We already 
had this provider in the index - we just needed to rename/relink given the name 
change. Anyway - done.

> Update name and link of DynamoDB storage backend in landing page
> 
>
> Key: TINKERPOP-1717
> URL: https://issues.apache.org/jira/browse/TINKERPOP-1717
> Project: TinkerPop
> Issue Type: Improvement
> Components: documentation
> Affects Versions: 3.2.3
> Reporter: Alexander Patrikalakis
> Assignee: stephen mallette
> Priority: Minor
> Fix For: 3.3.1
>
> Original Estimate: 0.5h
> Remaining Estimate: 0.5h
>
> Amazon have released DynamoDB storage backends compatible with JG 0.1.0 and 
> 0.1.1:
> https://mvnrepository.com/artifact/com.amazonaws/dynamodb-janusgraph-storage-backend/1.0.0
> https://mvnrepository.com/artifact/com.amazonaws/dynamodb-janusgraph-storage-backend/1.1.0
> Also, we have renamed the repository on GitHub to 
> dynamodb-janusgraph-storage-backend for git history continuity. So, can we 
> update the names and links on the landing page? Also minor, I would like to 
> update the capitalization of "storage backend" -> "Storage Backend"
> Before:
> [Titan (Amazon)](https://github.com/awslabs/dynamodb-titan-storage-backend/)  
> - The Amazon DynamoDB storage backend for Titan.
> After:
> [JanusGraph 
> (Amazon)](https://github.com/awslabs/dynamodb-janusgraph-storage-backend/)  - 
> The Amazon DynamoDB Storage Backend for JanusGraph.





[jira] [Commented] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Marko A. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168407#comment-16168407
 ] 

Marko A. Rodriguez commented on TINKERPOP-1783:
---

I implemented the "teleportation energy" for dead end vertices and here is the 
result I got for MODERN.

{code}
{a=0.231812503, b=ripple}, 
{a=0.4018125, b=lop}, 
{a=0.19253, b=vadas}, 
{a=0.15002, b=marko}, 
{a=0.15002, b=peter}, 
{a=0.19253, b=josh}
{code}



[jira] [Comment Edited] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Marko A. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168407#comment-16168407
 ] 

Marko A. Rodriguez edited comment on TINKERPOP-1783 at 9/15/17 7:55 PM:


I implemented the "teleportation energy" for dead end vertices and here is the 
result I got for MODERN.

{code}
marko: 0.2535703278398552
vadas: 0.324571208050876
lop: 0.6738708694531045
josh: 0.324571208050876
ripple: 0.38986734860902106
peter: 0.2535703278398552
{code}

Next, I ran PageRank over the GraphML MODERN in iGraph and got:

{code}
marko: 0.1119788 
vadas: 0.1370267 
lop: 0.2665600 
josh: 0.1620746 
ripple: 0.2103812 
peter: 0.1119788 
{code}




[GitHub] tinkerpop issue #715: change behaviour of repeat step to be depth first sear...

2017-09-15 Thread mpollmeier
Github user mpollmeier commented on the issue:

https://github.com/apache/tinkerpop/pull/715
  
That's likely because of the RepeatUnrollStrategy, which kicks in when 
there's a foreseeable number of iterations. Needs to be changed as well I 
guess. 




---


[jira] [Comment Edited] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Marko A. Rodriguez (JIRA)

[ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168407#comment-16168407
 ] 

Marko A. Rodriguez edited comment on TINKERPOP-1783 at 9/15/17 7:58 PM:


I implemented the "teleportation energy" for dead end vertices and here is the 
result I got for MODERN.

{code}
marko: 0.2535703278398552
vadas: 0.324571208050876
lop: 0.6738708694531045
josh: 0.324571208050876
ripple: 0.38986734860902106
peter: 0.2535703278398552
{code}

Next, I ran PageRank over the GraphML MODERN in iGraph and got:

{code}
marko: 0.1119788 
vadas: 0.1370267 
lop: 0.2665600 
josh: 0.1620746 
ripple: 0.2103812 
peter: 0.1119788 
{code}

If I renormalize the TinkerPop PageRank vector to 1.0, then the values are more 
aligned.

{code}
0.1142198 
0.1462019 
0.3035426 
0.1462019 
0.1756143 
0.1142198
{code}

...I don't know why I'm getting this renormalization problem. :/
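For reference, the renormalization mentioned here is just a division of each score by the total. A minimal Python sketch using the TinkerPop values quoted above (the `normalize` helper and the dictionary name are hypothetical):

```python
# TinkerPop scores for MODERN, copied from the comment above.
tinkerpop_scores = {
    "marko": 0.2535703278398552,
    "vadas": 0.324571208050876,
    "lop": 0.6738708694531045,
    "josh": 0.324571208050876,
    "ripple": 0.38986734860902106,
    "peter": 0.2535703278398552,
}


def normalize(scores):
    """Scale the scores so that they sum to 1.0."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}


if __name__ == "__main__":
    print("raw sum:", sum(tinkerpop_scores.values()))
    for name, value in normalize(tinkerpop_scores).items():
        print(name, round(value, 7))
```

The raw scores sum to roughly 2.22, which is neither 1.0 nor the vertex count of 6, and dividing by that total gives the renormalized vector listed above.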




[GitHub] tinkerpop issue #715: change behaviour of repeat step to be depth first sear...

2017-09-15 Thread dkuppitz
Github user dkuppitz commented on the issue:

https://github.com/apache/tinkerpop/pull/715
  
It's not. It can't be unrolled, since I'm using emit().


---


[jira] [Assigned] (TINKERPOP-1783) PageRank gives incorrect results for graphs with sinks

2017-09-15 Thread Marko A. Rodriguez (JIRA)

 [ 
https://issues.apache.org/jira/browse/TINKERPOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marko A. Rodriguez reassigned TINKERPOP-1783:
-

Assignee: Marko A. Rodriguez
