Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-29 Thread Avery Ching
We did have a related issue (https://issues.apache.org/jira/browse/GIRAPH-155). On 5/29/12 6:54 AM, Claudio Martella wrote: I'm not sure they will be needed to send them on the first superstep. They'll be created and used in the second superstep if necessary. If they need it in the first supers

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-29 Thread Claudio Martella
I'm not sure they will be needed to send them on the first superstep. They'll be created and used in the second superstep if necessary. If they need it in the first superstep, then I guess they'll put them as a line in the input file. I agree with you that this is kind of messed up :) On Tue, May

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-29 Thread Sebastian Schelter
Oh sorry, I didn't know that discussion. The problem I see is that in every implementation, a user might run into this issue, and I don't think it's ideal to force users to always run a round of sending empty messages at the beginning. Maybe the system should (somehow) automagically do that for the

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-29 Thread Claudio Martella
About the MapReduce job to prepare the input set, I did advocate for this solution instead of supporting automatic creation of non-existent vertices implicitly (which I believe adds a logical path in vertex resolution which has some drawbacks, e.g., you have to check in the hashmap for the existence of

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-29 Thread Sebastian Schelter
On 29.05.2012 13:13, Paolo Castagna wrote: > Hi Sebastian > > Sebastian Schelter wrote: >> Why do you only recompute the pageRank in each second superstep? Can we >> not use the aggregated value of the dangling nodes from the last superstep? > > I removed the computing of PageRank values every ea

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-29 Thread Paolo Castagna
Hi Sebastian Sebastian Schelter wrote: > Why do you only recompute the pageRank in each second superstep? Can we > not use the aggregated value of the dangling nodes from the last superstep? I no longer compute PageRank values only on every second superstep. However, I needed to use a couple

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Paolo Castagna
Sebastian Schelter wrote: > However, the problem with this input is that the dangling vertices that > don't have a line of their own (such as 11) cannot contribute their > accumulated rank, as no vertex for them will be instantiated. So > counting them doesn't help either. No, the 'implicit' dangl

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Sebastian Schelter
Hi Paolo, I see what you mean. However, the problem with this input is that the dangling vertices that don't have a line of their own (such as 11) cannot contribute their accumulated rank, as no vertex for them will be instantiated. So counting them doesn't help either. I think that we should re

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Paolo Castagna
Hi Sebastian, you can try yourself with some simple input (which contains dangling nodes). For example, say I have this adjacency list (with mistakes/repetitions and self-links, ignore those): 1 1 2 2 2 3 5 7 9 11 3 3 3 6 4 5 1 2 3 11 8 10 10 5 How many vertices? 7 or 11? I think this graph has
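The "7 or 11?" question comes down to whether vertices that appear only as edge targets are counted. A minimal sketch of the two counts, assuming a hypothetical input format where the first token on each line is the vertex id and the remaining tokens are its out-edges (the function name and the sample input are illustrative, not taken from the thread or from Giraph):

```python
def count_vertices(lines):
    """Return (vertices with a line of their own, all vertices including implicit ones).

    Assumed format (hypothetical): "<vertex id> <target id> <target id> ...".
    """
    explicit = set()    # vertices that have their own line in the input
    referenced = set()  # vertices that appear as the target of some edge
    for line in lines:
        ids = line.split()
        if not ids:
            continue  # skip blank lines
        explicit.add(ids[0])
        referenced.update(ids[1:])
    # Implicit dangling vertices: referenced as a target but with no line of their own
    implicit = referenced - explicit
    return len(explicit), len(explicit | implicit)


# Vertex 4 is only ever a target, so it is an implicit dangling vertex.
sample = ["1 2 3", "2 3", "3 1 4"]
with_lines, total = count_vertices(sample)
print(with_lines, total)
```

A count based only on the lines of the input sees the first number; a count that also includes implicit dangling vertices sees the second.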

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Sebastian Schelter
Hi Paolo, Why would getNumVertices() not give back the correct number of vertices? This call should always give back the overall number of vertices (if it doesn't we have to fix it) and you shouldn't have to rely on tricks to count stuff via aggregators. --sebastian On 28.05.2012 21:20, Paolo

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Paolo Castagna
Hi Sebastian Sebastian Schelter wrote: > Could we try to merge this with the patch from > https://issues.apache.org/jira/browse/GIRAPH-191 ? I'll look at doing that, but so far I have not managed to get a correct implementation. Now that I have an implementation which I know to be correct, I'll

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Paolo Castagna
Hi Sebastian Sebastian Schelter wrote: > I think the code can be improved partially. You don't have to count the > vertices via an aggregator, you can simply use getNumVertices(). No, you cannot use getNumVertices() since it won't count dangling nodes. If you want to account for dangling nodes you

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Sebastian Schelter
I think the code can be improved partially. You don't have to count the vertices via an aggregator, you can simply use getNumVertices(). Why do you only recompute the pageRank in each second superstep? Can we not use the aggregated value of the dangling nodes from the last superstep? Overall I th

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Sebastian Schelter
Hi Paolo, Could we try to merge this with the patch from https://issues.apache.org/jira/browse/GIRAPH-191 ? Best, Sebastian On 28.05.2012 18:39, Paolo Castagna wrote: > Paolo Castagna wrote: >> Sebastian Schelter wrote: >>> I guess that summing up and redistributing the pagerank of dangling >>>

Re: SimplePageRankVertex implementation, dangling nodes and sending messages to all nodes...

2012-05-28 Thread Paolo Castagna
Paolo Castagna wrote: > Sebastian Schelter wrote: >> I guess that summing up and redistributing the pagerank of dangling >> vertices can also be done without an extra superstep in an aggregator. > > Yeah! Why didn't I think about that? > Thanks, great suggestion. > > I am going to give this a go,
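The idea of summing the rank of dangling vertices and redistributing it without an extra superstep can be sketched outside Giraph. The following is plain Python, not the Giraph API and not the patch discussed in the thread: the `dangling_sum` variable only stands in for an aggregator whose value from the previous iteration is read in the current one, and the function name, damping factor, and iteration count are assumptions chosen for illustration.

```python
def pagerank(adjacency, damping=0.85, iterations=50):
    """PageRank with dangling-vertex rank spread uniformly over all vertices.

    adjacency: dict mapping a vertex id to a list of target ids. Vertices that
    appear only as targets are treated as dangling (no out-edges).
    """
    # Collect every vertex, including targets that have no entry of their own
    vertices = set(adjacency)
    for targets in adjacency.values():
        vertices.update(targets)
    n = len(vertices)
    ranks = {v: 1.0 / n for v in vertices}

    for _ in range(iterations):
        # Stand-in for the aggregator: total rank held by dangling vertices
        # at the end of the previous iteration
        dangling_sum = sum(ranks[v] for v in vertices if not adjacency.get(v))

        # Stand-in for message passing: each vertex splits its rank over its out-edges
        incoming = {v: 0.0 for v in vertices}
        for source, targets in adjacency.items():
            for target in targets:
                incoming[target] += ranks[source] / len(targets)

        # Every vertex receives an equal share of the dangling rank
        ranks = {
            v: (1.0 - damping) / n + damping * (incoming[v] + dangling_sum / n)
            for v in vertices
        }
    return ranks


ranks = pagerank({1: [2, 3], 2: [3], 3: [1, 4]})
print(sum(ranks.values()))
```

Because the dangling rank is added back in each iteration, the ranks keep summing to 1 (up to floating-point error) instead of leaking away through vertices with no out-edges.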