Hi Paolo,

I see what you mean.

However, the problem with this input is that the dangling vertices that
don't have a line of their own (such as 11) cannot contribute their
accumulated rank, as no vertex for them will be instantiated. So
counting them doesn't help either.

I think that we should rely on users supplying valid input (a line for
each vertex) and not try to correct for that in the vertex class.

Creating a line for each vertex from such a file is easy and can be done
with a single MapReduce pass over the data beforehand.
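
As a rough sketch of what such a pass could look like (plain Python
standing in for the map and reduce steps, not actual Hadoop or Giraph
code; the function names are made up for illustration, and I assume
vertex IDs are integers separated by whitespace):

```python
from collections import defaultdict


def map_phase(lines):
    """Emit (vertex, targets) for every vertex mentioned in the input."""
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        source, targets = tokens[0], tokens[1:]
        # The source vertex keeps its outgoing edges unchanged.
        yield source, targets
        # Every target is emitted with no edges, so that it gets a
        # line of its own even if the input never declares it.
        for target in targets:
            yield target, []


def reduce_phase(pairs):
    """Group the emitted records by vertex and concatenate edge lists."""
    adjacency = defaultdict(list)
    for vertex, targets in pairs:
        adjacency[vertex].extend(targets)
    return dict(adjacency)


def complete_adjacency_list(lines):
    """Return one output line per vertex, ordered by numeric vertex ID."""
    adjacency = reduce_phase(map_phase(lines))
    return [" ".join([v] + adjacency[v]) for v in sorted(adjacency, key=int)]


if __name__ == "__main__":
    example = [
        "1 1 2 2",
        "2 3 5 7 9 11",
        "3 3 3 6",
        "4",
        "5 1 2 3 11",
        "8 10",
        "10 5",
    ]
    for out_line in complete_adjacency_list(example):
        print(out_line)
```

Run on Paolo's example, this yields 11 lines: the 7 original ones plus
empty-edge lines for 6, 7, 9 and 11. Repeated edges and self-links are
passed through untouched; cleaning those up would be a separate step.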

--sebastian


On 28.05.2012 22:13, Paolo Castagna wrote:
> Hi Sebastian,
> you can try yourself with some simple input (which contains dangling nodes).
> 
> For example, say I have this adjacency list (with mistakes/repetitions and
> self-links, ignore those):
> 
> 1 1 2 2
> 2 3 5 7 9 11
> 3 3 3 6
> 4
> 5 1 2 3 11
> 8 10
> 10 5
> 
> How many vertices? 7 or 11?
> 
> I think this graph has 11 and it's 11 you need to use as number of vertices
> when you compute PageRank.
> 
> Paolo
> 
> Sebastian Schelter wrote:
>> Hi Paolo,
>>
>> Why would getNumVertices() not give back the correct number of vertices?
>> This call should always give back the overall number of vertices (if it
>> doesn't we have to fix it) and you shouldn't have to rely on tricks to
>> count stuff via aggregators.
>>
>> --sebastian
> 
> 
